Web log mining, the application of one or more data mining techniques to large web data repositories, has become an attractive research field. It can help reveal users' access patterns, guide adjustments to website structure and the hyperlinks between pages, and improve the quality of website service. The main contributions of this thesis are summarized below.

1. Because multi-frame pages can reduce the interestingness of web log mining results, the thesis proposes a refined web log preprocessing technique called frame-filtering. Our experiments show that, by filtering out subframe page requests that are not directly generated by user clicks, the frame-filtering algorithm improves the interestingness of web log mining results.
2. The thesis proposes a novel clustering algorithm for web logs, which we call CLOPE. It is based on a global criterion function defined on the geometric shapes of cluster histograms. Experiments show that CLOPE is fast and scalable in clustering large, sparse transactional databases.
3. A real-time, personalized recommendation example has been implemented based on CLOPE web log mining.

Finally, the thesis summarizes its shortcomings and points out the directions, prospects, and challenges of web mining.
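The abstract does not state the CLOPE criterion function itself. As an illustration only, the sketch below assumes the commonly published form of the criterion, in which each cluster's histogram has a size S (total item occurrences), a width W (number of distinct items), and N transactions, and the global profit is the N-weighted sum of S / W^r divided by the total number of transactions. The function names, the repulsion parameter `r`, and the toy sessions are assumptions made for this example, not details taken from the thesis.

```python
from collections import Counter


def cluster_stats(transactions):
    """Return (size S, width W, count N) of one cluster's item histogram.

    Each transaction is an iterable of items, e.g. the pages of one session.
    """
    counts = Counter(item for t in transactions for item in set(t))
    size = sum(counts.values())   # S: total item occurrences in the cluster
    width = len(counts)           # W: number of distinct items
    return size, width, len(transactions)


def profit(clusters, r=2.0):
    """Assumed CLOPE-style global criterion for a whole clustering.

    r is the (assumed) repulsion parameter: larger values favor
    narrower, taller histograms, i.e. tighter clusters.
    """
    numerator = 0.0
    total = 0
    for cluster in clusters:
        if not cluster:
            continue  # empty clusters contribute nothing
        size, width, n = cluster_stats(cluster)
        numerator += size / (width ** r) * n
        total += n
    return numerator / total if total else 0.0


# Hypothetical sessions; letters stand in for page identifiers.
clustering_a = [[["a", "b"], ["a", "b", "c"], ["a", "c", "d"]],
                [["d", "e"], ["d", "e", "f"]]]
clustering_b = [[["a", "b"], ["a", "b", "c"]],
                [["a", "c", "d"], ["d", "e"], ["d", "e", "f"]]]

print(profit(clustering_a), profit(clustering_b))
```

With `r = 2`, the first grouping scores roughly 0.52 and the second roughly 0.41, so under this assumed criterion the first grouping, whose clusters share more pages, is preferred.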