
Web Usage Mining Based on Logs

Posted on: 2009-09-06
Degree: Master
Type: Thesis
Country: China
Candidate: C J Wang
Full Text: PDF
GTID: 2178360245980101
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of Internet applications, demand for information services delivered over the Internet has risen sharply. People face more choices and more information than they can digest, a phenomenon known as information overload and information loss. Finding useful information effectively and putting it to use is a challenge, and it has made Web mining an active research area.

Data mining based on Web logs is a major branch of Web mining. Every Web site aims to help its users find the information they are interested in quickly and conveniently. A site that improves in this respect attracts more visitors, and whether it can provide personalized service is an important criterion for evaluating it. By mining Web logs, we can discover users' traversal patterns, which helps improve the site's structure and provide better service to users.

This thesis studies how to mine users' usage patterns from Web logs, and studies collaborative recommendation based on the query logs of a search engine. The main work is as follows:

1. Web usage mining is studied as a whole, covering data collection, data preparation, pattern discovery, pattern analysis, and applications.
2. The thesis presents the basic idea and procedure of the hard K-means and fuzzy K-means clustering algorithms. It studies the parameters of fuzzy K-means clustering and discusses the problem of center initialization in detail. It also presents an improved validity function that can be used effectively to find the optimal number of centers.
3. The thesis proposes an improved method for clustering Web users and URLs that integrates visit time and number of hits. Experiments on a real server log confirm the effectiveness of the algorithm.
4. The thesis studies a topic-attentive ranking algorithm used in search engine recommendation. It uses the query log to analyze keyword clustering, presents an improved similarity function, and verifies it on artificial data.

Finally, the thesis summarizes the author's work and discusses future directions.
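
For readers unfamiliar with the fuzzy K-means clustering mentioned in the abstract, the following is a minimal sketch of the standard textbook algorithm (often called fuzzy c-means). It is not the improved method or validity function proposed in the thesis, which the abstract does not specify. The function name `fuzzy_kmeans`, the random center initialization, the fuzzifier default `m=2.0`, and the sample feature vectors (imagined as visit time and number of hits per user) are all assumptions made for illustration.

```python
import math
import random


def fuzzy_kmeans(points, k, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy K-means (fuzzy c-means); hypothetical helper, not the thesis's variant."""
    rng = random.Random(seed)
    # Center initialization: k data points chosen at random (an assumption;
    # the thesis discusses initialization but the abstract gives no details).
    centers = [list(p) for p in rng.sample(points, k)]
    exponent = 2.0 / (m - 1.0)
    dim = len(points[0])
    memberships = []
    for _ in range(iters):
        # Membership update: each point's memberships sum to 1, and nearer
        # centers receive a larger share.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if min(dists) == 0.0:
                # The point coincides with one or more centers: split evenly among them.
                zeros = [d == 0.0 for d in dists]
                row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
            else:
                inv = [d ** -exponent for d in dists]
                total = sum(inv)
                row = [v / total for v in inv]
            memberships.append(row)
        # Center update: mean of all points weighted by membership ** m.
        new_centers = []
        for i in range(k):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            new_centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        # Stop once no center moves more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Made-up (visit time, hits) vectors forming two loose groups.
sample = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (8.0, 9.0), (8.2, 9.1), (7.9, 8.8)]
sample_centers, sample_memberships = fuzzy_kmeans(sample, k=2)
```

Unlike hard K-means, which assigns each point to exactly one cluster, each row of `sample_memberships` gives a degree of membership in every cluster. This sketch fixes `k` in advance; choosing the number of centers is the job of a validity function such as the one the thesis proposes.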
Keywords/Search Tags: Web usage mining, Web log, fuzzy clustering, center initialization, user and URL clustering