Research On Web Log Mining Based On Association Rule

Posted on: 2010-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Gao
Full Text: PDF
GTID: 2178360278981262
Subject: Computer application technology
Abstract/Summary:
With the spread of personal computers and the rapid development of Internet technology, more and more people search for and gather resources on the Internet to meet a variety of needs. Web servers record this behavior in log files. At the same time, with the growth of online activities and transactions and the arrival of mass storage devices, more and more log records are being written on Web servers. In-depth study of browsing behavior, analysis of Web site performance, and improvement of site topology and page hyperlink structure make better service possible. Web log mining emerged to meet this need.
Based on the Web log of the Xi'an University of Science and Technology 50th Anniversary celebration, this paper studies Web log mining from the following aspects.
First, it introduces the background of data mining, Web mining, and Web log mining. It then examines the content and format of the Web log in detail and presents the process of Web log mining.
Second, the paper studies data preprocessing in Web log mining and analyzes in detail the tasks in each phase of traditional data preprocessing. It then proposes an algorithm, built on traditional data preprocessing, that simplifies the preprocessing steps: it identifies user transactions directly from user sessions rather than by first completing the user path.
Third, the paper introduces the concepts of association rules and then a classic frequent-pattern algorithm for association rules, the Apriori algorithm, and shows through specific examples how the algorithm obtains frequent itemsets. It then proposes an improved Apriori algorithm based on Web topology structure and demonstrates its effectiveness through specific examples.
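As a rough illustration of how the classic Apriori algorithm obtains frequent itemsets, the following minimal Python sketch may help. It is not the thesis's implementation and does not include the improvement based on Web topology structure. The page names, the `sessions` data, and the `min_support` threshold are hypothetical; each transaction is assumed to be the set of pages visited in one user transaction.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Classic level-wise Apriori; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items (pages).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count support by scanning the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical user transactions: pages visited together.
sessions = [
    {"index", "news", "history"},
    {"index", "news"},
    {"index", "history"},
    {"news", "history"},
    {"index", "news", "history"},
]
frequent = apriori(sessions, min_support=3)
```

With this sample data, every single page and every pair of pages reaches the support threshold of 3, while the three-page itemset appears in only two transactions and is dropped.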
Finally, the paper introduces a method for deriving association rules from frequent itemsets, and designs and implements a simple prototype data mining system based on the preceding chapters. The system is used to obtain association rules from the Web logs, and the rules are then analyzed with the help of web page screenshots. The results show that mining based on association rules can reveal users' browsing habits and help improve web site design.
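The step from frequent itemsets to association rules can be sketched with the standard confidence measure, confidence(A => B) = support(A ∪ B) / support(A). The Python sketch below is an illustration under assumptions, not the prototype system described in the thesis; the `support` counts, the page names, and the `min_confidence` threshold are hypothetical.

```python
from itertools import combinations


def generate_rules(support, min_confidence):
    """Derive rules (antecedent, consequent, confidence) from support counts.

    `support` maps each frequent itemset (a frozenset) to its support count.
    It must also contain every non-empty subset of each itemset, which
    Apriori guarantees because subsets of frequent itemsets are frequent.
    """
    rules = []
    for itemset, count in support.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for r in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), r):
                antecedent = frozenset(subset)
                consequent = itemset - antecedent
                confidence = count / support[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, confidence))
    return rules


# Hypothetical support counts for pages of a site.
support = {
    frozenset({"index"}): 5,
    frozenset({"news"}): 4,
    frozenset({"history"}): 3,
    frozenset({"index", "news"}): 4,
    frozenset({"index", "history"}): 3,
}
rules = generate_rules(support, min_confidence=0.9)
```

Here only news => index and history => index pass the 0.9 threshold, each with confidence 1.0; index => news (0.8) and index => history (0.6) are filtered out.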
Keywords/Search Tags: Web log mining, Association rule, Data preprocessing, Apriori algorithm, Frequent itemsets