Research On Web Log Mining Based On Association Rule

Posted on:2010-03-21

Degree:Master

Type:Thesis

Country:China

Candidate:X Q Gao

Full Text:PDF

GTID:2178360278981262

Subject:Computer application technology

Abstract/Summary:

With the spread of personal computers and the rapid development of Internet technology, more and more people search for and gather resources on the Internet to meet a variety of needs. Web servers record this behavior in the form of logs. At the same time, with the growth of online activities and transactions and the advent of mass storage devices, Web servers accumulate ever more log records. In-depth study of browsing behavior, analysis of Web site performance, and improvement of site topology and page hyperlink structure make better service possible. Web Log Mining emerged to meet this need.
Based on the Web log of the Xi'an University of Science and Technology 50th Anniversary celebration, this paper studies Web Log Mining from the following aspects. First, it introduces Data Mining, Web Mining, and Web Log Mining, examines the content and format of the Web log in detail, and gives the process of Web Log Mining.
Second, the paper studies data preprocessing in Web Log Mining and analyzes in detail the tasks in each phase of traditional data preprocessing. It then proposes an algorithm, built on traditional data preprocessing, that simplifies the preprocessing steps: it identifies user transactions directly from user sessions rather than by first completing user paths.
Third, the paper introduces the concepts of association rules and then a classic frequent-pattern algorithm for association rules, the Apriori algorithm, showing through specific examples how it obtains frequent itemsets. The paper then proposes an improved Apriori algorithm based on Web topology structure and demonstrates its effectiveness through specific examples.
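As an illustration of how the classic Apriori algorithm obtains frequent itemsets, the following is a minimal Python sketch. It is not the paper's implementation and does not include the paper's topology-based improvement. The function name `apriori`, the page names, and the sessions are hypothetical, and support is assumed to be an absolute count of sessions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    transactions: iterable of collections of items (here, pages per session).
    min_support:  minimum number of transactions that must contain an itemset.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: scan the transactions once per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical user sessions: the set of pages visited in each session.
sessions = [
    {"index", "news", "history"},
    {"index", "news"},
    {"index", "history", "alumni"},
    {"news", "history"},
    {"index", "news", "history"},
]
frequent = apriori(sessions, min_support=3)
```

With these sessions and a minimum support of 3, the three single pages `index`, `news`, and `history` and all three of their pairs are frequent, while the triple appears in only two sessions and is discarded.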
Finally, the paper introduces a method for deriving association rules from frequent itemsets, and designs and implements a simple prototype data mining system based on the preceding chapters. Association rules are then mined from the Web logs and analyzed with the help of web page screenshots. The results show that mining based on association rules can reveal users' browsing habits and help improve Web site design.
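The step from frequent itemsets to association rules can be sketched as follows. This is a generic confidence-based rule generator, not necessarily the method used in the paper. The function name `generate_rules`, the page names, and the support counts are hypothetical, and the input is assumed to contain every subset of each frequent itemset, which holds for Apriori output.

```python
from itertools import combinations


def generate_rules(frequent, min_confidence):
    """Return a list of (antecedent, consequent, confidence) tuples.

    frequent: dict mapping frozenset itemsets to support counts; it must
              also contain all non-empty subsets of each itemset.
    """
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = count / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical support counts for two pages and their co-occurrence.
frequent_pages = {
    frozenset({"index"}): 5,
    frozenset({"news"}): 4,
    frozenset({"index", "news"}): 3,
}
rules = generate_rules(frequent_pages, min_confidence=0.7)
```

Here the rule `news -> index` has confidence 3/4 = 0.75 and is kept, while `index -> news` has confidence 3/5 = 0.6 and is dropped. A rule of this kind suggests that visitors of one page tend to visit another, which is the sort of evidence that can inform link placement.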

Keywords/Search Tags:

Web log mining, Association rule, Data preprocessing, Apriori algorithm, Frequent itemsets

Related items

1	A Frequent Itemsets Mining Algorithm Based On Apriori And FP-TREE
2	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application In Simulation System
3	Research On He Algorithm About Mining Association Rule
4	Association Rules Candidates To Support The Study Of The Frequency
5	Frequent Itemsets Mining Algorithm And Its Application In Data Flow
6	Research Of Association Mining
7	Research And Application On Association Rules Based Data Mining
8	The Research On Apriori Algorithm Based On USI And Fundamentality Of Item
9	Improvement For Apriori Algorithm Of Association Rule Mining
10	Research On Maximum Frequent Itemsets Based On Improved FP-tree