Research On Correlative Algorithms Of Web Usage Mining

Posted on: 2009-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: H Feng
GTID: 2178360245489168
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the digital resources available on it have become increasingly abundant. Thousands upon thousands of users browse and search the Internet for useful information every day. However, the enormous volume of information exchanged on the Internet makes it difficult for users to find what they need in a timely manner. Web mining techniques have emerged in response to this problem. In particular, Web usage mining, which targets Web server logs, has attracted considerable attention from researchers. Web logs record the visit information of Web site visitors; by analyzing them, we can therefore obtain the visitors' browsing behavior and visiting habits, which are significant for page reorganization, Web site structure optimization, Web system performance improvement, and the enhancement of e-commerce applications.

This thesis presents a systematic analysis of Web mining and Web usage mining. Building on existing research, it proposes two improved algorithms. The main work of this thesis is as follows:

(1) The basic theory and classification of Web mining are surveyed, and the basic ideas and classical algorithms of Web usage mining are then analyzed.

(2) Based on an analysis of Apriori, a classical association rule algorithm, a novel association rule mining algorithm based on a transaction matrix is put forward. The new algorithm maps the transaction database into a transaction matrix and operates on that matrix to obtain all the frequent itemsets. The performance advantage of the algorithm is demonstrated by theoretical analysis and by experiment. The algorithm can be applied in Web usage mining to effectively discover potential relations among users or pages, as well as the association between user traversal paths and user behavior on the Internet.

(3) Another novel algorithm, for mining user frequent traversal patterns, is proposed based on a digraph. The algorithm scans the transaction database only once, records the sequence information among all the Web pages in a digraph, and mines all the user frequent traversal patterns from that digraph. The mined patterns can be used to predict visits to Web pages, so that advertisements can be placed appropriately to suit specific user groups.
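The transaction-matrix idea in item (2) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual matrix operations, so everything below is an assumption made for illustration: each item (for example, a Web page) becomes one column of a 0/1 matrix with one row per transaction, each column is stored as a Python integer bitmask, and the support of an itemset is the number of rows left after AND-ing its columns. The names `build_columns` and `frequent_itemsets` are hypothetical.

```python
from itertools import combinations


def build_columns(transactions):
    """Map each item to a bitmask: bit i is set if transaction i contains the item.

    Together the bitmasks form the columns of a 0/1 transaction matrix.
    (Assumed representation; the thesis may store the matrix differently.)
    """
    columns = {}
    for row, items in enumerate(transactions):
        for item in set(items):
            columns[item] = columns.get(item, 0) | (1 << row)
    return columns


def count_rows(mask):
    """Number of transactions (set bits) covered by a column bitmask."""
    return bin(mask).count("1")


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    The database is read once to build the columns; all later support
    counts come from AND-ing columns rather than rescanning transactions.
    """
    columns = build_columns(transactions)
    # Level 1: frequent single items.
    current = {
        frozenset([item]): mask
        for item, mask in columns.items()
        if count_rows(mask) >= min_support
    }
    result = {}
    while current:
        for itemset, mask in current.items():
            result[itemset] = count_rows(mask)
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        next_level = {}
        for (a, mask_a), (b, mask_b) in combinations(current.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            mask = mask_a & mask_b  # rows that contain every item in candidate
            if count_rows(mask) >= min_support:
                next_level[candidate] = mask
        current = next_level
    return result


# Hypothetical sessions, each listing the pages visited.
sessions = [["A", "B", "C"], ["A", "B"], ["A", "C"], ["B", "C"], ["A", "B", "C"]]
for itemset, support in frequent_itemsets(sessions, min_support=3).items():
    print(sorted(itemset), support)
```

Unlike classical Apriori, which rescans the database at each level, this sketch reads the transactions once and derives every later count from the stored columns. This is the kind of saving the abstract attributes to the matrix approach, although the thesis's own operations may differ.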
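The digraph-based approach in item (3) can be sketched in a similar spirit. The abstract does not say what information the thesis stores on the digraph, so this sketch makes its own assumptions: each edge (u, v) is annotated with the (session, position) pairs at which page v directly follows page u, a traversal pattern is a contiguous path of pages, and its support is the number of distinct sessions containing it. The names `build_digraph` and `frequent_paths` are hypothetical.

```python
from collections import defaultdict


def build_digraph(sessions):
    """Single scan of the sessions.

    Returns a dict mapping edge (u, v) to the set of (session_id, position)
    pairs where u occurs at `position` and v at `position + 1`.
    """
    edges = defaultdict(set)
    for sid, pages in enumerate(sessions):
        for pos in range(len(pages) - 1):
            edges[(pages[pos], pages[pos + 1])].add((sid, pos))
    return edges


def session_support(occurrences):
    """Number of distinct sessions among (session_id, position) occurrences."""
    return len({sid for sid, _ in occurrences})


def frequent_paths(sessions, min_support):
    """Return {path_tuple: support} for all frequent contiguous paths (length >= 2)."""
    edges = build_digraph(sessions)
    frequent_edges = {
        edge: occ for edge, occ in edges.items()
        if session_support(occ) >= min_support
    }
    # Adjacency list restricted to frequent edges.
    successors = defaultdict(list)
    for u, v in frequent_edges:
        successors[u].append(v)

    result = {}
    frontier = dict(frequent_edges)  # path -> set of (session_id, start_position)
    while frontier:
        next_frontier = {}
        for path, occ in frontier.items():
            result[path] = session_support(occ)
            last = path[-1]
            offset = len(path) - 1  # index of the last page relative to the start
            for nxt in successors[last]:
                edge_occ = frequent_edges[(last, nxt)]
                # Keep the occurrences that continue along edge (last, nxt).
                extended = {
                    (sid, pos) for sid, pos in occ
                    if (sid, pos + offset) in edge_occ
                }
                if session_support(extended) >= min_support:
                    next_frontier[path + (nxt,)] = extended
        frontier = next_frontier
    return result


# Hypothetical click-stream sessions.
sessions = [["A", "B", "C", "D"], ["A", "B", "C"], ["B", "C", "D"], ["A", "B", "D"]]
for path, support in frequent_paths(sessions, min_support=2).items():
    print(" -> ".join(path), support)
```

Storing positions on the edges lets longer paths be counted exactly without returning to the log, which matches the single-scan property stated in the abstract. A structure that kept only edge counts would need either approximation or a second scan.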
Keywords/Search Tags:Web Mining, Web Usage Mining, Association Rules, Transaction matrix, Sequential Patterns, Digraph, User Frequent Traversal Patterns