
The Research On Web Usage Mining Based On Frequent Access Pattern Tree

Posted on: 2007-04-17  Degree: Master  Type: Thesis
Country: China  Candidate: Y Q Yan  Full Text: PDF
GTID: 2178360185465540  Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, the Web has become a large information resource, but the conflict between limited human attention and unlimited information is notable. Web usage mining is a useful method for discovering user preferences and behavioral characteristics from Web navigation data. It is important for a Web site's operation and management, for conducting e-commerce, and for attracting users.

Web mining is a popular research topic that combines technologies and methods from data mining and the WWW. Generally speaking, Web mining covers three research domains: Web content mining, Web structure mining, and Web usage mining. The purpose of Web usage mining is to discover the behavior patterns of a site's visitors, to improve the structure of the site and the hyperlink structure among its pages, and to enhance the quality of Web services.

First, this paper introduces the background, the concepts, and the relationships of data mining, Web data mining, and Web usage mining, and from these derives the necessity and significance of Web usage mining. Second, the paper describes the general method and process of Web usage mining, and points out that data preprocessing is the foundation, access pattern mining is the core, and pattern analysis and presentation are the goal of Web usage mining.

In the data preprocessing stage, this paper uses a time-and-reference heuristic to construct sessions, which combines the time-based heuristic with the reference-based heuristic. The method uses not only the timing characteristics of a user's sessions on a Web site but also the user's navigation characteristics.

For Web navigation prediction, the Markov navigation model has been found well suited to the problem. Although higher-order Markov models predict quite well, they still have shortcomings, such as a higher state order and greater complexity. Based on an analysis of these drawbacks, this paper proposes the Frequent Access Pattern Tree (FAPT) algorithm.
The algorithm has two steps. The first is the access pattern tree method: using pattern matching, it stores users' visit sequences in a tree. The second is the pruning method: it uses the frequent degree to prune the parts of the access pattern tree that fall below the frequent degree. The results indicate that the FAPT algorithm is effective and practicable, and that it meets our expectations.
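The session construction step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact rule, so everything here is an assumption: the `Request` record, the `build_sessions` function, the 600-second `max_gap`, and the choice to start a new session when either the time gap is too long or the referrer is not a page already in the current session.

```python
from typing import List, NamedTuple, Optional


class Request(NamedTuple):
    """One log entry for a single user (hypothetical record layout)."""
    timestamp: float          # seconds since an arbitrary epoch
    url: str
    referrer: Optional[str]   # None when the log carries no referrer


def build_sessions(requests: List[Request], max_gap: float = 600.0) -> List[List[str]]:
    """Split one user's requests into sessions.

    Assumed rule: a request opens a new session when the gap since the
    previous request exceeds max_gap (time-based heuristic), or when its
    referrer is known but is not a page of the current session
    (reference-based heuristic). A missing referrer never splits a session.
    """
    sessions: List[List[str]] = []
    current: List[str] = []
    last_time: Optional[float] = None
    for req in sorted(requests, key=lambda r: r.timestamp):
        timed_out = last_time is not None and req.timestamp - last_time > max_gap
        unlinked = req.referrer is not None and req.referrer not in current
        if current and (timed_out or unlinked):
            sessions.append(current)
            current = []
        current.append(req.url)
        last_time = req.timestamp
    if current:
        sessions.append(current)
    return sessions
```

The threshold and the treatment of missing referrers are tunable choices; the thesis itself may combine the two heuristics differently.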
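The two FAPT steps described above, storing visit sequences in a tree and pruning by frequent degree, can be sketched roughly as follows. This is not the thesis's implementation: the `Node` class, the function names, the use of a plain prefix tree, and the reading of "frequent degree" as a minimum visit count (`min_count`) are all assumptions made for illustration. The `predict_next` helper is likewise a hypothetical way to use the pruned tree for navigation prediction.

```python
from typing import Dict, Iterable, List, Optional


class Node:
    """A tree node: a page plus the number of sessions passing through it."""

    def __init__(self, page: Optional[str] = None) -> None:
        self.page = page
        self.count = 0
        self.children: Dict[str, "Node"] = {}


def build_tree(sessions: Iterable[List[str]]) -> Node:
    """Step 1 (assumed form): insert each visit sequence into a prefix tree."""
    root = Node()
    for session in sessions:
        node = root
        for page in session:
            if page not in node.children:
                node.children[page] = Node(page)
            node = node.children[page]
            node.count += 1
    return root


def prune(node: Node, min_count: int) -> None:
    """Step 2 (assumed form): drop subtrees whose count is below min_count."""
    node.children = {
        page: child for page, child in node.children.items()
        if child.count >= min_count
    }
    for child in node.children.values():
        prune(child, min_count)


def predict_next(root: Node, prefix: List[str]) -> Optional[str]:
    """Follow prefix from the root and return the most frequent next page."""
    node = root
    for page in prefix:
        if page not in node.children:
            return None
        node = node.children[page]
    if not node.children:
        return None
    return max(node.children.values(), key=lambda c: c.count).page
```

Because this sketch only stores sequences from the start of each session, it matches a prefix exactly; a fuller version might also insert session suffixes so that predictions can start from any point in a visit.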
Keywords/Search Tags:Data Mining, Web Usage Mining, Frequent Access Pattern Tree, Frequent Degree