Research And Application Of Data Mining In Web Usage Pattern

Posted on: 2004-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: X Ge
Full Text: PDF
GTID: 2168360092492576
Subject: Control theory and control engineering
Abstract/Summary:
Data mining and the World Wide Web are two important and active areas of current research. Their natural combination, Web mining, is a new research field. Web Usage Mining discovers and analyzes useful information, extracts knowledge from the WWW, and helps improve Web site design and provide personalized services. This thesis consists of four parts that systematically study the techniques of Web Usage Mining. The first part summarizes the techniques of data mining and Web Usage Mining and presents the significance of research on Web Usage Mining, its current status, and the problems it faces. The second part discusses Web Usage Mining following the stages of the Web mining process. For the data preparation and preprocessing stage, we discuss the algorithms for data cleaning and for user and session identification in detail; for the pattern discovery stage, we present a data model for association rules and sequential patterns; for the final stage, we discuss useful methods of pattern analysis. The third part proposes a synthesized clustering algorithm, CPPC.
In the preprocessing stage, methods for user and session identification often rely on heuristic algorithms because of caches and proxies. This introduces uncertainty into the data source. The CPPC algorithm avoids this limitation and does not need a complicated hash data structure. The algorithm constructs a UserID-URL relevance matrix: similar customer groups are discovered by measuring the similarity between column vectors, and relevant Web pages are obtained by measuring the similarity between row vectors; frequent access paths can also be discovered by further processing the latter. Experiments show the effectiveness of the algorithm. In the fourth part, this thesis brings some key techniques of data mining into Web Usage Mining and, drawing on the characteristics of relational database design, designs and implements a Web Usage Mining system, WLGMS, with visualization functions. It can provide users with decision support and is practical. Finally, we outline some future research directions for Web mining technology in light of its current state of development.
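The matrix idea behind CPPC can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say which similarity measure, matrix entries, or grouping rule CPPC uses, so the sketch assumes access counts as entries, cosine similarity, and a fixed threshold. The names `build_matrix`, `cosine`, and `similar_pairs` and the sample log are hypothetical.

```python
from math import sqrt

def build_matrix(log):
    """Build a URL-by-user matrix: matrix[i][j] = hits of users[j] on urls[i].

    Rows are URLs and columns are users, matching the abstract's
    column-vector (user) and row-vector (page) comparison.
    """
    urls = sorted({url for _, url in log})
    users = sorted({user for user, _ in log})
    row = {u: i for i, u in enumerate(urls)}
    col = {u: j for j, u in enumerate(users)}
    matrix = [[0] * len(users) for _ in urls]
    for user, url in log:
        matrix[row[url]][col[user]] += 1
    return urls, users, matrix

def cosine(a, b):
    """Cosine similarity (an assumed measure); 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similar_pairs(labels, vectors, threshold):
    """Return label pairs whose vectors reach the (assumed) similarity threshold."""
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                pairs.append((labels[i], labels[j]))
    return pairs

# Hypothetical (user_id, url) records taken from a cleaned access log.
log = [
    ("u1", "/a"), ("u1", "/b"),
    ("u2", "/a"), ("u2", "/b"),
    ("u3", "/c"),
]
urls, users, m = build_matrix(log)
columns = [list(c) for c in zip(*m)]             # one vector per user
user_pairs = similar_pairs(users, columns, 0.8)  # candidate customer groups
page_pairs = similar_pairs(urls, m, 0.8)         # candidate related pages
print(user_pairs, page_pairs)
```

Because the matrix is built directly from user-URL pairs, the sketch needs no session reconstruction, which is in the spirit of the limitation the abstract says CPPC avoids. The further processing that yields frequent access paths is not described in the abstract and is not attempted here.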
Keywords/Search Tags: Data Mining, Web Mining, Web Usage Mining, Clustering, Frequent Access Path, CPPC