Study And Application For Web Log Mining

Posted on: 2007-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: J X Zhang
GTID: 2178360182497586
Subject: Computer software and theory
Abstract/Summary:
Data mining is both a new information technology that has emerged in recent years from advances in database and artificial intelligence technology, and a problem made urgent by the development and use of computer science and technology, especially computer networks. Data mining is the process of extracting patterns from data so that people can ultimately recognize the real value of the data, namely the information and knowledge it contains. Data mining technology lets enterprises discover the regularities implicit in their data and provides a reliable basis for enterprise decision-making.
The Web contains a large amount of semi-structured data, whereas data mining must rest on well-structured data. Even when relevant data can be obtained from the Web, mining and analyzing it is quite difficult, so Web-oriented mining is more complicated than mining a single data warehouse. A traditional database has a definite data model by which its data can be described, and it comes with well-defined query languages. With the wide application of the WWW and the Internet, data mining on differently structured data sources has appeared, such as text mining, time-series data mining, and mining of e-commerce system data. With the development of database technology, multimedia database mining, spatial database mining, and related areas have also attracted wide attention. The Internet has developed rapidly, in particular through the worldwide spread of the Web and the enormous amount of information it holds.
Through Web mining, we can extract useful knowledge from Web pages: by analyzing the content users access, their visiting behavior, and their visit frequency, we can obtain general knowledge of user behavior and patterns and use it to improve Web services. More importantly, understanding and analyzing user characteristics can help develop e-commerce activities. As a confluence of data mining and WWW technologies, it is possible to perform data mining on Web log records collected from the access history of Web pages. Web Usage Mining is the application of data mining techniques to discover usage patterns from Web data in order to understand and serve the needs of Web-based applications. It is needed to optimize the structure of a Web site and to provide personalized services. Web Usage Mining is currently a hot topic in data mining and one of the major topics in Web log mining. The ultimate goal of this thesis is to find more meaningful sequential patterns.
This thesis reviews the processes of data mining, Web data mining, and Web log mining, and then focuses on the methods and techniques of Web log mining. In the data preprocessing phase, a visit-transaction division method based on the Maximal Forward Reference and Time Window models is used. For pattern mining, the thesis proposes a Session Matrix and a Trace Matrix and, based on an analysis of the Apriori algorithm and a graph storage structure, designs a fast algorithm for mining frequent user paths.
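The abstract does not say how the Maximal Forward Reference and Time Window models are combined, so the following is only a minimal sketch under stated assumptions: one user's clicks are first cut into groups whose time span does not exceed a window, and each group is then divided into maximal forward references (a path is emitted whenever the user steps back to a page already on the current path). The function names, the `(timestamp, page)` input shape, and the 1800-second window are illustrative choices, not taken from the thesis.

```python
from typing import List, Tuple

Click = Tuple[int, str]  # (timestamp in seconds, page id); assumed input shape


def split_by_time_window(clicks: List[Click], window: int) -> List[List[str]]:
    """Group one user's time-ordered clicks so that each group spans at most `window` seconds."""
    groups: List[List[str]] = []
    current: List[str] = []
    start = None  # timestamp of the first click in the current group
    for ts, page in clicks:
        if start is None or ts - start > window:
            if current:
                groups.append(current)
            current = []
            start = ts
        current.append(page)
    if current:
        groups.append(current)
    return groups


def maximal_forward_references(pages: List[str]) -> List[List[str]]:
    """Divide a page sequence into maximal forward reference paths."""
    paths: List[List[str]] = []
    path: List[str] = []
    moving_forward = False
    for page in pages:
        if page in path:
            # Backward reference: emit the path only if it was just extended,
            # then truncate it back to the revisited page.
            if moving_forward:
                paths.append(list(path))
            path = path[: path.index(page) + 1]
            moving_forward = False
        else:
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        paths.append(list(path))
    return paths


def divide_transactions(clicks: List[Click], window: int = 1800) -> List[List[str]]:
    """Assumed combination: time window first, then maximal forward references."""
    transactions: List[List[str]] = []
    for group in split_by_time_window(clicks, window):
        transactions.extend(maximal_forward_references(group))
    return transactions


clicks = [(0, "A"), (10, "B"), (20, "C"), (30, "B"), (40, "D"),
          (4000, "A"), (4010, "E")]
print(divide_transactions(clicks))
# [['A', 'B', 'C'], ['A', 'B', 'D'], ['A', 'E']]
```

In this sketch the step back from C to B closes the path A, B, C, and the long pause before the second visit to A starts a new group. The resulting transactions are the kind of input a frequent-path mining step would consume.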
Keywords/Search Tags: data mining, web log mining, sequential pattern recognition