Research on Related Algorithms Used in a Web Log Mining System

Posted on: 2006-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: S H Jin
GTID: 2168360155955224
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology, the network has become an effective platform for exchanging and processing information, and the volume of digital information grows daily. With so much information on the Internet, finding what a particular user wants has become a difficult problem, often described as "rich data, poor information". Web data mining emerged in response to this problem, and Web log mining in particular has attracted the attention of many researchers. By mining Web logs, we can obtain customers' browsing patterns.

In obtaining these browsing patterns, Web log preprocessing is the first problem to solve. Because the traditional preprocessing procedure does not eliminate the influence of frame pages, the interestingness of the discovered patterns is low. The author therefore proposes a page filtering algorithm and applies it in the Web log preprocessing phase.

After data preprocessing, a data mining technique such as clustering, classification, or association rules can be selected according to the concrete requirements. Our aim is to find groups of similar users according to their browsing behavior, and to find groups of related pages according to the Web pages that users visit, so clustering is selected as the data mining technique in this thesis. The thesis first briefly introduces existing clustering techniques. After analyzing a typical clustering algorithm in detail, the author finds that it has disadvantages in space and time complexity. The author therefore proposes a fast matrix-based clustering algorithm, the Marker Propagation Algorithm, which is used to cluster users and pages rapidly. Finally, data are used to verify the validity of the page filtering algorithm and the Marker Propagation Algorithm.
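The abstract does not give the details of either algorithm, so the following is only a minimal illustrative sketch of the general pipeline it describes: filter frame pages out of the log, build a user-page matrix, and group users over that matrix. Everything here is an assumption made for illustration: the `(user, page)` record format, the hypothetical `FRAME_PAGES` set, and the grouping step, which is a plain connected-components grouping by repeated label propagation and is not the thesis's Marker Propagation Algorithm.

```python
from collections import defaultdict

# Hypothetical set of frame (container) pages. The thesis's actual
# filtering rule is not stated in the abstract; a fixed set is assumed.
FRAME_PAGES = {"/index.html", "/frameset.html"}


def filter_frame_pages(records, frame_pages=FRAME_PAGES):
    """Drop (user, page) log records whose page is a known frame page."""
    return [(user, page) for user, page in records if page not in frame_pages]


def build_user_page_matrix(records):
    """Return (users, pages, matrix); matrix[i][j] is 1 if user i visited page j."""
    users = sorted({user for user, _ in records})
    pages = sorted({page for _, page in records})
    row = {user: i for i, user in enumerate(users)}
    col = {page: j for j, page in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in users]
    for user, page in records:
        matrix[row[user]][col[page]] = 1
    return users, pages, matrix


def group_users_by_shared_pages(users, matrix):
    """Group users who share a page directly or transitively.

    Stand-in technique: each user starts with its own label, and the
    smallest label among the visitors of a page is pushed to all of
    them until nothing changes (connected components).
    """
    labels = list(range(len(users)))
    page_count = len(matrix[0]) if matrix else 0
    changed = True
    while changed:
        changed = False
        for j in range(page_count):
            visitors = [i for i in range(len(users)) if matrix[i][j]]
            if not visitors:
                continue
            smallest = min(labels[i] for i in visitors)
            for i in visitors:
                if labels[i] != smallest:
                    labels[i] = smallest
                    changed = True
    groups = defaultdict(list)
    for i, label in enumerate(labels):
        groups[label].append(users[i])
    return sorted(groups.values())


def cluster_users(records, frame_pages=FRAME_PAGES):
    """Run the whole sketch: filter, build the matrix, group users."""
    users, _, matrix = build_user_page_matrix(filter_frame_pages(records, frame_pages))
    return group_users_by_shared_pages(users, matrix)
```

In this sketch, a frame page that every visitor requests would link all users into a single group if it were left in the log, which is one way to see why unfiltered frame pages lower the interestingness of the resulting patterns.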
Keywords/Search Tags: Data Mining, Web Log Mining, Data Preprocessing, Frame-page Filter Algorithm, Marker Propagation Algorithm