Research On Web Data Mining Based On The Analysis Of The Campus Network Log

Posted on: 2012-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: R R Shang
GTID: 2248330395955695
Subject: Computer technology
Abstract/Summary:
With the development of the Internet, the sheer volume of information available online has become overwhelming, a phenomenon referred to as information overload. The diversity of that information makes it even harder for users to find what they want. Users lack effective ways to locate relevant information and easily get lost, a problem known as information bewilderment. At present, information retrieval relies primarily on search engines. Most search engines search passively and disregard the preferences and specific interests of individual users, so they cannot effectively solve the problems of information overload and information bewilderment.

As the amount of information on the Internet grows at an exponential rate, finding potentially useful and interesting knowledge in enormous volumes of data has become an important and meaningful problem. Web data mining is an active research area that combines techniques and methods from data mining with the World Wide Web. In general, Web mining comprises Web content mining, Web structure mining, and Web usage mining.

This paper systematically describes the progression from data mining to Web data mining and Web log mining, together with the techniques of data preprocessing, pattern discovery, and pattern analysis in Web log mining. Its main focus is improving the preprocessing stage of Web data mining. A detailed experimental scheme is proposed on the basis of the theoretical improvement, and a Web data mining system based on campus network log analysis is designed.

The main content of this paper is as follows. First, we present the significance and background of the work and the current state of research at home and abroad; we then summarize data mining, Web data mining, and Web log mining and show the relationships among them. Second, we study data preprocessing in Web log mining and analyze in detail the tasks in each phase of traditional preprocessing; we then propose an algorithm based on frame page filtering, which builds on traditional preprocessing and simplifies its steps. Experiments indicate that the algorithm improves the speed of preprocessing without lowering its accuracy. Finally, several algorithms typically used in data mining, such as the Apriori algorithm for association rules, are presented, and their results are compared with those of other general methods. The remainder of the paper introduces the procedure of Web log mining with some examples, and closes with a review of our work and the conclusions.
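
As a rough illustration of the pipeline the abstract outlines (log cleaning, a simple form of user identification, then Apriori-style pattern discovery), the following is a minimal Python sketch. It is not the algorithm of this paper: the frame page filtering method is not specified in the abstract and is not reproduced here. The Common Log Format input, the static-file suffix list, the use of the client host as a stand-in for a user session, and all function names are assumptions made for illustration only.

```python
import re
from collections import defaultdict
from itertools import combinations

# Assumed input: Common Log Format,
#   host ident user [time] "METHOD path PROTOCOL" status size
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+')

# Assumed list of embedded resources that carry no navigation information.
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def clean(lines):
    """Data cleaning: yield (host, path) for successful GET page requests."""
    for line in lines:
        match = LOG_RE.match(line)
        if match is None:
            continue  # skip malformed lines
        host, _time, method, path, status = match.groups()
        if method != "GET" or not status.startswith("2"):
            continue  # keep only successful page fetches
        if path.lower().split("?")[0].endswith(STATIC_SUFFIXES):
            continue  # drop images, stylesheets, scripts
        yield host, path


def pages_by_host(records):
    """Crude user identification: one set of visited pages per client host."""
    pages = defaultdict(set)
    for host, path in records:
        pages[host].add(path)
    return list(pages.values())


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count the support of each candidate k-itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

Calling `apriori(pages_by_host(clean(log_lines)), min_support)` on an iterable of raw log lines returns the page sets that are visited together by at least `min_support` hosts. A real system would replace the host-based grouping with proper user and session identification, which the abstract lists among the preprocessing tasks.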
Keywords/Search Tags: Web data mining, Web log mining, data preprocessing, personalized recommendation