
Technologies Research On Web Usage Mining Algorithm

Posted on: 2013-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: B Xie
Full Text: PDF
GTID: 2248330371488973
Subject: Computer application technology
Abstract/Summary:
With the rapid development of computer and communication technology, the World Wide Web has become an indispensable information service center for people's study and work. Millions of Web pages are added every day, and Web information is itself often incomplete, noisy, and fuzzy. Users who rely on simple keyword matching often cannot get the information they need in a timely and accurate way. How to help Web users pick out the information they are interested in from this mass of data, and how to tailor network services to different user groups, has become an urgent question, so Web data mining has become a research focus. Within it, Web usage mining based on Web server logs is a field that researchers in China and abroad are concentrating on. Records of users' visits to a website are stored in the Web server log. By analyzing these logs we can discover Web users' access path patterns, which can guide service providers in planning their websites sensibly and so enhance the websites' commercial value.

This thesis analyzes the theory and technology of data mining and Web usage mining, and proposes three improved algorithms in light of the existing technology. The work is divided into the following four aspects:

(1) Starting from a systematic analysis of the basic theory of data mining, the thesis studies the basic ideas and mining algorithms of Web usage mining.

(2) For association rules, building on the Apriori algorithm, the thesis proposes an improved Apriori algorithm for discovering Web users' access path patterns and finding semantic frequent item sets. Experimental results verify that the improved algorithm makes transaction identification in Web data preprocessing more efficient.

(3) For Web classification, the thesis studies the classic decision tree algorithms, highlighting the defects of their attribute selection measures and the advantages and disadvantages of the ID3 and C4.5 algorithms. It proposes an improved random forest algorithm to address decision tree overfitting. Theoretical analysis and experimental results show the superiority of the improved random forest algorithm.

(4) For Web user clustering, the thesis introduces some classic methods for measuring similarity between objects in clustering and tries the support vector machine (SVM) method for clustering Web users. For identifying the characteristics of user information, it systematically analyzes the role and influence of object attributes in combination.
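As a rough illustration of the Apriori-based step in aspect (2), the sketch below runs the classic Apriori procedure over a handful of Web sessions to find sets of pages that are frequently visited together. It is not the improved algorithm proposed in the thesis, whose details are not given in this abstract. The function name `apriori`, the sample sessions, and the `min_support` threshold are all assumptions made for illustration, and the sketch ignores the order in which pages are visited.

```python
from itertools import combinations


def apriori(sessions, min_support):
    """Return {frozenset(pages): support_count} for all frequent page sets.

    Classic Apriori on unordered page sets; hypothetical helper, not the
    thesis's improved variant.
    """
    sessions = [frozenset(s) for s in sessions]

    # Level 1: count how many sessions contain each single page.
    counts = {}
    for session in sessions:
        for page in session:
            key = frozenset([page])
            counts[key] = counts.get(key, 0) + 1
    current = {k: v for k, v in counts.items() if v >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-sets into k-set candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count support of each surviving candidate against the sessions.
        counts = {c: sum(1 for s in sessions if c <= s) for c in candidates}
        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Made-up sessions, as if already extracted from a Web server log.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/products", "/cart"],
]
result = apriori(sessions, min_support=2)
for pages, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(pages), support)
```

With a minimum support of 2, the pair `/home`, `/products` is kept while `/home`, `/cart` is dropped, so no three-page candidate survives the prune step. A path-oriented variant would need to preserve visit order within each session, which this sketch deliberately leaves out.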
Keywords/Search Tags: Web usage mining, association rules, classification, clustering