Research Of A Type Of Web Log Mining’s Clustering Algorithm

Posted on: 2013-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: T Wang
GTID: 2248330395955355
Subject: Computer software and theory
Abstract/Summary:
With the arrival of the information age and the rapid development of electronic commerce, online commercial activity has become increasingly frequent, and data mining has extended into Web data mining, whose emergence marks a major change in how business is conducted. On the one hand, Web service providers constantly try to learn the interests and preferences of their users in order to offer more targeted services. On the other hand, there is growing concern with how to find potentially valuable information quickly and efficiently within the mass of information on the network. Web pages, however, are far more complex than plain text documents, because the Web is dynamic and unstructured. Web log mining combines traditional data mining techniques with Web technologies to mine and analyze server logs and to discover association rules in large volumes of data, addressing the issues raised above. Web data mining, the combination of data mining technology with Internet applications, has become a focus of the data mining field. Web application mining, also called Web log mining, is an important branch of Web data mining. Many classic algorithms exist for it; one classic clustering algorithm is the Hamming distance algorithm, which has achieved some success but also has shortcomings.

This thesis first introduces the categories and methods of data mining and Web data mining. It then analyzes the shortcomings of the Hamming distance clustering algorithm and improves the traditional algorithm by introducing a bipartite graph into the clustering procedure. This improvement raises the accuracy of the clustering algorithm. In addition, the data in the database are optimized during data transfer, so the procedure saves considerable time otherwise spent on repeated calculation and transfer. Experiments and analysis of the results show that the improved Hamming distance algorithm is feasible and effective.
Keywords/Search Tags:Web Application Mining, Hamming Distance Clustering, Bipartite Graph
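
The abstract does not spell out the improved algorithm, so the following is only a minimal sketch of the general idea it names: treating the log as a user/page bipartite graph and grouping users by the Hamming distance between their visit patterns. The toy log, the function names, the single-link merging rule, and the threshold value are all assumptions made for illustration and are not taken from the thesis.

```python
from itertools import combinations

# Hypothetical toy log: each (user, page) pair is one edge of a
# user/page bipartite graph. Not data from the thesis.
visits = [
    ("u1", "/home"), ("u1", "/cart"),
    ("u2", "/home"), ("u2", "/cart"),
    ("u3", "/news"), ("u3", "/blog"),
    ("u4", "/news"), ("u4", "/blog"), ("u4", "/home"),
]

def build_bipartite(edges):
    """Map each user to the set of pages that user visited."""
    graph = {}
    for user, page in edges:
        graph.setdefault(user, set()).add(page)
    return graph

def hamming(a, b):
    """Hamming distance between two binary visit vectors stored as sets:
    the number of pages visited by exactly one of the two users."""
    return len(a ^ b)

def cluster(graph, threshold):
    """Assumed merging rule: users whose distance is <= threshold end up
    in the same group (single-link grouping via union-find)."""
    parent = {u: u for u in graph}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path compression
            u = parent[u]
        return u

    for u, v in combinations(sorted(graph), 2):
        if hamming(graph[u], graph[v]) <= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for u in graph:
        groups.setdefault(find(u), set()).add(u)
    return sorted(sorted(g) for g in groups.values())

clusters = cluster(build_bipartite(visits), threshold=1)
print(clusters)  # [['u1', 'u2'], ['u3', 'u4']]
```

Storing each user's pages as a set avoids building full-length binary vectors, so the distance computation touches only pages that at least one of the two users actually visited. Whether the thesis uses a comparable representation is not stated in the abstract.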