
Research And Implementation Of Web Log Mining Based On Association Rules Apriori Algorithm

Posted on: 2013-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: N Yang
Full Text: PDF
GTID: 2248330377950021
Subject: Computer application technology
Abstract/Summary:
Data mining is the process of extracting interesting knowledge and implicit patterns from data. Guided by the purpose of the mining task, we can mine and analyze knowledge from the large amounts of raw data held in data warehouses and databases. With the rapid development of the Internet, data mining has been applied to large, complex Web data to obtain urgently needed, potentially valuable information, giving rise to Web data mining. Within Web data mining, Web log mining is a particularly important research subject. By mining Web logs, we can make full use of the many log files on a Web server to discover users' page access patterns and browsing habits, which helps e-commerce site administrators optimize page structure and provide more convenient, personalized service.

In data mining, how to find meaningful associations or correlations among the itemsets in a data warehouse is a key research topic. The Apriori algorithm proposed by Agrawal is one of the most influential recursive algorithms for mining frequent itemsets. This thesis studies the performance bottleneck of the Apriori algorithm for association rule mining, proposes an optimized solution, and designs a novel, improved Apriori algorithm. The new method is then applied to a newly developed Web log mining application system to carry out the whole process of discovering association rules from Web logs.

The Apriori association rule algorithm must scan the transaction database repeatedly, which generates large candidate sets and a heavy I/O load. The algorithm is therefore costly in both time and space, which reduces its efficiency when association rules are generated. To address these problems, an optimization method is put forward, based on an in-depth analysis and study of the Apriori algorithm.
This new method reduces the number of transactions in the transaction database, thereby improving the efficiency of scanning it.

The data for Web log mining come mainly from Web server logs, which contain a large amount of minable information but are also incomplete and noisy. To obtain useful association rule models, these data need to be preprocessed. This thesis systematically analyzes the development of data preprocessing for Web log mining and puts forward a set of log data preprocessing methods to improve the quality of the data. We then run Web log mining, build an association model using the improved Apriori association rule algorithm, and analyze and evaluate the resulting mining model.

In this thesis, a Web log mining system is designed and implemented with the Struts2 + Spring 2.5 + Hibernate 3.2 frameworks. It displays the association rule model graphically on the page, giving data analysts a simple and intuitive way to carry out data mining.

The Web log mining platform implemented in this thesis supports the entire log mining process. Users can easily import a data set and preprocess it through the Web page, and obtain the required association rule model by entering the minimum support and minimum confidence. They can also carry out model analysis and model evaluation, which helps them suggest how to optimize the site.
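To make the idea of reducing the transaction database concrete, the following is a minimal Python sketch, not the thesis's actual algorithm. The abstract does not state the exact reduction rule, so the rule used here is an assumption: after pass k, a transaction is dropped if it has fewer than k+1 items or contains no frequent k-itemset, since it cannot support any larger itemset. The function names `apriori_reduced` and `rules`, the use of an absolute count for minimum support, and the sample sessions are all hypothetical.

```python
from itertools import combinations


def apriori_reduced(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets.

    min_support is an absolute transaction count (an assumption of this sketch).
    """
    active = [frozenset(t) for t in transactions]

    # Pass 1: count single items over the full database.
    counts = {}
    for t in active:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 1
    while current:
        # Transaction reduction (assumed rule): keep a transaction only if it
        # has at least k+1 items and contains some frequent k-itemset.
        active = [t for t in active
                  if len(t) > k and any(s <= t for s in current)]

        # Candidate generation: join frequent k-itemsets, then prune any
        # candidate that has an infrequent k-subset.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in current for sub in combinations(union, k)
            ):
                candidates.add(union)

        # Count candidates against the reduced database only.
        counts = {c: sum(1 for t in active if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules above the threshold."""
    found = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its count is already in the table.
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, confidence))
    return found


# Hypothetical user sessions, each a set of visited pages.
sessions = [
    {"/home", "/products", "/cart"},
    {"/home", "/products"},
    {"/home", "/cart"},
    {"/products", "/cart"},
    {"/home", "/products", "/cart"},
    {"/about"},
]
frequent_itemsets = apriori_reduced(sessions, min_support=3)
association_rules = rules(frequent_itemsets, min_confidence=0.7)
```

Here each later pass scans only the shrinking `active` list rather than the full database, which is where the saving in scan cost would come from. The two thresholds correspond to the minimum support and minimum confidence that the platform's users enter.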
Keywords/Search Tags:Data Mining, Apriori Algorithm, Web Log Mining, Frequent Itemset, Association Rules