Research And Application Of Web Log Mining Based On Association Rule Of Cluster-partitioning

Posted on: 2015-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: M Shi
Full Text: PDF
GTID: 2298330452950779
Subject: Computer application technology
Abstract/Summary:
With the rapid growth and proliferation of e-commerce, Web services, and Web-based information systems, the volume of Web log data collected by Web-based organizations in their daily operations has reached astronomical proportions. Discovering the regular patterns and knowledge hidden in this data has a significant impact on applications such as user access analysis, advertising, and personalization services. This thesis aims to find such patterns and knowledge in the semi-structured data of Web log datasets. An improved method based on cluster-partitioning is proposed, built on association-rule mining of Web log data.

Web log mining refers to the automatic discovery and analysis of patterns in clickstream and associated data collected or generated as a result of user interactions with Web resources on one or more Web sites. The goal is to capture, model, and analyze the behavioral patterns and profiles of users interacting with a Web site. The discovered patterns are usually represented as collections of pages, objects, or resources that are frequently accessed by groups of users with common needs or interests. The types and levels of analysis performed on the integrated usage data depend on the ultimate goals of the analyst and the desired outcomes. The thesis describes some of the most common pattern discovery and analysis techniques employed in the Web usage mining domain and discusses some of their applications.

The thesis covers the following:
(1) It introduces the background of Web log mining and the current state of research in China and abroad, summarizes the algorithms developed so far, and proposes directions for improving them.
(2) It introduces steps and methods of data preprocessing that address the semi-structured and redundant nature of raw Web log data. After preprocessing, modeling methods are proposed for the resulting data so that the mining step proceeds in an orderly way while the quality of the mining results is maintained.
(3) It explores and analyzes in detail the key algorithms in Web log mining and offers options for addressing their drawbacks.
(4) It proposes an Apriori algorithm based on cluster-partitioning and evaluates it in a simulation experiment. A comparison of the experimental results shows that the proposed algorithm performs better than the original algorithm. Finally, a prototype design is given.
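
The abstract does not spell out how the cluster-partitioning Apriori variant works, so the following is only a minimal sketch of the general idea of partitioned frequent-itemset mining over user sessions, not the thesis's algorithm. It assumes each session is a set of visited pages, and it splits sessions into equal-sized chunks as a stand-in for whatever clustering step the thesis uses. The function names `local_frequent_itemsets` and `partitioned_apriori` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(sessions, min_support):
    """Plain Apriori on one partition; min_support is a fraction of the partition."""
    sessions = [frozenset(s) for s in sessions]
    min_count = min_support * len(sessions)
    # Level 1: count single pages, once per session.
    counts = Counter(frozenset([page]) for s in sessions for page in s)
    current = {item for item, n in counts.items() if n >= min_count}
    frequent = set(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        counts = Counter(c for s in sessions for c in candidates if c <= s)
        current = {c for c, n in counts.items() if n >= min_count}
        frequent |= current
        k += 1
    return frequent


def partitioned_apriori(sessions, min_support, n_partitions):
    """Two-pass mining: local candidates per partition, then one global count."""
    sessions = [frozenset(s) for s in sessions]
    if not sessions:
        return {}
    # Assumption: simple chunking stands in for the thesis's clustering step.
    size = -(-len(sessions) // n_partitions)  # ceiling division
    partitions = [sessions[i:i + size] for i in range(0, len(sessions), size)]
    # Pass 1: any globally frequent itemset is frequent in at least one partition.
    candidates = set()
    for part in partitions:
        candidates |= local_frequent_itemsets(part, min_support)
    # Pass 2: verify each candidate against the full dataset.
    min_count = min_support * len(sessions)
    result = {}
    for c in candidates:
        n = sum(1 for s in sessions if c <= s)
        if n >= min_count:
            result[c] = n / len(sessions)
    return result
```

Calling `partitioned_apriori(sessions, 0.5, 2)` on a list of page sets returns a dictionary that maps each frequent page set to its support. Association rules could then be derived from these supports in the usual way.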
Keywords/Search Tags: Web log mining, data modeling, cluster-partitioning, Apriori algorithm