Incremental Mining Method Research Of Association Rules Of Web Log Based On Clustering Partition

Posted on: 2014-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: Q L Huang
GTID: 2268330401471912
Subject: Computer application technology
Abstract/Summary:
The World Wide Web is a widely distributed, global information service center that contains a large variety of dynamic content and usage information. Discovering usable knowledge in this information is valuable for providing personalized services. Web server logs have four characteristics: huge volume, rich dynamic content, high complexity, and local information. To suit such data, this thesis proposes incremental mining of Web log association rules based on clustering partition, in order to find latent rules and knowledge in Web users' access behavior.

First, the thesis describes a method for mining Web log association rules based on clustering partition. The large data set is divided arbitrarily into k small groups, which form k training sets for a SOM neural network. An algorithm based on the self-organizing feature map (SOM) neural network, using a conventional optimization strategy, then performs a coarse clustering partition of the Web users by their behavior characteristics. Because the access behavior within each cluster is similar, running an FP-growth-based mining algorithm on these data subsets makes effective use of FP-growth's advantage of not generating candidate itemsets, and reduces both the number of branches in the FP-tree and the number of conditional FP-trees. However, the access information in a sub-cluster does not imply that users of that type have no interest in the pages they did not visit. The frequent itemsets of each user-behavior sub-cluster are therefore treated as local frequent itemsets, which are tested again to obtain the global frequent itemsets and to uncover latent knowledge and rules in Web user access behavior.

Second, the thesis designs a method for incremental mining of Web log association rules based on clustering partition.
It improves on the algorithm proposed above, which is applied to obtain the frequent itemsets of the new data in the new, dynamically rich Web log. The old and new frequent itemsets are then combined, using the properties of frequent itemsets, to update the frequent itemsets. The algorithm greatly reduces the number of database scans, generates no candidate itemsets, and effectively reduces the depth and width of the FP-tree. Its advantage is greater for databases that, like Web logs, hold large volumes of data with abundant dynamic information.

Finally, the thesis uses C#.NET to design and implement an association model of Web user access behavior, experimentally analyzes preprocessed Web server log data, and tests and evaluates the performance of the model. The results show that the algorithm can effectively handle the mining of large, dynamically rich Web logs, and that it improves the accuracy and adaptability of incremental Web log association rule mining.
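
The two steps summarized above (testing local frequent itemsets to obtain global ones, then updating them when new log data arrives) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and it rests on several assumptions: the clusters are supplied as ready-made lists of page-visit transactions instead of being produced by a SOM; a brute-force counter (`local_frequent`) stands in for FP-growth; the same relative support threshold `min_ratio` is applied to every cluster and to the whole data set; and the update rule in `incremental_update` is one common way to use the property that an itemset frequent in the combined data must be frequent in the old data or in the new data. All function names, parameters, and sample page paths are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_frequent(transactions, min_ratio, max_len=3):
    """Brute-force stand-in for FP-growth: itemsets whose support in
    `transactions` is at least min_ratio * len(transactions)."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_len + 1):
            counts.update(frozenset(c) for c in combinations(items, k))
    threshold = min_ratio * len(transactions)
    return {s for s, c in counts.items() if c >= threshold}


def count_support(itemsets, transactions):
    """Count how many transactions contain each given itemset."""
    counts = {s: 0 for s in itemsets}
    for t in transactions:
        visited = set(t)
        for s in itemsets:
            if s <= visited:
                counts[s] += 1
    return counts


def global_frequent(partitions, min_ratio, max_len=3):
    """Union the local frequent itemsets of every partition, then test
    them once against all data to keep only the globally frequent ones."""
    candidates = set()
    for part in partitions:
        candidates |= local_frequent(part, min_ratio, max_len)
    all_tx = [t for part in partitions for t in part]
    counts = count_support(candidates, all_tx)
    threshold = min_ratio * len(all_tx)
    return {s: c for s, c in counts.items() if c >= threshold}


def incremental_update(old_tx, old_frequent, new_tx, min_ratio, max_len=3):
    """Update frequent itemsets after new_tx arrives.

    old_frequent maps each itemset frequent in old_tx to its count there.
    Assumed update rule: candidates are the old frequent itemsets plus
    those frequent in new_tx; only candidates without a stored old count
    require a pass over the old data.
    """
    candidates = set(old_frequent) | local_frequent(new_tx, min_ratio, max_len)
    new_counts = count_support(candidates, new_tx)
    old_counts = dict(old_frequent)
    old_counts.update(count_support(candidates - set(old_frequent), old_tx))
    threshold = min_ratio * (len(old_tx) + len(new_tx))
    totals = {s: old_counts[s] + new_counts[s] for s in candidates}
    return {s: c for s, c in totals.items() if c >= threshold}


# Hypothetical sample data: two clusters of user sessions (pages visited).
partitions = [
    [["/home", "/news"], ["/home", "/news", "/sports"], ["/home"]],
    [["/shop", "/cart"], ["/shop", "/cart"], ["/home", "/shop"]],
]
frequent = global_frequent(partitions, min_ratio=0.5)

# New log entries arrive; update without re-mining the old data.
old_tx = [t for part in partitions for t in part]
new_tx = [["/shop", "/cart"], ["/shop", "/cart"]]
updated = incremental_update(old_tx, frequent, new_tx, min_ratio=0.5)
```

In this sample, `/news` is frequent inside the first cluster but fails the global test, which is why local results are only treated as candidates. After the update, `/cart` and the pair `/shop`, `/cart` become frequent in the combined data even though they were not frequent in the old data alone.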
Keywords/Search Tags: association rules, incremental mining, Web log, self-organizing neural network