The Research On The Application Of Web Log Mining Based On User Interest And Fuzzy Clustering

Posted on: 2016-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: J L Xie
GTID: 2348330512473956
Subject: Management Science and Engineering
Abstract/Summary:
In the era of big data, mining valuable information and business opportunities from the flood of information on the Internet has become both an active research field and a problem that decision makers of all kinds need to solve. Web log mining is one of the most effective ways to address this kind of problem. It combines Internet technology with data mining to analyze the unknown but valuable information hidden in the Web and to extract useful knowledge from it. Clustering analysis is one of the most widely used data mining techniques in Web mining, and it covers Web user clustering, page clustering and session clustering. Of these, Web user clustering has the greatest practical value, because it mines users' browsing interests and habits. By analyzing user access footprints, it automatically groups users with the same or similar access patterns, which helps improve site structure, provide personalized services and support business decisions.

Web log data objects are ambiguous and uncertain: an object may belong to one category to a certain degree and, at the same time, to one or more other categories to some degree. Assigning each object to exactly one category, as traditional hard clustering algorithms do, is therefore not accurate. For this reason, this thesis applies fuzzy clustering to Web user clustering, using the concept of fuzzy sets to make up for the defects of traditional hard clustering and to improve the accuracy of the mining results.

This thesis mainly studies Web user clustering in Web log mining and improves two aspects of it: the user similarity measure and the clustering algorithm. The improved methods are applied to Web log mining to obtain a Web user clustering result. On the one hand, the thesis proposes a similarity calculation method based on user interest. The method takes the user's browsing behavior, click behavior and feedback behavior as interest factors in order to extract user characteristics that actually reflect user interest, and it computes similarity using fuzzy multisets and the max-min method. On the other hand, the fuzzy C-means clustering algorithm (FCM) tends to converge to a local extremum, and its fuzzy weighting exponent m is uncertain. To address these shortcomings, the thesis proposes an adaptive fuzzy C-means clustering algorithm based on a shared-historical-best particle swarm optimization algorithm (SHBPSO-AFCM). The algorithm improves the standard particle swarm algorithm in three places: the position update formula, the velocity update formula and the fitness function. It then combines the improved particle swarm optimization algorithm with FCM to obtain strong global optimization capability. At the same time, the fuzzy weighting exponent m is embedded into the particle swarm optimization algorithm, so that an adaptive, optimal value of m is generated by the iterative update formulas of the particle swarm. The experimental results indicate that the algorithm improves clustering accuracy and validity.
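As a rough illustration of the two building blocks named in the abstract, the sketch below shows a plain max-min similarity between two interest vectors and a standard fuzzy C-means loop. It is not the method of the thesis: the abstract does not give the fuzzy-multiset formulation or the SHBPSO update formulas, so the function names (`max_min_similarity`, `fcm`), the representation of a user as a non-negative interest vector, and the fixed exponent `m = 2.0` are all assumptions made here for illustration.

```python
import numpy as np


def max_min_similarity(u, v):
    """Max-min similarity of two non-negative vectors: sum(min) / sum(max).

    Assumption: each user is one vector of interest degrees per page;
    the thesis's fuzzy-multiset variant is not reproduced here.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.maximum(u, v).sum()
    # Two all-zero vectors are treated as identical.
    return 1.0 if denom == 0 else float(np.minimum(u, v).sum() / denom)


def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means (no particle swarm step, fixed exponent m).

    Returns (centers, U), where U[i, k] is the membership degree of
    sample i in cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Random initial memberships, normalized per sample.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance of every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


if __name__ == "__main__":
    # Hypothetical interest vectors for four users over two pages.
    X = [[0, 0], [0, 1], [10, 10], [10, 11]]
    centers, U = fcm(X, c=2)
    print(np.round(U, 3))
    print(max_min_similarity([1, 0, 2], [1, 1, 1]))
```

Each row of `U` keeps a degree of membership in every cluster instead of a single hard label, which is the property the abstract relies on for ambiguous Web log data. A variant closer to the thesis would replace the random initialization and the fixed `m` with values searched by a particle swarm, but those update rules are not stated in the abstract.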
Keywords/Search Tags: Web log mining, Fuzzy clustering, Particle Swarm Optimization, User interest, Similarity computation