
Research On Web Users Clustering Based On Web Log Mining

Posted on: 2010-05-26
Degree: Master
Type: Thesis
Country: China
Candidate: X L Guo
GTID: 2178360278466407
Subject: Control theory and control engineering
Abstract/Summary:
With the rapid growth and spread of the Internet, the gap between the rapidly growing volume of information and people's limited attention keeps widening. Web log mining is an effective means of narrowing it. Information about web users' behavior is hidden in web logs; web log mining can uncover the characteristics and rules of users' visiting behavior, which can then be analysed to identify potential customers of a web site and to improve the quality of service offered to users. Clustering is one of the data mining techniques applied in web log mining. Applying clustering to users' visiting behavior allows users to be grouped automatically according to their interests, which helps to improve a web site's structure, to recommend personalized services and to promote e-commerce.

The work of the dissertation is as follows:

(1) A new clustering algorithm (CAS-C), based on the chaotic ant swarm algorithm, is presented and applied to clustering web users. The algorithm treats clustering as a function optimization problem: it aims to obtain globally optimal cluster centers by optimizing an objective function. During the computation, the cluster centers are found through two processes, chaotic search and self-organizing optimization. Compared with the classical k-means clustering algorithm, CAS-C has three advantages: it is not sensitive to the initial cluster centers and finds the global optimum of the objective function; it is not sensitive to clusters of different sizes and densities; and it is suitable for multi-dimensional data sets.

(2) Following the general process of web log mining, and using the server logs of a real web site, www.ctaxnews.com, as experimental data, we clustered the users of the web site and analysed the clustering result. We preprocessed the server log data to obtain users' visiting patterns and then applied the CAS-C algorithm to cluster them. Analysis of the clustering result revealed useful characteristics and rules of users' visiting behavior, which can serve as a reference and basis for applications such as personalized service and e-commerce.
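To illustrate what treating clustering as a function optimization problem can look like, here is a minimal Python sketch. The abstract does not state the objective function used by CAS-C, so the within-cluster sum of squared distances below is an assumption (it is the objective that k-means also targets). The names `clustering_objective` and `assign_clusters` and the sample user vectors are hypothetical, and the chaotic ant swarm search itself is not implemented here; any optimizer would simply look for the centers that make this value as small as possible.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def clustering_objective(points: Sequence[Vector], centers: Sequence[Vector]) -> float:
    """Assumed objective: each point contributes its squared distance
    to the nearest center; lower values mean tighter clusters."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def assign_clusters(points: Sequence[Vector], centers: Sequence[Vector]) -> List[int]:
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
        for p in points
    ]


# Hypothetical visiting patterns: visit counts per user for three page categories.
users = [(5.0, 0.0, 1.0), (4.0, 1.0, 0.0), (0.0, 6.0, 2.0), (1.0, 5.0, 3.0)]

# Two candidate sets of centers that an optimizer might evaluate.
good_centers = [(4.5, 0.5, 0.5), (0.5, 5.5, 2.5)]
poor_centers = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0)]

if __name__ == "__main__":
    print(clustering_objective(users, good_centers))
    print(clustering_objective(users, poor_centers))
    print(assign_clusters(users, good_centers))
```

A search procedure such as the one described in item (1) would repeatedly propose candidate centers and keep those with the lowest objective value; in this toy data, `good_centers` scores far lower than `poor_centers`.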
Keywords/Search Tags:Web log mining, Data preprocessing, Clustering, CAS-C algorithm