Research Of Multi-Markov Chains Prediction Model Based On Web Users Clustering

Posted on: 2014-01-09 | Degree: Master | Type: Thesis
Country: China | Candidate: F L Zheng | Full Text: PDF
GTID: 2248330398451074 | Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, the WWW plays an important role in our daily life. Many business opportunities are hidden in exponentially growing network data, and Web usage mining has become an increasingly popular research area.

Finding users' interests in large volumes of Web log data is very important for improving personalized service, so both Web user clustering and Web prediction have received great attention. Accurate and efficient Web user clustering and Web prediction help to adjust website structure, improve system performance, and provide personalized service. This in turn enhances the competitiveness of enterprises and helps emerging e-commerce prosper.

Firstly, the advantages and disadvantages of the traditional Leader algorithm and the κ-means algorithm are analyzed. An improved κ-means algorithm, LK κ-means, is proposed based on the Leader algorithm; it avoids marginal and random choices when determining the initial cluster centers.

Furthermore, a Web session pattern clustering algorithm based on users' characteristics, RDPLK κ-means, is proposed. Building on the LK κ-means algorithm, it considers both the time spent on a Web page and the page's visit frequency to reflect users' interests. Because time duration is strongly influenced by individual users' habits, it is characterized as RDP time duration to reduce this personal bias. Then, a new similarity metric between two patterns is proposed. Experiments show that the algorithm clusters Web users' sessions effectively and that the clustering results are objective and reasonable.

Finally, based on the results of the RDPLK κ-means algorithm, the browsing characteristics of each user cluster are described by a separate Markov chain. Because users in the same class have the same or similar surfing behavior, the states of each Markov chain are concentrated and limited in number. Compared with the traditional Markov chain, the multi-Markov chains prediction model based on Web users clustering is more effective: its time and space complexity is lower, and its predictions are more accurate and have higher coverage.
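
The multi-Markov chain idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes first-order chains, takes the cluster labels as given (in the thesis they would come from RDPLK κ-means), and predicts the most frequent successor page. The function names `train_chains` and `predict_next`, the page names, and the sample sessions are all hypothetical.

```python
from collections import Counter, defaultdict


def train_chains(sessions, labels):
    """Build one first-order transition-count table per cluster label."""
    chains = defaultdict(lambda: defaultdict(Counter))
    for session, label in zip(sessions, labels):
        # Count each observed transition current_page -> next_page.
        for current_page, next_page in zip(session, session[1:]):
            chains[label][current_page][next_page] += 1
    return chains


def predict_next(chains, label, current_page):
    """Return the most frequent successor page in the cluster's chain, or None."""
    successors = chains.get(label, {}).get(current_page)
    if not successors:
        return None  # page never seen as a source state in this cluster
    return successors.most_common(1)[0][0]


# Hypothetical sessions; labels are assumed to come from a prior clustering step.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "sports"],
    ["home", "shop", "cart"],
    ["home", "shop", "checkout"],
    ["home", "shop", "cart"],
]
labels = [0, 0, 1, 1, 1]

chains = train_chains(sessions, labels)
print(predict_next(chains, 0, "home"))
print(predict_next(chains, 1, "home"))
```

With this sample data the same current page, "home", leads to a different prediction for each cluster, and each chain only stores the states that its own cluster actually visits, which is the intuition behind the smaller state space claimed above.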
Keywords/Search Tags: Web log, κ-means algorithm, Web users clustering, Markov chain, Web prediction