Research Of Web Usage Mining

Posted on: 2006-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: H X Liu
GTID: 2168360155968832
Subject: Computer software and theory
Abstract/Summary:
As an important medium for business and information transmission, the Internet has developed rapidly. At the same time, the amount of information on the Internet is growing quickly, and the Web has become one of the main means of accessing information resources. However, the vast number of web documents not only provides abundant information but also poses the challenge of how to find the resources one needs.

Building a website that publishes the same information to everyone is a typical service pattern for web systems. Unfortunately, users have different interests and therefore different needs. How to establish personalized services, how to improve web systems to raise service quality, and how to improve website design to attract more users are therefore active research areas. This thesis makes three contributions to these problems through research on web usage mining.

First, building on previous work, this thesis proposes a novel web document recommendation model for personalized service. The model integrates information retrieval, information filtering, and data mining. It uses two kinds of feature vectors to represent a user's interests and extends a single user's interests on the basis of group interests. We describe how to build and maintain user profiles and how to implement personalized recommendation.

Second, this thesis explains how the single Markov model, single-stage and multi-stage N-gram models, and clustering/classification have been applied to predict users' surfing paths, and points out their shortcomings. It then proposes a hidden Markov model-based algorithm that improves prediction precision by mining the content of the web documents users read.

Finally, by analyzing and studying most of the current algorithms for mining association rules, we find that the generated rules are numerous and highly redundant. Therefore, building on previous work, this thesis proposes a CF measure and uses it as a threshold for mining valuable rules. In addition, by analyzing the CF factor, it establishes a theoretical framework for accepting, rejecting, or retaining a pair of rules simultaneously, and introduces this framework into the algorithms.
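
As an illustration of the single Markov model baseline mentioned in the abstract, the following is a minimal sketch of first-order next-page prediction from session logs. It is not the thesis's own algorithm (the proposed hidden Markov model-based method is not reproduced here), and the function names `train_markov` and `predict_next`, as well as the page names in the sample sessions, are hypothetical.

```python
from collections import Counter, defaultdict


def train_markov(sessions):
    """Count page-to-page transitions across all sessions (first-order model)."""
    transitions = defaultdict(Counter)
    for session in sessions:
        # Pair each page with the page visited immediately after it.
        for current_page, next_page in zip(session, session[1:]):
            transitions[current_page][next_page] += 1
    return transitions


def predict_next(transitions, current_page):
    """Return (most likely next page, estimated probability), or None if unseen."""
    counts = transitions.get(current_page)
    if not counts:
        return None
    page, count = counts.most_common(1)[0]
    return page, count / sum(counts.values())


# Hypothetical sessions; real input would come from preprocessed server logs.
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "reviews"],
    ["home", "about"],
    ["products", "cart", "checkout"],
]

model = train_markov(sessions)
prediction = predict_next(model, "products")
```

A first-order model of this kind conditions only on the current page, which is one reason such baselines can miss longer navigation patterns; N-gram models extend the context to the last several pages.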
Keywords/Search Tags: Web usage mining, Personal service, Markov model, Association rule