Web Usage Mining And The Research Of Personalized Recommendation

Posted on: 2012-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2218330368998902
Subject: Computer application technology
Abstract/Summary:
Data mining is an important topic in computer science, artificial intelligence, and database research. It is the process of extracting potentially useful information and knowledge, not known in advance, from large volumes of incomplete, noisy, ambiguous, and random real-world data. Web pages contain complex, unstructured, and dynamic data, so analyzing the vast amount of information on the Web and providing personalized recommendation services that match users' needs is an important application of data mining. Building on previous research, this thesis explores Web usage mining. The main content can be summarized as follows:

(1) The basic theory and classification of data mining are surveyed, and the data sources of Web usage mining and the basic process of data preprocessing are analyzed.

(2) The theory of association rules is introduced, the performance of the classic Apriori algorithm is analyzed, and an improved algorithm is proposed. The new algorithm adds a pruning step before the join (natural connection) step, which reduces the number of itemsets that take part in the join. As a result, the number of frequent itemsets entering the join and the size of the candidate itemsets generated are both reduced. The algorithm also reduces the number of loop iterations, the running time, and the number of unnecessary checks in the join-condition test.

(3) The basic idea and procedure of the K-means clustering algorithm are described in detail, its advantages and disadvantages are analyzed, and an improved K-means algorithm, the MFA algorithm, is proposed. In K-means, every adjustment of the cluster centers requires a large number of distance calculations to determine the new cluster centers. To address this, a new method is proposed that uses the displacement information of the clusters to determine the new centers, and the computational complexity of the filtering algorithm is reduced by selecting candidates from the set of active cluster centers.

(4) The log data of our campus network is analyzed and preprocessed, and the improved algorithm is then applied to mine users' access patterns. Finally, these patterns are used to add a personalized recommendation feature to the Web site, which proactively recommends information of potential interest to users.
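
The abstract does not state the exact pruning rule used in item (2), so the following is only a minimal sketch of one plausible form of pre-join pruning, not the thesis's algorithm. It assumes the rule that an item can belong to a frequent k-itemset only if it occurs in at least k-1 frequent (k-1)-itemsets, and it drops itemsets that contain any other item before the join. The function name `frequent_itemsets`, the parameter `min_support` (an absolute count), and the sample page names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset: support count} for every frequent itemset.

    Items must be mutually comparable (e.g. all strings) because the
    join step relies on a sorted order.
    """
    transactions = [frozenset(t) for t in transactions]
    item_counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in item_counts.items()
               if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        # Pre-join pruning (assumed rule): every item of a frequent
        # k-itemset occurs in at least k-1 frequent (k-1)-itemsets, so
        # itemsets containing a rarer item cannot contribute to the join.
        occurrences = Counter(i for s in current for i in s)
        joinable = sorted(
            tuple(sorted(s)) for s in current
            if all(occurrences[i] >= k - 1 for i in s)
        )
        # Join step: merge two (k-1)-itemsets that share their first k-2 items.
        candidates = set()
        for a, b in combinations(joinable, 2):
            if a[:-1] == b[:-1]:
                cand = frozenset(a) | frozenset(b)
                # Classic Apriori check: all (k-1)-subsets must be frequent.
                if all(frozenset(sub) in current
                       for sub in combinations(cand, k - 1)):
                    candidates.add(cand)
        # Count the support of the surviving candidates.
        support = Counter()
        for t in transactions:
            for cand in candidates:
                if cand <= t:
                    support[cand] += 1
        current = {c: n for c, n in support.items() if n >= min_support}
        result.update(current)
        k += 1
    return result


# Hypothetical user sessions: each set holds the pages visited in one session.
sessions = [
    {"index", "news", "sports"},
    {"index", "news"},
    {"index", "sports"},
    {"index", "news", "sports"},
    {"news", "library"},
]
for itemset, count in frequent_itemsets(sessions, min_support=3).items():
    print(sorted(itemset), count)
```

With this sample data the pruning removes both frequent 2-itemsets before the third pass, because "news" and "sports" each occur in only one of them, so no join is attempted at k = 3.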
Keywords/Search Tags:Data Mining, Web Usage Mining, Personalized Recommendation, Apriori Algorithm, K-means Algorithm
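
The abstract gives no details of the MFA algorithm from item (3), so the sketch below illustrates only the general idea of using cluster-center displacement to avoid distance calculations. It is not the thesis's method and does not include the kd-tree based filtering algorithm. It assumes a plain Lloyd-style K-means in which a center is "active" if it moved in the last update: a point whose own center did not move is compared only against the active centers, because its distances to all unmoved centers are unchanged. The function name `kmeans_active` and the sample data are hypothetical.

```python
import math


def kmeans_active(points, centers, max_iter=100):
    """Lloyd-style K-means that skips distances to centers that did not move."""
    centers = [tuple(c) for c in centers]
    k = len(centers)

    # Initial assignment: full nearest-center search for every point.
    assign, best = [], []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        j = min(range(k), key=dists.__getitem__)
        assign.append(j)
        best.append(dists[j])

    for _ in range(max_iter):
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[j])  # empty cluster stays in place

        # Active centers are those with a non-zero displacement.
        active = [j for j in range(k) if new_centers[j] != centers[j]]
        centers = new_centers
        if not active:
            break  # no center moved, so the assignment is stable
        active_set = set(active)

        for i, p in enumerate(points):
            if assign[i] in active_set:
                # Own center moved: the cached distance is stale, search all.
                candidates, best_j, best_d = range(k), None, math.inf
            else:
                # Own center is static: it is still the nearest static
                # center, so only the active centers can beat it.
                candidates, best_j, best_d = active, assign[i], best[i]
            for j in candidates:
                d = math.dist(p, centers[j])
                if d < best_d:
                    best_j, best_d = j, d
            assign[i], best[i] = best_j, best_d

    return centers, assign


# Hypothetical 2-D data: two small groups plus one isolated point whose
# center never moves and is therefore never active.
pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
       (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
       (100.0, 100.0)]
final_centers, labels = kmeans_active(
    pts, [(0.0, 0.0), (10.0, 10.0), (100.0, 100.0)])
print(final_centers, labels)
```

The saving grows as the run converges, since fewer centers remain active in later iterations while the result matches a full search up to tie-breaking.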