Research Of Personalized Recommendation Model Based On Web Log Mining And Association Rules

Posted on: 2015-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Li
GTID: 2268330428480409
Subject: Computer application technology
Abstract/Summary:
With the rapid development of science and technology, the vast amount of information available on the Internet helps drive industrial upgrading, but it also brings problems. The explosive growth of information easily leads to "information overload". Web operators continually add new content to their sites to offer users more information, which makes the topological structure of a site more complicated. Newly added resources may not match users' actual needs and may cause "resource misleading". How to find the information people are interested in within large volumes of data is therefore the problem we face. Against this background, data mining has been applied to Web sites, giving rise to Web mining.

Web mining is a comprehensive technology that draws on several fields: Web technologies, data mining, informatics, and computational linguistics. It can play a role in many tasks, such as mining the structure of search engines, determining authoritative pages, classifying Web documents, Web usage mining, intelligent query, and building Metaweb data warehouses. Web usage mining discovers users' behavior characteristics and navigation patterns from server logs. This thesis systematically describes data mining systems, Web mining, and the whole process of Web usage mining, and mainly studies the following aspects: the preprocessing of Web logs, the mining model of association rules, and the sliding-window recommendation model.

First, the preprocessing of Web logs includes data cleaning, user identification, session identification, path completion, and transaction identification.
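Two of the preprocessing steps named above, data cleaning and session identification, can be illustrated with a minimal Python sketch. The record layout `(user_id, timestamp, url)`, the list of static-file suffixes, and the 30-minute timeout are assumptions made for illustration; the abstract does not specify them, and user identification is assumed to have already produced `user_id`.

```python
from collections import defaultdict

# Assumed values for illustration; the abstract does not state them.
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")
SESSION_TIMEOUT = 30 * 60  # seconds


def clean_records(records):
    """Data cleaning: drop requests for static resources."""
    return [r for r in records if not r[2].lower().endswith(STATIC_SUFFIXES)]


def identify_sessions(records, timeout=SESSION_TIMEOUT):
    """Session identification: split each user's visits on long gaps.

    records: iterable of (user_id, timestamp_seconds, url) tuples.
    Returns a list of (user_id, [url, ...]) sessions.
    """
    by_user = defaultdict(list)
    for user_id, ts, url in records:
        by_user[user_id].append((ts, url))

    sessions = []
    for user_id, visits in by_user.items():
        visits.sort()  # order each user's visits by time
        current, last_ts = [], None
        for ts, url in visits:
            # A gap longer than the timeout starts a new session.
            if last_ts is not None and ts - last_ts > timeout:
                sessions.append((user_id, current))
                current = []
            current.append(url)
            last_ts = ts
        if current:
            sessions.append((user_id, current))
    return sessions
```

Each resulting session is a list of page URLs that can be stored as one row of a transaction table. Path completion and transaction identification are not covered by this sketch.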
In the preprocessing stage, much irrelevant information is removed from the users' visit records, and the remaining visit data is structured and stored in a relational database in the form of transactions or sessions.

Next, this thesis applies weighted association rules to mine the preprocessed data. The classic Apriori algorithm for association rule mining can reveal how the pages of a Web site relate to one another and can also help discover users' preferred navigation patterns. However, Apriori has a limitation when applied to Web log mining: it implicitly assumes that all pages are equally important and does not take differences between pages into account. As a result, the mined rules may miss some pages that users are interested in.

To address this weakness, this thesis introduces the concept of "page weight", which reflects users' real preference for a page. Its definition takes browsing time and access frequency into consideration. Based on this, the author proposes the W-Apriori algorithm. The algorithm describes the transaction database with an extended Boolean matrix, which helps compress the transaction database. The weights also make it possible to distinguish between pages, which effectively solves the problem of missing important pages during mining.

Finally, the mined rules are collected into a rule library, and, using sliding-window technology, this thesis proposes a Web log recommendation model based on association rule mining.
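The page-weight idea described above can be sketched in Python as follows. This is not the thesis's W-Apriori algorithm, whose exact weight formula and matrix extension are not given in the abstract. The linear blend of normalized browsing time and access frequency (parameter `alpha`) and the "mean weight times support" score are assumptions chosen only to show how weights can change which itemsets count as frequent.

```python
def page_weights(total_time, hits, alpha=0.5):
    """Hypothetical page weight: blend of normalized browsing time and hits."""
    max_time = max(total_time.values())
    max_hits = max(hits.values())
    return {
        page: alpha * total_time[page] / max_time
        + (1 - alpha) * hits[page] / max_hits
        for page in total_time
    }


def to_boolean_matrix(transactions, pages):
    """One row per transaction, one Boolean column per page."""
    return [[page in t for page in pages] for t in transactions]


def weighted_support(itemset, matrix, pages, weights):
    """Assumed score: (mean weight of the itemset) * (plain support)."""
    cols = [pages.index(p) for p in itemset]
    # Count the rows in which every page of the itemset is present.
    count = sum(all(row[c] for c in cols) for row in matrix)
    support = count / len(matrix)
    mean_weight = sum(weights[p] for p in itemset) / len(itemset)
    return mean_weight * support
```

With a score of this kind, an itemset made of pages with long browsing time can pass a minimum threshold even when its plain support is modest, which is the effect the weighting is meant to achieve.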
The model not only effectively addresses the problems of "information overload" and "resource misleading", but also recommends to Web users the pages they are really interested in, thereby realizing personalized recommendation.
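A sliding-window recommender over a rule library might look like the following Python sketch. The rule representation `(antecedent, consequent, confidence)`, the window size, and ranking by confidence are assumptions for illustration; the abstract does not describe the model's internals.

```python
from collections import deque


def recommend(rules, recent_pages, window_size=3, top_n=2):
    """Recommend pages from rules whose antecedent lies in the window.

    rules: list of (antecedent_frozenset, consequent_page, confidence).
    recent_pages: pages of the user's current session, oldest first.
    """
    # The deque keeps only the last `window_size` pages visited.
    window = deque(recent_pages, maxlen=window_size)
    active = set(window)

    scores = {}
    for antecedent, consequent, confidence in rules:
        # A rule fires when all its antecedent pages are in the window
        # and its consequent has not just been visited.
        if antecedent <= active and consequent not in active:
            scores[consequent] = max(scores.get(consequent, 0.0), confidence)

    # Highest confidence first; ties broken by page name for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ranked[:top_n]]
```

As the user browses, the window slides forward, so only the most recent pages decide which rules fire. A larger window matches more rules but reacts more slowly to a change of interest.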
Keywords/Search Tags: Data Mining, Web Log Data Mining, Association Rule, Frequent Access Patterns, Recommender System