The Research And Realization Of Prototype System Based On Web Log Mining

Posted on:2012-07-21

Degree:Master

Type:Thesis

Country:China

Candidate:H D Ren

Full Text:PDF

GTID:2178330335953194

Subject:Computer application technology

Abstract/Summary:

In an era of information explosion on the Internet, users usually acquire information through search engines. However, existing information retrieval systems ignore each user's knowledge background and interests: they return the same results for the same query no matter who submits it, which leaves users disoriented among information resources. This has opened a new research direction in the information retrieval field: personalized information retrieval.

The precondition for personalized retrieval is to identify users accurately and to build a reasonable model of their knowledge and interest background. Web logs contain a large number of user records. A user's knowledge and interest background can be established by mining this information to identify individual users and by analyzing their browsing behavior to enrich their profiles. Drawing on this background, a personalized retrieval system can return different results for the same query submitted by different users, thereby realizing personalized retrieval, improving recall and precision, and increasing user satisfaction.

This thesis focuses on establishing users' knowledge and interest background by means of Web log mining and on realizing a personalized retrieval prototype system. The main contents are as follows.

The thesis first discusses data cleaning in the Web log preprocessing stage and introduces the main steps of data preprocessing. Because the word-frequency-based TF/IDF algorithm ignores the correlation between a user's knowledge and interests and the documents, the thesis analyzes users' browsing behavior and the implicit feedback information in Web logs and proposes the Page Correlation Weight. Because the TF calculation ignores the importance of an entry's position in the page, the thesis also puts forward Eiv, the importance factor of the entry. Combining the Page Correlation Weight, the importance factor of the entry, and the word-frequency-based TF/IDF algorithm, the thesis presents the Partial Weighted TF/IDF Algorithm. It then establishes the users' knowledge and interest background, uses the Rocchio feedback algorithm to update and analyze that background in real time, and realizes the personalized retrieval prototype system, Easy Searcher.

Finally, the whole thesis is summarized and prospects for the further development of personalized retrieval are discussed.
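
As a rough illustration of the kind of computation the abstract describes, the Python sketch below builds a user profile from TF/IDF scores scaled by a per-page weight and a per-position factor, then applies a standard Rocchio update. The abstract does not give the formulas for the Page Correlation Weight, Eiv, or the Partial Weighted TF/IDF Algorithm, so everything here is an assumption made for illustration: the function names, the multiplicative way the weights are combined, the position labels, and the Rocchio coefficients (common textbook defaults). It should not be read as the thesis's own method.

```python
import math
from collections import Counter


def weighted_tf_idf(pages, page_weights, position_weights):
    """Build a term-weight profile from the pages a user visited.

    pages: list of pages, each a list of (term, position) pairs.
    page_weights: one weight per page (a stand-in for a page-level
        relevance weight; hypothetical, not the thesis's formula).
    position_weights: map from position label to a factor (a stand-in
        for a term-position importance factor; also hypothetical).
    """
    n_pages = len(pages)
    # Document frequency: in how many pages each term occurs.
    df = Counter()
    for page in pages:
        df.update({term for term, _ in page})

    profile = Counter()
    for page, w_page in zip(pages, page_weights):
        # Term frequency where each occurrence counts by its position factor.
        tf = Counter()
        for term, position in page:
            tf[term] += position_weights.get(position, 1.0)
        total = sum(tf.values())
        for term, freq in tf.items():
            idf = math.log(n_pages / df[term]) + 1.0  # +1 keeps idf positive
            profile[term] += w_page * (freq / total) * idf
    return dict(profile)


def mean_weight(vectors, term):
    """Average weight of a term over a list of sparse vectors."""
    if not vectors:
        return 0.0
    return sum(v.get(term, 0.0) for v in vectors) / len(vectors)


def rocchio_update(profile, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Standard Rocchio update on sparse term-weight dictionaries.

    Moves the profile toward the centroid of relevant vectors and away
    from the centroid of non-relevant ones; non-positive weights are dropped.
    """
    terms = set(profile)
    for vector in relevant + non_relevant:
        terms.update(vector)
    updated = {}
    for term in terms:
        weight = (alpha * profile.get(term, 0.0)
                  + beta * mean_weight(relevant, term)
                  - gamma * mean_weight(non_relevant, term))
        if weight > 0:
            updated[term] = weight
    return updated
```

In this sketch a term that appears in a title on a highly weighted page contributes more to the profile than the same term in the body of a lightly weighted page, and pages the user engages with afterward can be fed to rocchio_update as relevant vectors to keep the profile current.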

Keywords/Search Tags:

Web Mining, Personalized, Web Log, TF/IDF, Data Preprocessing, Users' Knowledge and Interest Background

Related items

1	Research On Movies Personalized Recommendation System Based On Users Access Data
2	Research On Personalized Recommendation Based On Web Log Mining
3	Research And Realization Of Personalized Services Of Intranet
4	The Research Of Knowledge Management System Based On Personalized Search
5	Research On Personalized Recommendation Method Based On Knowledge Graph Fusing Users' Explicit And Implicit Preferences
6	Study On The Willingness Of Micro-blog Users To Participate Personalized Knowledge Services
7	Research On The Key Issues In Personalized Recommendation Based On SNS
8	The Design And Implementation Of A Personalized Advertisement Recommendation System Based On The Interests Of Weibo Users
9	Research On Expert Users In Personalized Recommendation
10	Research On Personalized Collaborative Filtering Recommendation Algorithm Based On Users' Preference