Design And Implementation Of A Personalized Intelligent News Retrieval System

Posted on: 2009-05-26
Degree: Master
Type: Thesis
Country: China
Candidate: T Yang
Full Text: PDF
GTID: 2178360245481261
Subject: Computer software and theory
Abstract/Summary:
As the volume of available information grows, the inability to process, assimilate, and use it becomes more and more apparent. The problems of "information overload" and "confusing resources" leave people feeling helpless. Currently available web news retrieval systems face a particular difficulty: they must quickly and accurately process very large amounts of data that are constantly being updated. A general-purpose search engine struggles to satisfy users with different backgrounds and different intentions. Personalized search engines have emerged to address these problems. In this thesis, we present the design and implementation of Ai-Times, a personalized intelligent search engine whose goal is to accurately retrieve and organize web news for different users according to their interests. This version of Ai-Times introduces the following novel algorithms: an optimized crawler algorithm whose fetching speed is 6 times faster than that of a traditional crawler; a keen tag-based extraction algorithm that extracts data-rich content with minimal manual effort and also classifies data as important or not important, so that the crawler can revisit and update important data; a modified vector space model improved using query expansion; and a redundancy information recommendation algorithm. Simulation tests showed that this architecture and these algorithms retrieve relevant information effectively according to users' interests and have superior adaptability...
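The abstract does not give the details of the modified vector space model, so the following is only a minimal sketch of the general idea: rank documents by TF-IDF cosine similarity after expanding the query with weighted terms from a user interest profile. The function names, the `interest_profile` dictionary, the `boost` parameter, and the smoothed IDF formula are all assumptions made for illustration and are not taken from Ai-Times.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper analyzer.
    return text.lower().split()


def build_idf(docs):
    # Smoothed inverse document frequency (an assumed weighting scheme).
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def vectorize(weights, idf):
    # Scale raw term weights by IDF; terms unseen in the corpus are dropped.
    return {t: w * idf[t] for t, w in weights.items() if t in idf}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand_query(query, interest_profile, boost=0.5):
    # Hypothetical expansion rule: add each profile term to the query,
    # weighted by its interest score times a boost factor.
    weights = {t: float(c) for t, c in Counter(tokenize(query)).items()}
    for term, interest in interest_profile.items():
        weights[term] = weights.get(term, 0.0) + boost * interest
    return weights


def rank(query, docs, interest_profile=None):
    # Return document indices ordered from most to least similar.
    idf = build_idf(docs)
    query_vec = vectorize(expand_query(query, interest_profile or {}), idf)
    scored = []
    for i, doc in enumerate(docs):
        doc_vec = vectorize(Counter(tokenize(doc)), idf)
        scored.append((cosine(query_vec, doc_vec), i))
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]


if __name__ == "__main__":
    news = [
        "apple releases new phone",
        "apple harvest season begins in orchards",
        "stock market falls",
    ]
    print(rank("apple", news))
    print(rank("apple", news, {"orchards": 1.0, "harvest": 1.0}))
```

With the same query, a user whose profile favors agriculture terms sees the orchard story ranked above the phone story, which is the kind of per-user reordering the abstract describes.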
Keywords/Search Tags: Personalization, Vector space model, Redundancy information recommendation, User interest, Agent