
Research And Implementation Of RSS Content Filtering Algorithm

Posted on: 2009-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: N Yao
GTID: 2178360245469993
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the rapid development of the Internet, many new network services have appeared, and they are changing the way we live and work. Widespread participation has made the network an important way to obtain information, knowledge and opportunities, and RSS has in turn become the most important carrier for pushed information. In response to the increasingly widespread use of RSS services, this thesis introduces the structure and standards of RSS, together with text clustering and classification algorithms, and puts forward a new RSS content filtering model.

(1) The model constructs a minimum core content set that meets subscribers' needs. Taking the importance of RSS content into account, we filter RSS documents and propose a three-tier model consisting of a core layer, a middle layer and an outer layer. The core layer holds content that definitely matches subscribers' interests, while the other layers hold their secondary choices.

(2) Clustering and classification algorithms are integrated in the model. Content received from an RSS source for the first time is hierarchically clustered to build the content filtering model; content received afterwards is handled by a classification algorithm to meet subscribers' demands.

(3) The filtering mechanism combines keywords with the vector space model (VSM) and uses RSS file attributes, such as title, publication time, source and author, to screen the contents of RSS documents effectively.

(4) The model takes into account the particular characteristics of Chinese text and the addition and deletion of keywords. Its forward and reverse attributes make dictionary updates more flexible, and feedback improves the model's filtering effectiveness.

Finally, experiments are carried out to test the model and validate the algorithm.
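As a rough illustration of points (1) and (3), the sketch below combines a keyword check on the RSS title with a VSM cosine similarity and assigns each item to one of the three layers. It is not the thesis's algorithm: the function names, the term-frequency weighting, the whitespace tokenizer, and the thresholds 0.5 and 0.2 are all assumptions made for this example. Chinese text would first need word segmentation, which is omitted here.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector from whitespace tokens (assumed tokenizer;
    Chinese text would need a word segmenter instead)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_layer(item, profile, keywords,
                 core_threshold=0.5, middle_threshold=0.2):
    """Place an RSS item in the 'core', 'middle' or 'outer' layer.

    item     : dict with 'title' and 'description' (hypothetical shape)
    profile  : term-frequency vector describing the subscriber's interest
    keywords : set of lowercase keywords; a title hit means 'core'
    The two thresholds are illustrative values, not taken from the thesis.
    """
    title_terms = set(item["title"].lower().split())
    if title_terms & keywords:
        return "core"
    vector = tf_vector(item["title"] + " " + item["description"])
    similarity = cosine(vector, profile)
    if similarity >= core_threshold:
        return "core"
    if similarity >= middle_threshold:
        return "middle"
    return "outer"


profile = tf_vector("rss feed filtering clustering classification")
keywords = {"rss"}
items = [
    {"title": "RSS filtering model", "description": "a new approach"},
    {"title": "Text clustering", "description": "classification methods"},
    {"title": "Football results", "description": "match scores today"},
]
for item in items:
    print(item["title"], "->", assign_layer(item, profile, keywords))
```

In this sketch the keyword test runs first because it is cheap and decisive; the VSM similarity only ranks items that no keyword has already placed in the core layer.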
Keywords/Search Tags: RSS, content filtering, minimum content set, hierarchical clustering