Study And Application Of Chinese Information Filtering System

Posted on: 2007-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: D L Li
Full Text: PDF
GTID: 2178360212973929
Subject: Computer application technology
Abstract/Summary:
In recent years, the Internet has been growing very rapidly. While the Internet provides people with much convenience, it also creates problems such as "Information Overload" and "Information Lost". To overcome these problems, research on Information Filtering has drawn much attention. Information Filtering (IF) is the task of retrieving useful or relevant information from a dynamic data stream, and eliminating useless or irrelevant information, according to a user's request.

This thesis first introduces the background, development history, current state of research, and significance of IF technology. It then surveys IF and Information Filtering Systems (IFS).

Text is the main form of information on the Internet, so this thesis focuses on the issues relevant to Chinese Text Filtering. Based on the system structure of an IFS and the model of Text Filtering, it gives a logical model of Chinese Text Filtering based on the Vector Space Model (VSM).

Text feature extraction and representation are the fundamental operations of Chinese Text Filtering. Document representation has four phases: word segmentation, stop-word removal, feature extraction, and feature item weighting. This thesis proposes a new method of feature item weighting based on a position weighting algorithm.

The User Profile is the core problem of Text Filtering (TF). This thesis discusses methods of acquiring knowledge about users, the representation of the User Profile, and a profile reformulation strategy using Relevance Feedback.

A Chinese document can be partitioned into N-level text paragraphs. This thesis proposes a new Chinese Text Filtering method based on the N-Level Vector Model. Experiments show that this method performs well.

In Document Filtering, precision and recall trade off against each other. By introducing a user's non-relevant profile and a non-relevant threshold, this thesis designs and implements an improved Document Filtering System based on the N-Level Vector Model in order to improve filtering performance. The system includes two filtering processes. Experimental results show that the improved system performs well.
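
The two filtering processes described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the position weights, both threshold values, and the function names `vectorize`, `cosine`, and `accept` are hypothetical, the input is assumed to be already segmented and stripped of stop words, and the N-level paragraph structure is left out. Terms are weighted by where they occur, documents are compared with profiles by cosine similarity in the Vector Space Model, and a document passes only if it is close enough to the relevant profile and not too close to the non-relevant profile.

```python
import math
from collections import Counter

# Hypothetical position weights: a term in the title counts more than one in the body.
POSITION_WEIGHT = {"title": 3.0, "body": 1.0}


def vectorize(segments):
    """Build a position-weighted term vector from {position: [tokens]}.

    Tokens are assumed to be segmented words with stop words removed.
    """
    vec = Counter()
    for position, tokens in segments.items():
        for tok in tokens:
            vec[tok] += POSITION_WEIGHT.get(position, 1.0)
    return dict(vec)


def cosine(a, b):
    """Cosine similarity between two sparse term vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def accept(doc, relevant_profile, nonrelevant_profile,
           rel_threshold=0.3, nonrel_threshold=0.5):
    """Two-stage filter; both threshold defaults are illustrative assumptions."""
    # Stage 1: the document must be similar enough to the relevant profile.
    if cosine(doc, relevant_profile) < rel_threshold:
        return False
    # Stage 2: reject it if it also resembles the non-relevant profile too closely.
    return cosine(doc, nonrelevant_profile) < nonrel_threshold


doc = vectorize({"title": ["filtering"], "body": ["text", "filtering", "model"]})
relevant = {"filtering": 1.0, "text": 1.0}
nonrelevant = {"spam": 1.0}
print(accept(doc, relevant, nonrelevant))
```

In this sketch the second stage only removes documents that already passed the first, so raising `nonrel_threshold` favors recall and lowering it favors precision.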
Keywords/Search Tags:Information Filtering, Text Filtering, Vector Space Model, Feature Extraction, User Profile, Relevance Feedback