Research on Technologies for Content-Based Chinese Text Filtering

Posted on: 2009-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: X W Li
GTID: 2178360245953596
Subject: Computer software and theory
Abstract/Summary:
With the development of network technology, information processing has become an essential means of obtaining useful information, and text information filtering is an important research field within Chinese information processing. Given a user's information needs, information filtering automatically selects the information that satisfies those needs and filters out useless and illegal information from a large-scale, dynamic information flow. In the broad sense, information filtering covers text, audio, image, video, and other forms of information; in the narrow sense, it refers only to text information filtering. Because text is the main carrier of information on the Internet, text information filtering was the first of these to become a focus of research.
Information filtering research follows two main approaches: content-based filtering and collaborative filtering. This thesis focuses on content-based Chinese text information filtering. Representing a keyword-based filtering system with the Vector Space Model (VSM) is simple and easy to implement, but VSM cannot handle semantic problems, which degrades the filtering results. To address this, we introduce a concept factor and use a synonym dictionary to expand the user profile in order to solve the synonym problem.
During filtering, a user's needs may change. A well-constructed user profile may perform well, but it is only a rough, approximate expression of those needs. Therefore, when users require high precision, machine learning is needed.
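
As a rough illustration of the keyword-based VSM matching and synonym expansion described above, the following Java sketch expands a profile vector with dictionary synonyms and scores a document by cosine similarity. The class name `VsmFilterSketch`, the `discount` weight applied to synonyms, the acceptance `threshold`, and the use of raw term frequencies are all assumptions made for illustration; the abstract does not specify the thesis's own weighting scheme or how its concept factor is computed. Tokens are assumed to have been segmented into words beforehand.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the thesis's implementation.
class VsmFilterSketch {

    // Build a raw term-frequency vector from already-segmented tokens.
    static Map<String, Double> termFrequencies(List<String> tokens) {
        Map<String, Double> tf = new HashMap<>();
        for (String token : tokens) {
            tf.merge(token, 1.0, Double::sum);
        }
        return tf;
    }

    // Add the synonyms of each profile term at a discounted weight.
    // 'discount' is an assumed parameter; if a synonym is already in the
    // profile, the larger of the two weights is kept.
    static Map<String, Double> expandProfile(Map<String, Double> profile,
                                             Map<String, List<String>> synonyms,
                                             double discount) {
        Map<String, Double> expanded = new HashMap<>(profile);
        for (Map.Entry<String, Double> entry : profile.entrySet()) {
            List<String> syns =
                synonyms.getOrDefault(entry.getKey(), Collections.<String>emptyList());
            for (String syn : syns) {
                expanded.merge(syn, entry.getValue() * discount, Math::max);
            }
        }
        return expanded;
    }

    // Cosine similarity between two sparse term-weight vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> entry : a.entrySet()) {
            normA += entry.getValue() * entry.getValue();
            Double other = b.get(entry.getKey());
            if (other != null) {
                dot += entry.getValue() * other;
            }
        }
        for (double weight : b.values()) {
            normB += weight * weight;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an empty vector matches nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Accept a document when its similarity to the profile reaches the threshold.
    static boolean accepts(Map<String, Double> profile,
                           Map<String, Double> document,
                           double threshold) {
        return cosine(profile, document) >= threshold;
    }
}
```

For example, with a profile containing only "computer" and a synonym entry mapping it to "pc", a document containing "pc" but not "computer" scores zero against the original profile and a positive similarity against the expanded one, which is the synonym problem the expansion is meant to address.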
This thesis uses users' feedback to adaptively modify the user profile and thereby improve the precision of the filtering system. We also draw on the strengths of other text filtering systems and consider system performance metrics such as precision, recall, and practical feasibility. On this basis, we present an improved Chinese text filtering system implemented in Java. The experimental results show that the system achieves a certain degree of filtering effectiveness.
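
The abstract does not say which adaptive method is used, so the sketch below substitutes a Rocchio-style update as one common way to adjust a profile from feedback, together with the standard precision and recall formulas. The class name `AdaptiveProfileSketch` and the parameters `alpha`, `beta`, and `gamma` are assumptions for illustration only, not the thesis's actual method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: Rocchio-style feedback, assumed rather than taken from the thesis.
class AdaptiveProfileSketch {

    // Move the profile toward a document judged relevant, or away from one
    // judged non-relevant: updated = alpha * profile +/- (beta or gamma) * document.
    static Map<String, Double> update(Map<String, Double> profile,
                                      Map<String, Double> document,
                                      boolean relevant,
                                      double alpha, double beta, double gamma) {
        Map<String, Double> updated = new HashMap<>();
        for (Map.Entry<String, Double> entry : profile.entrySet()) {
            updated.put(entry.getKey(), alpha * entry.getValue());
        }
        double step = relevant ? beta : -gamma;
        for (Map.Entry<String, Double> entry : document.entrySet()) {
            updated.merge(entry.getKey(), step * entry.getValue(), Double::sum);
        }
        // Drop terms whose weight is no longer positive.
        updated.values().removeIf(weight -> weight <= 0.0);
        return updated;
    }

    // Precision: fraction of accepted documents that are relevant.
    static double precision(int relevantAccepted, int accepted) {
        return accepted == 0 ? 0.0 : (double) relevantAccepted / accepted;
    }

    // Recall: fraction of all relevant documents that were accepted.
    static double recall(int relevantAccepted, int relevantTotal) {
        return relevantTotal == 0 ? 0.0 : (double) relevantAccepted / relevantTotal;
    }
}
```

In this sketch, positive feedback strengthens the terms a relevant document shares with the profile and introduces its new terms, while negative feedback weakens shared terms and never adds new ones, since non-positive weights are discarded.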
Keywords/Search Tags: User profile, Vector Space Model, Adaptive text filtering, Concept expansion