Research on Information Filtering Technology Based on Text Content

Posted on: 2007-05-12 | Degree: Master | Type: Thesis
Country: China | Candidate: M Dong | Full Text: PDF
GTID: 2178360212458473 | Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, it has become a convenient means by which people obtain information freely. At the same time, it has serious negative effects, such as the spread of superstitious, pornographic, violent, reactionary and illegal information, as well as the leaking of secret information. Traditional filtering technology, which is based on keyword and IP address filtering, cannot solve these problems effectively.

This dissertation studies text categorization and information filtering and presents a Multiple Feature Selection Method. Combining machine learning with information filtering, an adaptive information filtering system is designed and implemented. The main work is as follows.

(1) The dissertation gives a brief introduction to information filtering: its history, the current state of research, its significance, and related background such as data mining and text categorization.

(2) The theory of information filtering is elaborated. Text categorization is the basis of information filtering. The text categorization process and its key techniques are introduced, including text preprocessing, feature selection, the text representation model, and classification algorithms. The main methods of information filtering and the structure and model of an information filtering system are discussed.

(3) Several common feature selection methods are discussed in detail. These methods are compared experimentally, and the factors that affect feature selection performance are analyzed. Based on this analysis, a new feature selection method, the Multiple Feature Selection Method, is proposed. The experiments show that it achieves better precision than the other feature selection methods compared.

(4) An information filtering system based on the Vector Space Model is proposed at the end of the dissertation. The system improves on the traditional information filtering system by modifying the filter template algorithm so that it is adaptive: it adjusts its threshold according to user feedback. The experimental results show that the performance of the new filtering system is clearly improved.
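
As a rough illustration of the idea in item (4), the following Python sketch shows one way a Vector Space Model filter might adapt both its template and its threshold from user feedback. The abstract does not give the dissertation's actual algorithm, so everything here is an assumption made for illustration: the class name AdaptiveFilter, the raw term-frequency weighting, the Rocchio-style template update, and the fixed threshold step are all hypothetical choices, not the author's method.

```python
import math
from collections import Counter


def to_vector(text):
    """Build a raw term-frequency vector (a simple stand-in for VSM term weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class AdaptiveFilter:
    """Hypothetical adaptive filter: a template vector plus a similarity threshold."""

    def __init__(self, seed_text, threshold=0.3, rate=0.2, step=0.05):
        self.template = dict(to_vector(seed_text))  # filter template (profile)
        self.threshold = threshold                  # similarity needed to flag a document
        self.rate = rate                            # assumed template learning rate
        self.step = step                            # assumed threshold adjustment step

    def score(self, text):
        return cosine(to_vector(text), self.template)

    def matches(self, text):
        return self.score(text) >= self.threshold

    def feedback(self, text, relevant):
        """Update the template and threshold from one user judgment."""
        doc = to_vector(text)
        flagged = self.matches(text)  # decision made before this update
        sign = 1.0 if relevant else -1.0
        # Rocchio-style update: move toward relevant documents, away from others.
        for term, weight in doc.items():
            updated = self.template.get(term, 0.0) + sign * self.rate * weight
            self.template[term] = max(updated, 0.0)
        if flagged and not relevant:
            # False alarm: become stricter.
            self.threshold = min(self.threshold + self.step, 1.0)
        elif relevant and not flagged:
            # Missed document: become more permissive.
            self.threshold = max(self.threshold - self.step, 0.0)


f = AdaptiveFilter("violence gambling", threshold=0.3)
print(f.matches("gambling site"))
f.feedback("gambling site", relevant=False)
print(round(f.threshold, 2), round(f.score("gambling site"), 3))
```

In this sketch a false alarm raises the threshold and a missed document lowers it, while the template drifts toward documents the user marks as relevant. A real system would more likely use TF-IDF weights over the selected features rather than raw counts.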
Keywords/Search Tags: Information Filtering, Vector Space Model, Text Categorization, Feature Selection