Research On Web Text Filtering Method

Posted on: 2008-06-01
Degree: Master
Type: Thesis
Country: China
Candidate: W W Lu
Full Text: PDF
GTID: 2178360272468315
Subject: Computer application technology
Abstract/Summary:
The amount of information on the Internet is growing rapidly as the Internet develops. At the same time, the forms in which content is presented to users keep changing, and harmful and insecure content has inevitably appeared. How can we obtain the information we really need as quickly as possible? How can we avoid unsafe and unhealthy information online? Information filtering offers a solution.
Most Internet information appears as Web text. The major technologies of Web text filtering are Chinese word segmentation, feature extraction, text classification, and related techniques. Chinese text filtering has three steps. First, segment the entire text into words; second, convert the text into a vector according to its feature items; finally, classify the text with a mathematical model. In essence, filtering selects, on the reader's behalf and before reading begins, the category of text the reader will read, which makes acquiring information more efficient.
The Web text information filtering model based on the multi-layer filtering method imitates the way people filter text. When people read newspapers or books, they first browse, and then choose articles to read by their titles. The model represents a text at two levels: a title layer and a body layer. During filtering, the title is first passed through a keyword filter. If the title is not filtered out, the text is submitted directly to the user; otherwise, the text is represented as a vector and classified by a neural network. The neural network classifier is first trained with an improved BP learning algorithm, which adjusts the network's parameters to produce a more efficient network. When a new text arrives for classification, the trained network classifies its vector directly.
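The two-layer flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the keyword list, the feature terms, and the training texts are invented for the example; whitespace splitting stands in for Chinese word segmentation; raw term counts stand in for the feature weighting; and the network is trained with plain backpropagation rather than the improved BP algorithm, whose details the abstract does not give.

```python
import math
import random

# Hypothetical keyword list and feature terms; the abstract does not give them.
BLOCKED_TITLE_KEYWORDS = {"casino", "gamble"}
FEATURE_TERMS = ["casino", "bet", "win", "research", "method"]


def tokenize(text):
    # Stand-in for Chinese word segmentation: lowercase and split on whitespace.
    return text.lower().split()


def title_is_flagged(title):
    # Title layer: keyword filter over the title tokens.
    return any(tok in BLOCKED_TITLE_KEYWORDS for tok in tokenize(title))


def to_vector(body):
    # Body layer: term-count vector over FEATURE_TERMS (a simple vector space model).
    tokens = tokenize(body)
    return [float(tokens.count(term)) for term in FEATURE_TERMS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBPNet:
    """One hidden layer of sigmoid units, trained by plain backpropagation."""

    def __init__(self, n_in, n_hidden=4, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = x + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, y

    def train(self, samples, epochs=2000, lr=0.5):
        for _ in range(epochs):
            for x, target in samples:
                h, y = self.forward(x)
                # Error signals for squared error with sigmoid activations.
                d_out = (y - target) * y * (1.0 - y)
                d_hid = [d_out * self.w2[j] * h[j] * (1.0 - h[j])
                         for j in range(len(h))]
                for j, v in enumerate(h + [1.0]):
                    self.w2[j] -= lr * d_out * v
                xb = x + [1.0]
                for j, row in enumerate(self.w1):
                    for i, v in enumerate(xb):
                        row[i] -= lr * d_hid[j] * v

    def predict(self, x):
        return self.forward(x)[1]


def filter_document(title, body, net, threshold=0.5):
    # A title that passes the keyword filter goes straight to the user.
    if not title_is_flagged(title):
        return "deliver"
    # Otherwise the body is vectorized and classified by the trained network.
    return "block" if net.predict(to_vector(body)) >= threshold else "deliver"


# Invented training texts: label 1.0 = unwanted, 0.0 = acceptable.
TRAINING = [
    ("casino bet win bet", 1.0),
    ("bet win casino", 1.0),
    ("research method for text filtering", 0.0),
    ("casino revenue research method", 0.0),
]
net = TinyBPNet(n_in=len(FEATURE_TERMS))
net.train([(to_vector(text), label) for text, label in TRAINING])

print(filter_document("Win big at the casino", "casino bet win bet", net))
print(filter_document("Casino revenue research", "casino revenue research method", net))
print(filter_document("A text filtering method", "research method", net))
```

In this sketch the classifier runs only on texts whose titles trip the keyword filter, so the cheaper title check handles most documents and the network decides the ambiguous ones.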
Keywords/Search Tags: Filtering Method, Text Filtering, Vector Space Model, BP Neural Networks, Feature Extraction