Network Information Filtering Model Based On Genetic Algorithm Research

Posted on: 2007-08-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y G Liu  Full Text: PDF
GTID: 2208360182997257  Subject: Computer software and theory
Abstract/Summary:
With the rapid development of NII based on the Internet, information technology has entered every aspect of our lives, and the amount of information on the Internet is growing exponentially. This growth has two effects: people can easily obtain abundant, up-to-date information, but some of that information, such as pornography, violence, and cult material, is harmful. As a result, how to filter out information irrelevant to our needs and pick out the information we want has become a hot topic in Internet research.

This thesis studies network information filtering and covers every phase of the filtering process. The following topics are the research focus, centered on two measures: precision and recall.

1. It analyzes the research background and current status of network information filtering, clarifies the relation between information filtering and information retrieval, and identifies the evaluation indicators of filtering effectiveness. First, it analyzes the research background and current status and points out that algorithms and feature extraction are the key issues. Algorithm research shows two trends: traditional keyword matching is being replaced by natural language techniques, and manual input of the profile is being replaced by machine learning. Second, it describes the relation between information filtering and information retrieval. Last, it studies the two indicators of filtering effectiveness and points out the relation between precision and recall.

2. It analyzes and compares the key technologies in network information filtering. It studies automatic Chinese word segmentation, introduces the word segmentation algorithms, and finds that dictionary-based segmentation algorithms perform better than those without a dictionary. It studies the feature extraction methods DF (document frequency), IG (information gain), CHI (chi-square), TS (term strength), and MI (mutual information), finds that their extraction efficiency ranks CHI > MI > DF > TS > IG, and combines several of them logically to obtain better results. It analyzes the information filtering models (the Boolean logic model, the vector space model, and the probabilistic inference model) and clarifies their advantages and disadvantages. It also studies the text classification algorithms: the Rocchio method, KNN, and the Naïve Bayes method. Experiments show that the Rocchio classification method gives better results when used with the vector space model.

3. It proposes a network information filtering model based on genetic algorithms and compares it with the traditional network information filtering models. The innovations are as follows. First, it proposes a correction algorithm for the profile. Second, it proposes a new matching algorithm between the profile and the document waiting to be filtered. Third, it proposes a relevance feedback algorithm for modifying the profile. Last, it simulates four network information filtering models (the Boolean logic model, the vector space model, the probabilistic inference model, and the model based on genetic algorithms) and finds that the model based on genetic algorithms performs better than the others.

4. It designs and implements an information filtering system based on genetic algorithms. The system is composed primarily of five modules: the profile generation module, the profile reconstruction module, the packet capturing module, the network protocol analysis module, and the filtering module. It is optimized in three respects: appropriately reducing the length of the classification feature vector, reducing the number of raw packets, and reducing the length of the information. The network information filtering system performs well either as a separate monitoring node or as part of gateway software. The system adopts three filtering mechanisms: core filtering, specific information domain filtering, and text feature vector filtering. Testing shows that the system runs reliably, stably, and effectively.
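The relation between precision and recall discussed in the abstract can be illustrated with a minimal sketch of profile matching in a vector space model. This is not the thesis's algorithm: the cosine similarity measure, the fixed threshold, the function names (`cosine`, `filter_documents`, `precision_recall`), and the toy term weights are all assumptions chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def filter_documents(profile, docs, threshold):
    """Return the ids of documents whose similarity to the profile
    is at least `threshold` (an assumed, hand-picked cut-off)."""
    return {doc_id for doc_id, vec in docs.items()
            if cosine(profile, vec) >= threshold}


def precision_recall(selected, relevant):
    """Precision = hits / selected, recall = hits / relevant."""
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data: a user profile, four documents, and the set of
# documents a human judge would call relevant.
profile = {"genetic": 1.0, "filtering": 1.0}
docs = {
    "d1": {"genetic": 2.0, "filtering": 1.0},
    "d2": {"filtering": 1.0, "network": 1.0},
    "d3": {"sports": 3.0},
    "d4": {"genetic": 1.0, "network": 3.0},
}
relevant = {"d1", "d4"}

selected = filter_documents(profile, docs, threshold=0.4)
precision, recall = precision_recall(selected, relevant)
print(sorted(selected), precision, recall)
```

With these toy values, a threshold of 0.4 selects d1 and d2, giving precision 0.5 and recall 0.5. Lowering the threshold to 0.2 also admits d4, which raises recall to 1.0 while precision becomes 2/3; raising it to 0.9 keeps only d1, which raises precision to 1.0 while recall stays at 0.5. Tuning a filter therefore usually means trading one measure against the other.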
Keywords/Search Tags: network information filtering, information filtering model, genetic algorithm, relevance feedback, profile