Research On Network Information Filtering Model Based On Genetic Taboo Algorithm

Posted on: 2012-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: P P Jiang
GTID: 2178330332989990
Subject: Computer software and theory
Abstract/Summary:
With the development and spread of the internet, network information is growing rapidly, rich in content and varied in form. However, every coin has two sides: while enjoying the convenience of the internet, users must also face negative information. In addition, because the internet is open, dynamic, and heterogeneous, it is hard to find the information one needs, and automatically extracting information that meets a user's personalized demands from a dynamic information flow has become more important than ever. Network information filtering technology emerged to solve these problems. A network information filter extracts the information the user needs and blocks negative information; research on it focuses primarily on the acquisition and representation of information, the construction of the user template, and text classification.

This thesis covers each stage of network information filtering and, taking filtering accuracy and filtering speed as the two main indexes of the filter model, studies the following aspects:

1. This thesis studies in depth the filter models used in network information filtering and their key technologies

The thesis first discusses typical information filter models and related algorithms. It then examines the key technologies used in network information filtering, including network data acquisition, word segmentation, feature selection, feature weight calculation, text representation models, and classification algorithms.

2. This thesis proposes a network information filter model based on a genetic taboo algorithm

The thesis discusses the basic principles and applications of the genetic algorithm in depth. Building on an analysis of its advantages, and to address its weak hill-climbing ability and its tendency toward premature convergence, the thesis introduces the taboo search algorithm, which has strong hill-climbing ability, into the crossover operator. The resulting taboo crossover operator improves the search capacity of the traditional genetic algorithm. In the classification stage of the filter model, the traditional Naive Bayesian classifier cannot handle single-category words, so the thesis improves the classifier to make it more robust and adaptable.

3. This thesis proposes a text summarization method that applies vocabulary combination to sentence extraction

A text contains many sentences, but some of them do not express its theme, and these redundant sentences degrade the quality of the user template. Text summarization, as an information compression tool, can compress text content, remove redundant sentences, and extract the most refined content. To improve template quality, the thesis introduces text summarization to optimize the corpus. During extraction, the lexical analysis system in use has low segmentation accuracy, which causes semantic loss between features. The thesis therefore formulates amendment rules that are applied to the sentences produced by word segmentation. The rules regulate vocabulary combination according to part of speech, so that semantically related words in the same sentence are linked appropriately. The proposed summarization method makes the extracted content more refined and accurate.

4. This thesis designs and implements a network information filter model based on the genetic taboo algorithm

The system first preprocesses the training corpus with the improved text summarization method, then trains on the text with the improved genetic algorithm to form the best user template, and finally classifies text with the improved classification algorithm. The result is a multi-level, multi-policy, modular network information filter system based on the genetic taboo algorithm. Testing shows that the system runs reliably and stably and filters effectively.
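
The abstract does not say how taboo search is embedded in the crossover operator, so the following is only a minimal sketch of one plausible reading, not the thesis's actual method. It assumes a user template is encoded as a tuple of 0/1 feature flags, that the taboo list holds recently generated children, and that a taboo child is still accepted when it beats the best score so far (a common aspiration rule). All names (`taboo_crossover`, `evolve`, `toy_fitness`, `USEFUL`) and all parameter values are hypothetical.

```python
import random
from collections import deque


def taboo_crossover(parent_a, parent_b, fitness, taboo, best_score, rng, tries=10):
    """One-point crossover that rejects recently seen (taboo) children.

    A taboo child is accepted only if it beats best_score (aspiration rule).
    If every try is rejected, the fitter parent is returned unchanged.
    """
    for _ in range(tries):
        cut = rng.randrange(1, len(parent_a))        # cut point strictly inside
        child = parent_a[:cut] + parent_b[cut:]
        if child not in taboo or fitness(child) > best_score:
            taboo.append(child)                      # a full deque drops its oldest entry
            return child
    return max(parent_a, parent_b, key=fitness)


def evolve(fitness, n_bits, pop_size=20, generations=40, taboo_len=15, seed=0):
    """Toy genetic loop: elitism, truncation selection, taboo crossover, mutation."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(pop_size)]
    taboo = deque(maxlen=taboo_len)
    best = max(pop, key=fitness)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]               # keep the fitter half as parents
        next_pop = [best]                            # elitism: the best is never lost
        while len(next_pop) < pop_size:
            a, b = rng.sample(parents, 2)
            child = taboo_crossover(a, b, fitness, taboo, fitness(best), rng)
            if rng.random() < 0.1:                   # occasional single-bit mutation
                i = rng.randrange(n_bits)
                child = child[:i] + (1 - child[i],) + child[i + 1:]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best


# Hypothetical fitness: reward templates that keep the "useful" features
# (indices 0-4) and penalise every extra feature that is switched on.
USEFUL = set(range(5))


def toy_fitness(bits):
    kept = {i for i, b in enumerate(bits) if b}
    return 2 * len(kept & USEFUL) - len(kept - USEFUL)


best = evolve(toy_fitness, n_bits=12)
```

In this sketch the taboo list pushes crossover away from children it has produced recently, which is one way to counter the premature convergence described above. In a real filter the toy fitness would be replaced by a measure of how well a candidate template separates the training texts.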
Keywords/Search Tags: network information filter, genetic taboo algorithm, Naive Bayesian, lexical analysis, text summarization