The Research And Application Of Feedback Mechanism Based On Network Information Filtering

Posted on: 2011-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: L W Zhang
Full Text: PDF
GTID: 2178360308965191
Subject: Computer software and theory
Abstract/Summary:
With the continuing expansion of the Internet, network information resources are growing exponentially, and the time needed to find relevant information increases day by day. At the same time, there is a great deal of pornography, crime-related material, and other harmful information, which seriously affects the construction of socialist spiritual civilization. It is therefore necessary to explore effective information filtering methods that identify the information meeting users' requirements and block useless and illegal information in the dynamic flow of information.

There are many approaches to information filtering. Content analysis is the most effective because it is resistant to cheating. The performance of a filtering system is determined by the performance of its category template. Category templates are limited, because a finite training corpus cannot cover all of a category's content. Over time a category acquires many new features, which makes the original classifier obsolete. If the original classifier is still used to classify network texts, it may cause errors, omissions, and other classification problems. Feedback is an effective way to adjust and improve the classification model using dynamic information.

Starting from the problems that exist in network information filtering, this thesis studies each key technology of content analysis and proposes a feedback mechanism that updates the template in real time. Although exploratory, the work on a feedback mechanism for network information filtering has strong theoretical and practical significance. The main topics studied are as follows:

1. The research and improvement of the key technologies of the vector space model
The key technologies of the vector space model include packet interception, feature selection, weight calculation, and classification algorithms.
This thesis studies these key technologies of the vector space model and improves them with the aim of raising the efficiency and accuracy of classification. SPI technology is used to intercept network information. Feature selection uses stop-word removal, Zipf's law, and a combination of feature selection algorithms. For text representation, the network information is first preprocessed, and the features are then calculated separately for web pages and e-mails.

2. The feedback mechanism of network information filtering
Because of the limitations of the template, the variability of network information, and the importance of unlabeled documents, this thesis focuses on the feedback mechanism. User feedback technology is studied in terms of the acquisition, use, and evaluation of feedback information. Feedback algorithms are proposed separately for implicit feedback and pseudo-feedback in order to optimize the template. Finally, the feedback mechanism is implemented in the network information filtering system and simulation experiments are carried out. The experiments show that the proposed feedback mechanism plays a significant role in improving the performance of network information filtering.

3. Mail client filtering
Web pages and e-mails are the two major carriers of network information. Mail client filtering is studied on the basis of the research on web information filtering. E-mails are intercepted according to the POP3 protocol, then analyzed and filtered by a multilayer filtering mechanism.

4. The design and implementation of the network information filtering system
Each module is designed on the basis of the study of each key technology in the vector space model. Ultimately, a multi-strategy, multi-level, efficient network information filtering system is implemented.
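
The pseudo-feedback idea described in item 2 can be illustrated with a minimal Python sketch. The abstract does not state the algorithm the thesis actually uses, so the sketch substitutes a Rocchio-style template update as an assumption: documents that the current category template already matches with high similarity are treated as positive examples without user labels, and their terms are folded back into the template. The names tf_vector, cosine, and pseudo_feedback, the stop-word list, and the threshold, alpha, and beta values are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Illustrative stop-word list (an assumption, not the thesis's list).
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tf_vector(text):
    """Return a length-normalised term-frequency vector as a dict."""
    counts = Counter(tokenize(text))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pseudo_feedback(template, docs, threshold=0.5, alpha=1.0, beta=0.5):
    """Rocchio-style update (assumed, not the thesis's algorithm).

    Unlabeled documents whose similarity to the template is at least
    `threshold` are treated as positive examples; their averaged
    vectors, scaled by `beta`, are added to the template scaled by
    `alpha`.
    """
    vectors = [tf_vector(d) for d in docs]
    confident = [v for v in vectors if cosine(template, v) >= threshold]
    if not confident:
        return dict(template)
    updated = {t: alpha * w for t, w in template.items()}
    for vec in confident:
        for term, weight in vec.items():
            updated[term] = updated.get(term, 0.0) + beta * weight / len(confident)
    return updated
```

Calling pseudo_feedback(tf_vector("cheap pills buy now"), ["buy cheap pills online now", "meeting agenda for monday"]) adds the new term "online" to the template while ignoring the unrelated second document, which is the kind of drift-tracking the feedback mechanism is meant to provide.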
Keywords/Search Tags: Text Representation, Feature Selection, Implicit Feedback, Pseudo-feedback