
The Design And Analysis Of Topic Crawler For Electronic Public Opinion

Posted on: 2015-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: J Fan
GTID: 2298330467975255
Subject: Computer application technology
Abstract/Summary:
With the progress of society, the Internet has gradually become an important platform for people to express their opinions. Compared with traditional media, the network is characterized by rapid dissemination, anonymous users, and similar features. Because of these characteristics, online public opinion plays a valuable supervisory role on the one hand, but on the other hand it can easily carry reactionary, superstitious, or pornographic content, which threatens social stability and national security. It is therefore very important to capture, in a timely manner, the public opinion that users currently care about from the vast amount of network information, and to help the government promptly understand the current direction of public opinion.

Using information technology such as search engines to collect and monitor online public opinion is a practical and effective approach. This paper proposes an improved SVM classifier algorithm based on online incremental learning. The algorithm improves the traditional SVM classifier used in topic crawlers: the positive and negative samples in the historical training set that influence training are selected and combined with the incremental set to form a complete training set, with the aim of improving the acquisition rate. On the basis of this algorithm, a topic crawler framework is built and applied to crawling online public opinion. The experimental results show that it can effectively improve information acquisition during the collection of online public opinion.

The specific research work of this paper is as follows. First, information acquisition and preprocessing techniques: the paper studies how to collect data from loose, unstructured information, using topic crawler technology, web page purification technology, and Chinese word segmentation technology to achieve automatic acquisition and structured storage of network information. Second, the SVM classification algorithm: a topic is set manually, a topic vector model is obtained from the training set, the relevance between the trained vector model and each crawled page is computed, and the highly relevant pages are selected. The experimental data show that, in precision, recall, and the crawler's acquisition rate, the improved SVM classification method is clearly better than the traditional SVM classification method. Third, a prototype system for crawling online public opinion: on the basis of the above research, the paper implements a topic crawler system for online public opinion and uses it to crawl public opinion data. In practical application, the system performs well.
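The incremental retraining idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not state how influential historical samples are chosen, so the margin-based rule in `select_influential`, the plain subgradient trainer, the toy two-dimensional feature vectors, and every function and parameter name below are assumptions made for illustration only.

```python
from typing import List, Tuple

Sample = Tuple[List[float], int]   # (page feature vector, label: +1 on-topic, -1 off-topic)
Model = Tuple[List[float], float]  # (weight vector w, bias b)


def decision(model: Model, x: List[float]) -> float:
    """Signed score w.x + b; a positive score means the page is judged on-topic."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples: List[Sample], epochs: int = 200,
                     lam: float = 0.01, lr: float = 0.1) -> Model:
    """Linear SVM fitted by batch subgradient descent on the regularized hinge loss."""
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the L2 regularizer
        grad_b = 0.0
        for x, y in samples:
            if y * decision((w, b), x) < 1:  # sample violates the margin
                for i, xi in enumerate(x):
                    grad_w[i] -= y * xi / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def select_influential(model: Model, history: List[Sample],
                       tol: float = 1.2) -> List[Sample]:
    """Assumed selection rule: keep historical samples on or near the margin."""
    return [(x, y) for x, y in history if y * decision(model, x) <= tol]


def incremental_update(model: Model, history: List[Sample],
                       increment: List[Sample]) -> Tuple[Model, List[Sample]]:
    """Retrain on the retained historical samples plus the new incremental set."""
    training_set = select_influential(model, history) + increment
    return train_linear_svm(training_set), training_set


# Toy data: on-topic pages cluster in the upper right, off-topic near the origin.
history = [([2.0, 2.0], 1), ([3.0, 3.0], 1), ([2.0, 3.0], 1),
           ([0.0, 0.0], -1), ([-1.0, 0.0], -1), ([0.0, -1.0], -1)]
increment = [([3.0, 2.0], 1), ([1.0, 0.0], -1), ([0.0, 1.0], -1)]

model = train_linear_svm(history)
new_model, new_training_set = incremental_update(model, history, increment)
```

In a crawler, the feature vectors would come from the segmented page text rather than from hand-written points, and `incremental_update` would be called each time a new batch of labeled pages arrives. Retaining only near-margin samples keeps the retraining set small, at the cost of possibly discarding samples that would matter after the decision boundary shifts.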
Keywords/Search Tags: Topic crawler, SVM, Incremental learning, Network public opinion