Research On Network Public Opinion Based On Text Tendency Analysis

Posted on: 2018-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z Zhang
GTID: 2428330596465404
Subject: Electronic Science and Technology

Abstract/Summary:
In recent years, the continuous development of network media such as microblogs and headline news services has changed the way people live, and these platforms constantly produce data closely tied to people's lives. The data carry a large amount of information and reflect the subjective emotions of their authors, so the analysis of network data has drawn increasing attention. Because the network is open and convenient, the public can express opinions online, which gives the government and regulatory authorities an effective channel for understanding public opinion accurately and quickly. By monitoring online public opinion on hot-spot events, the regulatory authorities can quickly learn what the public thinks and respond and decide appropriately, which is of great significance for maintaining social stability.

This thesis describes research on online public opinion about hot topics from three aspects. First, for the data collection stage, it discusses topic-related collection with a crawler, the problems met when processing network data, and their solutions. Second, it covers the discovery of new words and the identification of their sentiment orientation. Third, it analyzes the sentiment of hot-topic texts and, on that basis, analyzes online public opinion. The main contents are as follows:

1) For hot topic collection, the thesis uses a focused crawler to fetch topic-related web pages from the internet. To address the shortcomings of keyword extraction in traditional focused crawlers, it proposes an improved text keyword extraction method that, building on prior literature, adds machine-learning-based dependency parsing to the keyword extraction process. To improve the quality of link priority evaluation, text distance information is added to the similarity between the anchor text context and the keywords when the focused crawler ranks links. In implementing the focused crawler, problems such as URL deduplication and IP restrictions are solved.

2) Network text is colloquial and informal, and it often contains words that are not in the dictionary. The thesis discovers new words from a network text corpus and determines their sentiment orientation. New words are discovered by means of word co-occurrence frequency, left and right entropy, and word segmentation tools. Based on the discovered new words, the SO-PMI method and an existing sentiment dictionary are used to identify sentiment words; the new sentiment words are added to the sentiment dictionary, which expands it and provides a basis for sentiment analysis of microblog text.

3) For public opinion analysis of hot topics, the thesis compares feature extraction methods including frequency, information gain, and the chi-square statistic, and chooses feature selection based on the chi-square statistic. Negation word features are added to the model, and a semi-supervised learning model is trained to judge the sentiment of network text, from which the public opinion analysis results are obtained.
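
Two of the measures named above, left and right entropy and SO-PMI, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the character-level neighbor counting, the document-level co-occurrence window, and the choice to score unseen pairs as zero are all assumptions made for illustration, and the toy corpus and seed words are hypothetical.

```python
import math
from collections import Counter


def branch_entropy(neighbors):
    """Shannon entropy (natural log) of the symbols adjacent to a candidate."""
    counts = Counter(neighbors)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def left_right_entropy(corpus, candidate):
    """Return (left entropy, right entropy) of a candidate string in a corpus.

    Assumption: neighbors are single characters. A high value on both sides
    suggests the candidate appears in varied contexts, i.e. behaves like a word.
    """
    left, right = [], []
    start = corpus.find(candidate)
    while start != -1:
        end = start + len(candidate)
        if start > 0:
            left.append(corpus[start - 1])
        if end < len(corpus):
            right.append(corpus[end])
        start = corpus.find(candidate, start + 1)
    return branch_entropy(left), branch_entropy(right)


def so_pmi(word, docs, pos_seeds, neg_seeds):
    """SO-PMI(word) = sum of PMI with positive seeds - sum of PMI with negative seeds.

    Assumptions: docs is a list of token lists, co-occurrence means "appears in
    the same document", and a pair that never co-occurs contributes 0 instead
    of negative infinity.
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]

    def doc_freq(*terms):
        return sum(1 for d in doc_sets if all(t in d for t in terms))

    def pmi(a, b):
        joint, fa, fb = doc_freq(a, b), doc_freq(a), doc_freq(b)
        if joint == 0 or fa == 0 or fb == 0:
            return 0.0  # simplification: no evidence, no contribution
        return math.log2((joint / n) / ((fa / n) * (fb / n)))

    positive = sum(pmi(word, seed) for seed in pos_seeds)
    negative = sum(pmi(word, seed) for seed in neg_seeds)
    return positive - negative


# Hypothetical toy data: two slang tokens and four seed words.
TOY_DOCS = [
    ["geili", "good"], ["geili", "great"],
    ["kengdie", "bad"], ["kengdie", "awful"],
    ["good", "great"], ["bad", "awful"],
]
POS_SEEDS = ["good", "great"]
NEG_SEEDS = ["bad", "awful"]
```

With the toy data, `so_pmi("geili", TOY_DOCS, POS_SEEDS, NEG_SEEDS)` is positive and the score for `"kengdie"` is negative, so a new word would be added to the positive or negative side of the sentiment dictionary by the sign of its score. In practice a threshold on the absolute score and on both entropies would be needed; the thesis abstract does not state which values were used.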
Keywords/Search Tags: public opinion analysis, tendency analysis, new word discovery, focused crawler