
Analysis On Public Opinion Monitoring Of Microblogging Based On The Hadoop

Posted on: 2019-07-06  Degree: Master  Type: Thesis
Country: China  Candidate: J Feng  Full Text: PDF
GTID: 2348330566464274  Subject: Computer Science and Technology
Abstract/Summary:
Internet public opinion refers to the influential, tendentious comments and opinions that the public expresses through the Internet about real-life hot spots and issues of focus. The rapid development of social media in recent years has formed an intricate web of social networks. Social media is subtly changing the way people interact, and the microblog is a revolutionary development within it. The public comments on the development of and changes in the social issues it cares about, expressing its opinions and attitudes. Influential and tendentious remarks can spread negative sentiment as well as positive information. It is therefore imperative to monitor and analyze public opinion well and to give play to its guiding role. In this paper, we use Hadoop HDFS and the MapReduce computing model for the pre-processing and storage of the collected data, to address the limitations of analyzing and processing massive data. The sentiment analysis module provides analysis of emotional tendency and detection of sensitive topics. The major work and innovative points of this paper are as follows:
1) In short text clustering, sparse feature words and the complexity of processing a high-dimensional space are common problems. Because microblog posts are limited in length and their features are sparse, the feature vectors are high-dimensional, which obscures the clustering results. A Latent Dirichlet Allocation (LDA) topic model is trained on the data, and its topic terms are added to the features of the original microblog posts to enrich the category features and improve the clustering results. Our experiment combines the K-means and Canopy clustering algorithms to process the text data, and the results achieve higher accuracy and F1-measure: the F1 value improved by 10% and the accuracy by 2%.
2) We use a Bayesian classifier and Hadoop to handle the huge amounts of data, together with the Chinese Academy of Sciences word segmentation system and the iFLYTEK vocabulary. Finally, the basic situation of the public opinion analysis, the proportions of the sentiment analysis, and the trend of public opinion are presented.
3) Sensitive topic detection: in the public opinion analysis stage, the sensitivity of a text is calculated and compared with a preset sensitivity threshold to judge whether the text is a sensitive topic. If the sensitivity of the text is greater than the threshold, the document is judged to be sensitive text, which achieves automatic detection of sensitive topics.
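The threshold test for sensitive topics can be illustrated with a minimal Python sketch. The abstract does not state how the sensitivity of a text is computed, so the weighted term-frequency score, the lexicon `SENSITIVE_TERMS`, its weights, and the default threshold of 0.2 are all assumptions made for illustration, not the thesis's actual method.

```python
from typing import Dict, List

# Hypothetical lexicon mapping a sensitive term to a weight.
# The source does not give the scoring formula; this is an assumption.
SENSITIVE_TERMS: Dict[str, float] = {"riot": 1.0, "protest": 0.6, "rumor": 0.4}


def sensitivity(tokens: List[str],
                lexicon: Dict[str, float] = SENSITIVE_TERMS) -> float:
    """Summed weight of sensitive tokens divided by the token count."""
    if not tokens:
        return 0.0
    return sum(lexicon.get(t, 0.0) for t in tokens) / len(tokens)


def is_sensitive(tokens: List[str],
                 threshold: float = 0.2,
                 lexicon: Dict[str, float] = SENSITIVE_TERMS) -> bool:
    """Flag a text whose sensitivity is strictly greater than the threshold."""
    return sensitivity(tokens, lexicon) > threshold
```

For example, `is_sensitive(["the", "protest", "became", "a", "riot"])` compares a score of (0.6 + 1.0) / 5 against the threshold. In a Hadoop setting, a check of this kind would typically run per document inside a map step, with segmented tokens as input.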
Keywords/Search Tags: Social media, Public opinion, Clustering algorithm, Hadoop