Research on Weibo Public Opinion Analysis Based on Clustering and Deep Learning

Posted on: 2022-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: M H Li
GTID: 2518306788955649
Subject: Journalism and Media
Abstract/Summary:
Since the end of 2019, the COVID-19 pandemic has broken out, spread and mutated around the world, causing widespread concern and active discussion. With the rapid development of Internet technology, online social platforms such as Sina Weibo have become the main channel through which people publish and obtain information, and an important carrier for the development and dissemination of online public opinion. How to use clustering, deep learning and other technical methods to detect the topics people care about most and the associated emotional trends in massive volumes of public opinion data, so as to guide the dissemination of positive public opinion, is a meaningful research question today. This thesis takes "Research on Weibo Public Opinion Analysis Based on Clustering and Deep Learning" as its title. Building on domestic and international literature, it adds its own improvements. The research proceeds in two directions, clustering and deep learning, and is applied to the analysis of Weibo public opinion on a specific hot event, COVID-19.

(1) Data acquisition and preprocessing. A microblog crawler themed on COVID-19 is developed in Python. Blog posts and comments are crawled together, which addresses the problem of incomplete data for public opinion analysis and prepares the data for the subsequent experiments. The preprocessing stage includes data cleaning, Chinese word segmentation and text vectorization. A Word2Vec model is used to train static word vectors, which allows semantic features to be extracted more accurately.

(2) Topic detection. To address the shortcomings of the traditional K-means clustering algorithm, an improved K-means algorithm based on Jaccard distance is proposed. The k most dissimilar documents, as measured by Jaccard distance, are chosen as the initial cluster centers, which reduces the number of iterations and improves clustering accuracy. In the experimental part, a Chinese text data set is used to compare the improved algorithm with the traditional algorithm and verify its feasibility. The silhouette coefficient of the improved algorithm is better than that of the traditional algorithm. In the application stage, the preprocessed microblog data is clustered, and hot topics and new topics are detected from the word frequency map and the word cloud map.

(3) Public opinion analysis. This thesis proposes introducing self-Attention and Maxout neurons on the basis of the CNN and LSTM neural network models, in order to extract features effectively, capture contextual semantics better and mitigate the vanishing-gradient problem. In the experimental part, CNN, LSTM, self-Attention and the proposed CNN-Attention models are all constructed. Comparative experiments verify that the proposed model outperforms the standard models on the F1 evaluation index. In the application stage, sentiment classification and prediction are performed on the preprocessed blog posts and comments, and the curve of sentiment value over time is obtained. This curve is used to analyze the sentiment trend of netizens and the trend of public opinion, which can help relevant government departments assess and intervene in public opinion.
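The initialization idea in (2), choosing the k most dissimilar documents by Jaccard distance as the starting cluster centers, could be sketched roughly as below. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not say how the k documents are selected, so a greedy farthest-first rule is assumed here, documents are assumed to be represented as sets of segmented tokens, and the function names `jaccard_distance` and `pick_initial_centers` are hypothetical.

```python
from itertools import combinations
from typing import List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Jaccard distance 1 - |A ∩ B| / |A ∪ B| between two token sets."""
    union = a | b
    if not union:
        # Assumption: two empty documents are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pick_initial_centers(docs: List[Set[str]], k: int) -> List[int]:
    """Return indices of k mutually dissimilar documents.

    Assumed selection rule (greedy farthest-first): seed with the most
    distant pair, then repeatedly add the document whose nearest
    already-chosen center is farthest away.
    """
    if not 1 <= k <= len(docs):
        raise ValueError("k must be between 1 and the number of documents")
    if k == 1:
        return [0]

    # Seed with the pair of documents that has the largest Jaccard distance.
    first, second = max(
        combinations(range(len(docs)), 2),
        key=lambda pair: jaccard_distance(docs[pair[0]], docs[pair[1]]),
    )
    chosen = [first, second]

    while len(chosen) < k:
        candidates = [i for i in range(len(docs)) if i not in chosen]
        # Pick the candidate that is least similar to every chosen center.
        nxt = max(
            candidates,
            key=lambda i: min(jaccard_distance(docs[i], docs[c]) for c in chosen),
        )
        chosen.append(nxt)
    return chosen


if __name__ == "__main__":
    # Toy token sets standing in for segmented microblog posts.
    docs = [
        {"a", "b"},
        {"a", "b", "c"},
        {"x", "y"},
        {"x", "y", "z"},
        {"m", "n"},
    ]
    print(pick_initial_centers(docs, 3))  # -> [0, 2, 4]
```

The returned indices would then seed an ordinary K-means loop in place of random initialization. Starting from well-separated documents is what the abstract credits for the reduced number of iterations.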
Keywords: Clustering, Deep Learning, Sentiment Analysis, Word2Vec