
Public Opinion Analysis Of Weibo Based On Density Peak Fusion K-means Clustering Algorithm

Posted on: 2021-03-10 | Degree: Master | Type: Thesis
Country: China | Candidate: J M Ye | Full Text: PDF
GTID: 2428330611497691 | Subject: Engineering
Abstract/Summary:
Social media has developed rapidly in recent years, and Weibo has become an indispensable platform in people's daily lives. Public opinion on Weibo has a strong influence on society: it reflects the public's views and attitudes toward events in real time, and it is an important channel through which the government can follow public sentiment and the development of events and through which companies can understand public opinion. Timely and accurate analysis of public opinion on Weibo is therefore of great significance for correctly predicting and guiding the development of social events, for promoting national economic construction, and for maintaining social stability and unity. This thesis studies microblog public opinion analysis technology in depth, covering four aspects: microblog data acquisition, microblog text preprocessing, text clustering, and analysis of public opinion results. The research work consists of the following three parts.

The first part fuses the density peak algorithm (CFSFDP) with the K-means algorithm to cluster microblog text. The density peak algorithm finds cluster centers quickly and accurately, which compensates for the instability caused by the random selection of initial cluster centers in K-means. However, the density peak algorithm has two shortcomings: the cutoff distance must be set manually, and the cluster centers are selected subjectively. This thesis proposes a cutoff distance selection strategy that finds the optimal cutoff distance, then normalizes the local density and the relative minimum distance, introduces a slope-change calculation to determine the cluster centers automatically, and finally uses the simple, easy-to-operate K-means algorithm for iterative clustering. The fusion algorithm is compared experimentally with other algorithms to verify its accuracy and stability in text clustering and its ability to uncover hot topics on Weibo.

The second part analyzes the Weibo API collection process, user authentication, web page capture, and the Weibo site. In the web crawler part, a self-developed crawler script simulates browser requests and then stores and parses the data to crawl microblog data. The two data collection methods, the Weibo API and the web crawler, are compared experimentally, their advantages and disadvantages are analyzed, and the collection method is selected according to the experimental conditions and requirements of this thesis.

The third part applies the technical and theoretical research of the first two parts to practical Weibo public opinion analysis. Data is the most important element of public opinion analysis. For data processing, Python is used to perform data cleaning, Chinese word segmentation, stop-word removal, feature weight calculation, and text vector representation. For public opinion analysis, the Boson NLP sentiment dictionary is used for sentiment analysis, and the Aho-Corasick (AC) automaton algorithm is used to detect sensitive information in Weibo public opinion.
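The idea behind the first part, using density peaks to seed K-means, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `density_peak_centers` and `kmeans` are hypothetical, the cutoff distance `d_c` and the number of clusters `k` are passed in by hand (the thesis derives them with its own cutoff selection strategy and slope-change method, which the abstract does not specify), and plain Euclidean distance on small tuples stands in for distances between text vectors.

```python
import math

def density_peak_centers(points, d_c, k):
    """Pick k initial centers by the density-peak score gamma = rho * delta.

    d_c and k are supplied by the caller here (an assumption of this sketch).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # delta_i: distance to the nearest point that precedes i in descending
    # density order (ties keep index order); the densest point gets its
    # largest distance to any other point.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    for pos, i in enumerate(order):
        if pos == 0:
            delta[i] = max(dist[i])
        else:
            delta[i] = min(dist[i][j] for j in order[:pos])

    def normalize(values):
        # Min-max normalization so rho and delta are on the same scale.
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0
        return [(v - lo) / span for v in values]

    gamma = [r * d for r, d in zip(normalize(rho), normalize(delta))]
    top = sorted(range(n), key=lambda i: -gamma[i])[:k]
    return [points[i] for i in top]

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iterations started from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers
```

Because the initial centers are chosen deterministically from the data, repeated runs on the same input give the same clustering, which is the stability benefit the abstract attributes to the fusion.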
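The sensitive-information detection mentioned in the third part relies on the Aho-Corasick automaton, which matches a whole keyword list in a single pass over the text. The following is a minimal sketch of that technique, not the thesis's code: the names `build_automaton` and `find_keywords` are hypothetical, and the keyword list in the usage note is a placeholder rather than an actual sensitive-word dictionary.

```python
from collections import deque

def build_automaton(keywords):
    """Build an Aho-Corasick automaton (trie + failure links) for keywords."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]    # out[state] -> keywords that end at this state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first traversal: failure links of shallower states are final
    # before deeper states are processed.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def find_keywords(text, automaton):
    """Return (start_index, keyword) for every keyword occurrence in text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits
```

For example, with the classic keyword set `["he", "she", "his", "hers"]`, scanning the text `"ushers"` reports `she` at index 1 and both `he` and `hers` at index 2. The automaton works on characters, so the same code applies to Chinese text without prior word segmentation.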
Keywords/Search Tags: microblog public opinion analysis, density peak, K-means algorithm, text clustering