
Key Technology Research And Prototype System Implementation Of Weibo Public Opinion Monitoring

Posted on: 2019-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Qin
Full Text: PDF
GTID: 2348330563453967
Subject: Computer application technology
Abstract/Summary:
With the spread of the Internet and mobile handheld devices, Weibo has grown rapidly in a short time thanks to its low barrier to entry and the freedom it offers, becoming an important venue for people to exchange and acquire information. Among the many microblog platforms, Sina Weibo is the most popular: its monthly active users reached 376 million in 2017, and more than 100 million new posts are generated per day. Weibo's user base is therefore very large, and the platform carries rich and valuable information. This information subtly influences people's lives and, to a certain extent, has affected the development of society.

The volume of information on Weibo is so large that sorting through its wide variety has become a pressing need. It is necessary to dig out influential hot topics from hundreds of millions of microblogs, supervise and regulate online behavior, create a good online environment, and extract valuable information. Doing so plays an important role in promoting the harmonious development of society, building a healthy network environment, and guiding Internet media in a positive direction. A Weibo public opinion monitoring system is used to mine and analyze hot events.

This thesis studies the key technologies related to Weibo public opinion monitoring and describes the implementation of a prototype system. The work covers the following aspects:

Firstly, the thesis introduces the purpose of Weibo public opinion research and the state of research at home and abroad. It then introduces two methods of collecting Weibo data: a web crawler and the Sina Weibo API. A web crawler is the traditional way of obtaining web page data, while the Weibo open platform is an official API that users can call.

Secondly, data preprocessing begins with Chinese word segmentation, for which this thesis uses the ICTCLAS system. ICTCLAS segments text quickly and accurately, and it supports user-defined custom dictionaries. After segmentation, keywords are extracted and the TF-IDF algorithm is used for feature extraction. The thesis surveys several common text representation models and adopts the widely used vector space model (VSM) for text representation.

Finally, in the public opinion analysis stage, the most important task is to cluster texts with an appropriate and effective clustering algorithm. The thesis compares the advantages and disadvantages of several traditional clustering algorithms and, focusing on the particular characteristics of microblog text, proposes and verifies an improved algorithm. The clustering results are the basis for subsequent Weibo public opinion analysis, hot topic detection, and text orientation analysis.
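The preprocessing steps named in the abstract (TF-IDF weighting followed by a vector space representation) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not state which TF-IDF variant is used, so the sketch assumes relative term frequency and an unsmoothed logarithmic IDF. The function names `tf_idf_vectors` and `cosine_similarity` are hypothetical, and the English tokens stand in for the output of a segmenter such as ICTCLAS.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Turn segmented documents into sparse TF-IDF vectors.

    docs: list of token lists (segmentation is assumed to be done already).
    Returns one dict per document, mapping term -> TF-IDF weight.
    Assumed variant: tf = count / doc length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        vec = {}
        for term, count in counts.items():
            tf = count / total
            idf = math.log(n_docs / df[term])
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder posts, already split into tokens (illustrative data only).
docs = [
    ["earthquake", "rescue", "sichuan"],
    ["earthquake", "sichuan", "aid"],
    ["film", "premiere", "star"],
]
vectors = tf_idf_vectors(docs)
print(cosine_similarity(vectors[0], vectors[1]))  # posts sharing terms
print(cosine_similarity(vectors[0], vectors[2]))  # posts with no shared terms
```

A similarity measure of this kind is what a text clustering algorithm would typically consume: posts whose vectors are close are candidates for the same topic cluster. The improved clustering algorithm proposed in the thesis is not described in the abstract and is not reproduced here.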
Keywords/Search Tags: Internet public opinion, web crawler, text segmentation, text processing, clustering algorithm