Discovery And Analysis Of Hot Topics In Internet Public Opinion Monitoring System

Posted on: 2019-06-09   Degree: Master   Type: Thesis
Country: China   Candidate: S Guo   Full Text: PDF
GTID: 2428330548960176   Subject: Computer technology
Abstract/Summary:
With the development of the Internet, new media such as microblogs and WeChat have become increasingly popular channels for information dissemination. They offer varied and informative content, fast propagation, and wide reach. When a sensitive event occurs, public opinion gathers online quickly and can form a powerful collective voice. Internet public opinion events follow, and some may even endanger national harmony and stability. Discovering and analysing Internet public opinion hotspots is therefore particularly important. It helps decision makers quickly identify the topics Internet users are focused on, analyse and forecast the direction and trend of public opinion, and so prevent online public opinion storms in advance. Because Internet data is growing explosively, traditional techniques for acquiring and storing huge amounts of data increasingly fail to meet the requirements of practical applications. It is therefore important to design and implement real-time detection and analysis of Internet public opinion hotspots on the basis of big data technology.

Based on the above analysis, this thesis accomplishes the following work. First, a real-time incremental crawler for information published on the Sina microblog platform is designed and implemented, and a text extraction algorithm and an incremental web crawling method based on a tree structure are given. Second, a hot topic discovery model is provided; for the text pre-processing step of the model, a Chinese word segmentation algorithm based on the discovery of Internet neologisms is given, and its effectiveness is verified experimentally on the Hadoop platform. Third, for topic analysis, several clustering algorithms are compared and the classic incremental clustering algorithm Single-Pass is selected. To address its shortcomings, an improved Single-Pass clustering algorithm is given that reduces the algorithm's sensitivity to the order of the input data and improves the efficiency of the clustering process. Finally, the features of Internet public opinion are analysed and an Internet public opinion hotspot analysis model is built. The incremental data acquisition method presented in this thesis can gather huge amounts of data incrementally, reduce data duplication, and improve collection efficiency. Compared with traditional text mining models, which are inefficient on large amounts of data, the hot topic discovery and analysis method provided here alleviates this problem to some extent and has higher practical value.
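For readers unfamiliar with Single-Pass, the following is a minimal sketch of the classic baseline algorithm that the abstract names, not the improved variant proposed in the thesis (whose details the abstract does not give). It assumes documents are already segmented into tokens, represents each cluster by summed term frequencies, and compares with cosine similarity; the function names `cosine` and `single_pass` and the default threshold of 0.5 are illustrative assumptions.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(docs, threshold=0.5):
    """Classic Single-Pass clustering over pre-tokenised documents.

    Each document is seen exactly once, in input order. It joins the most
    similar existing cluster if the similarity reaches `threshold`
    (an assumed value), otherwise it starts a new cluster.
    Returns one cluster id per document.
    """
    centroids = []  # one summed term-frequency Counter per cluster
    labels = []
    for tokens in docs:
        vec = Counter(tokens)
        best_id, best_sim = -1, 0.0
        for cid, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_id, best_sim = cid, sim
        if best_id >= 0 and best_sim >= threshold:
            centroids[best_id].update(vec)  # fold the document into the cluster
            labels.append(best_id)
        else:
            centroids.append(vec)  # the document seeds a new cluster
            labels.append(len(centroids) - 1)
    return labels
```

Because each document is assigned as soon as it arrives and cluster representations change with every assignment, the result can depend on the order of the input, and every new document is compared against every existing cluster. These are the two weaknesses the abstract says the improved algorithm targets.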
Keywords/Search Tags: Public Opinion Information, Web Crawler, Hot Topic Discovery, Hadoop Platform, Micro-blog