
Design And Implementation Of The Topic Detection System Based On The Mass Public Opinion Information

Posted on: 2014-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: S C Wang
Full Text: PDF
GTID: 2268330392962810
Subject: Software engineering
Abstract/Summary:
The rapid growth of the Internet has made it an important carrier of public opinion. It is both a direct channel through which the government can understand the needs of the people and, under current conditions, an important front for the government's public opinion work. Poorly controlled or poorly guided online public opinion may affect social stability, so effective monitoring of online public opinion has practical significance.

This thesis stems from a public opinion monitoring project that monitors and analyzes online public opinion and provides a platform for guidance and control. The platform collects information from a variety of sources, such as news sites, forums, microblogs and blogs, and analyzes and presents public opinion information by category, site and region. Its functions cover the online public opinion of specific groups of people, social hot topics, public opinion abroad, online hype promoters, ad hoc collection of public opinion information, tracking, early warning, response, and management of government microblogs.

First, text data is extracted from the website database and preprocessed with Chinese word segmentation. Preprocessing removes noise words and stop words and splits the text into semantically reasonable segments. Second, an LDA model is built with Map/Reduce for distributed processing of large-scale text; it clusters the vocabulary of the preprocessed text and identifies the distribution of words over themes (topics). Finally, K-means is applied to the extracted vocabulary to analyze the keywords and hot topics of online public opinion.
This thesis presents a data mining model for public opinion information that can be used for analysis, clustering and extraction of implied themes, and it implements methods for storing, processing and analyzing massive volumes of public opinion data. Following software engineering design specifications, the online public opinion system was analyzed and designed, and the main functional modules of the system were implemented. In trial use, the system was able to extract actual hot topics from online intelligence information, achieving the intended goal.
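
The stages named in the abstract (stop-word removal, Map/Reduce-style aggregation, K-means clustering of vocabulary) can be illustrated with a minimal single-machine sketch. It is not the thesis's implementation. The stop-word list, the function names, and the whitespace tokenizer (standing in for Chinese word segmentation) are assumptions made for illustration. The LDA step is omitted: the K-means function expects per-word topic weights of the kind an LDA model would produce, and the sample weights below are made up.

```python
import math
from collections import Counter
from functools import reduce

# Hypothetical stop-word list; a real system would load a much larger one.
STOP_WORDS = {"the", "a", "of", "and", "in", "on", "is", "to"}


def preprocess(text):
    """Lower-case, split on whitespace, and drop stop words.

    Whitespace splitting stands in for Chinese word segmentation here.
    """
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def map_phase(doc):
    """Map step: produce the term counts of a single document."""
    return Counter(preprocess(doc))


def reduce_phase(left, right):
    """Reduce step: merge two partial term counts."""
    return left + right


def term_counts(docs):
    """Run the map and reduce steps over all documents in one process."""
    return reduce(reduce_phase, map(map_phase, docs), Counter())


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


docs = [
    "The flood in the city",
    "Flood warning issued in city",
    "Stock market falls",
]
print(term_counts(docs).most_common(3))

# Made-up per-word weights over two topics, as an LDA model might yield.
word_topic_weights = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
labels, _ = kmeans(word_topic_weights, k=2)
print(labels)
```

In a real Map/Reduce deployment the map and reduce functions would run on separate workers over partitions of the corpus. Here they are composed with `map` and `functools.reduce` only to show the data flow.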
Keywords/Search Tags: Distributed technology, MapReduce, LDA model, Clustering, K-means algorithm