
Design And Implementation Of Public Sentiment Analysis System

Posted on: 2014-11-14
Degree: Master
Type: Thesis
Country: China
Candidate: R Wang
Full Text: PDF
GTID: 2298330467463604
Subject: Computer Science and Technology
Abstract/Summary:
With the spread of the Internet and the advent of big data, online media has become the main channel through which people obtain information, and the growing influence of online comments has become a significant problem. Public sentiment spreads quickly and widely on the network, especially during social emergencies, which often attract the attention of political forces and various social groups; if not controlled in time, it can cause serious consequences. Because online public sentiment is characterized by openness, freedom, and passivity, domestic regulators have lacked effective means of supervision, and the impact of public sentiment events has reached government departments at all levels, such as the army and provincial departments. Given the demand for real-time processing of massive data, manual classification of public opinion clearly cannot meet current needs. How to automatically, quickly, and accurately find sensitive topics in a huge flow of information, and to track the development of public opinion, is the basic requirement of an automatic public sentiment analysis system. Based on the actual needs of a project, this work implements a public sentiment analysis system that takes BBS and news websites as its main data sources. The main work and innovations are as follows:

Firstly, this paper introduces the technologies related to the public sentiment analysis system and its submodules from the perspectives of both theory and practice, including crawler technology and Natural Language Processing technology.

Secondly, this paper presents the overall architecture and the system modules. To address problems in the existing system, we present a text feature selection method that determines the threshold dynamically from the text data instead of assigning a fixed threshold value. As another innovation, the system uses topic model technology to extract the main information from sentiment text, reducing the administrator's workload.

Finally, this paper adopts a distributed storage scheme for BBS, WeiBo, and news data and for some intermediate calculation results. MongoDB, a distributed, document-oriented non-relational database that offers good flexibility and scalability, is well suited to a text analysis system.
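The abstract does not say how the feature selection threshold is derived from the data, so the following is only a minimal sketch of the general idea, not the thesis's method. It assumes terms are scored with the chi-square statistic and that the cutoff is set to the mean of the scores plus `k` standard deviations. The function names, the parameter `k`, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def chi_square_scores(docs, labels):
    """Return {term: max chi-square over classes} for tokenized documents."""
    n = len(docs)
    class_count = Counter(labels)
    df = Counter()                      # number of documents containing each term
    df_in_class = defaultdict(Counter)  # class -> per-term document counts
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            df[term] += 1
            df_in_class[label][term] += 1

    scores = {}
    for term, term_df in df.items():
        best = 0.0
        for cls, n_cls in class_count.items():
            a = df_in_class[cls][term]  # in class, contains term
            b = term_df - a             # other classes, contains term
            c = n_cls - a               # in class, lacks term
            d = n - n_cls - b           # other classes, lacks term
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:                   # skip degenerate contingency tables
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_features(scores, k=0.0):
    """Keep terms scoring above mean + k * std (assumed rule, data-dependent)."""
    values = list(scores.values())
    threshold = mean(values) + k * pstdev(values)
    selected = {term for term, score in scores.items() if score > threshold}
    return selected, threshold


# Toy corpus (hypothetical): two classes, "city" appears in every document.
docs = [
    ["flood", "rescue", "city"],
    ["flood", "damage", "city"],
    ["match", "goal", "city"],
    ["match", "team", "city"],
]
labels = ["disaster", "disaster", "sports", "sports"]

scores = chi_square_scores(docs, labels)
selected, threshold = select_features(scores)
print(sorted(selected))  # ['flood', 'match']
```

Because the cutoff is computed from the score distribution of the corpus at hand, it moves as the data changes, which is the property the abstract contrasts with a fixed threshold value.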
Keywords/Search Tags: public sentiment analysis, feature selection, topic model, big data