Research And Application Of Network Public Opinion Analysis System Based On Text Clustering Technology

Posted on: 2017-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Li
Full Text: PDF
GTID: 2358330482497639
Subject: Computer science and technology
Abstract/Summary:
The rapid growth of internet technology in recent years has led the public to use online platforms to express opinions, attitudes and feelings about social events, drawn by the platforms' equality, openness and anonymity. Such expression may guide social events in a positive direction or exert a negative influence. Effectively mining online public opinion information is therefore of practical significance for understanding the popular will, guiding public opinion and maintaining social stability.

Online public opinion analysis techniques are data mining techniques oriented toward natural language. Essentially, they involve extracting patterns and meaningful information from a large quantity of unstructured data. Compared with traditional data mining, they focus on text processing, including how to transform natural language into a machine-readable representation, how to carry out semantic analysis of text data and how to process text data efficiently.

This paper mainly explores the practical application of text data mining technology in public opinion analysis. Based on a study of text mining methods and earlier theory, a complete analysis process for text data was proposed to realize the data mining of natural language information. In addition, an overall structure for public opinion analysis was established, and the system functions and database structure were designed to realize the analysis and utilization of online public opinion information.

Text mining, consisting of text structuring and text clustering, mainly involves text segmentation, text representation, feature selection and similarity comparison. In this study, ICTCLAS (the Chinese Lexical Analysis System of the Institute of Computing Technology, Chinese Academy of Sciences) was adopted for segmentation of the corpus, the vector space model (VSM) was selected for representation and TF-IDF was used for feature selection on the segmented data.
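As an illustration of the representation step, the following is a minimal sketch of TF-IDF weighting in a vector space model. It assumes the documents have already been segmented into tokens (the segmenter itself is not called here), and it uses one common TF-IDF variant, relative term frequency times log(N/df); the thesis may use a different formula. The function name `tfidf_vectors` and the `top_k` parameter are hypothetical, introduced only for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs, top_k=None):
    """Build TF-IDF vectors from pre-segmented documents.

    docs  : list of token lists (assumed output of a word segmenter).
    top_k : optional number of features to keep (hypothetical parameter).
    Returns (vocabulary, vectors), where vectors[i][j] is the weight of
    vocabulary[j] in document i.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}

    # Per-document weights: relative term frequency times IDF.
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({t: (c / total) * idf[t] for t, c in counts.items()})

    # Feature selection (assumed rule): rank terms by their highest
    # weight in any document, breaking ties alphabetically.
    best = Counter()
    for doc_weights in weights:
        for term, value in doc_weights.items():
            best[term] = max(best[term], value)
    ranked = sorted(best, key=lambda t: (-best[t], t))
    vocabulary = ranked if top_k is None else ranked[:top_k]

    vectors = [[w.get(t, 0.0) for t in vocabulary] for w in weights]
    return vocabulary, vectors
```

A term that occurs in every document receives an IDF of zero under this variant, so it drops to the bottom of the ranking and is the first to be discarded when `top_k` is set.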
The cosine of the angle between vectors was used to calculate text similarity and K-means was used for text clustering; problems with K-means were analyzed, and improvement measures were explored and verified.

The overall structure of the public opinion analysis system was designed. The system is composed of three subsystems: a data gathering subsystem, a data analysis subsystem and a display subsystem. Its databases include a basic database, an analytical database and a display database. Testing indicates that the system can effectively carry out semantic analysis and data mining on text datasets, meeting the design requirements for online public opinion information analysis.
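The similarity and clustering steps named in the abstract can be sketched as follows. This is a plain K-means loop that assigns each document vector to the centroid with the highest cosine similarity; it does not reproduce the improvements to K-means studied in the thesis. The function names, the `max_iter` parameter and the choice of the first `k` vectors as initial centroids are assumptions made to keep the example deterministic.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def kmeans_cosine(vectors, k, max_iter=100):
    """Cluster vectors with K-means, using cosine similarity for assignment.

    Assumes 1 <= k <= len(vectors). Initial centroids are simply the
    first k vectors (an assumption; real systems usually pick them
    randomly or with a seeding heuristic).
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [-1] * len(vectors)

    for _ in range(max_iter):
        # Assignment step: most similar centroid for each vector.
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels

        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]

    return labels, centroids
```

Because the result depends on the initial centroids and on the choice of `k`, which are well-known weaknesses of K-means, a sketch like this is only a baseline for comparison.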
Keywords/Search Tags:online public opinion, text mining, feature selection, text clustering