The Research On Public Opinion Mining And Group Behavior Analysis On The Internet

Posted on: 2011-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Liang
GTID: 2178330332979205
Subject: International Trade
Abstract/Summary:
With the popularization of the Internet and the growth in the number of Internet users, Internet public opinion has gradually penetrated every aspect of society, the economy and politics. Internet virtual groups have become a force that cannot be ignored in driving the development of Internet public opinion. Public opinion is a social group's subjective reflection on a specific social phenomenon and reality in a specific area during a specific period. As an effective method of public opinion topic detection, Internet public opinion mining technology, together with virtual group monitoring technology, has become a research hotspot. However, existing Internet public opinion mining technology has many problems when dealing with large amounts of web information, and research on monitoring the behavior of virtual groups is still immature. A breakthrough in the theory and methodology of Internet public opinion mining and virtual group behavior monitoring is therefore urgently needed.

The paper analyses the process of Internet public opinion mining using web information mining technology and optimizes traditional web crawling and preprocessing technology according to how Internet public opinion is produced and develops. It also optimizes the traditional text clustering algorithm according to the characteristics and requirements of web topic detection. Based on social network analysis, the paper analyses the structure and behavior rules of virtual groups and summarizes the structure and behavior of two typical virtual groups, blog rings and forums, with an example that analyses their topology and centrality. Lastly, based on the above research, the paper designs the architecture and functions of a prototype Internet public opinion monitoring system.
The main research focuses on the following aspects:

Web crawling and information preprocessing technology: In the web crawling phase, because Internet public opinion is updated instantly and spreads quickly, the paper designs a concurrent, incremental web crawler that meets the monitoring system's need to collect data from different kinds of web pages while addressing the efficiency problem of large-scale web crawling. In the information preprocessing phase, the paper adopts different web noise reduction methods for the different structures of news pages, blog pages and BBS pages. It uses HTML Parser to extract the text of news and blog pages and designs a method for collecting structured BBS information based on the DOM tree and templates. The result is purified text documents for text clustering.

Internet public opinion mining algorithm, i.e., text clustering algorithm: The paper optimizes the traditional TF-IDF formula for feature extraction from the dynamic text stream produced by web information. Considering the effect of new Internet vocabulary on feature extraction, it assigns an appropriate weight to new vocabulary to improve the quality of the incremental TF-IDF model. In the text clustering phase, the paper adds a "time window" concept to text similarity analysis, which greatly improves the efficiency of the incremental Single-pass clustering algorithm and reduces its memory consumption.

Internet virtual group behavior research based on social network analysis: Using social network analysis, the paper analyses the structure and monitors the behavior of virtual groups whose members share information on a specific web topic. It also carries out topology analysis, centrality analysis and community analysis on Internet groups and virtual organizations.
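As a rough illustration of the centrality analysis mentioned above, the sketch below computes normalized degree centrality for a small undirected reply network. The abstract does not say which centrality measures the thesis uses, so degree centrality is an assumption here, and the function name `degree_centrality`, the user names and the reply pairs are all hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph given as (u, v) pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-replies
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # A node's degree divided by the maximum possible degree (n - 1).
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical forum reply relations: (replier, original poster).
replies = [("ann", "bob"), ("cat", "bob"), ("dan", "bob"), ("ann", "cat")]
centrality = degree_centrality(replies)
most_central = max(centrality, key=centrality.get)
```

In this toy network `bob` is connected to every other member, so he has the highest score; in a forum setting such a node would be a candidate opinion leader for the topic.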
The paper draws the graph of the Internet public opinion group network with social network visualization tools and directly presents the evolution rules of virtual group behavior to users.

Based on the above research, the paper designs the architecture, function modules and workflow of the prototype Internet public opinion monitoring system, which lays the basis for subsequent system development and application.
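The "time window" idea in the incremental Single-pass clustering step listed among the main research aspects can be sketched as follows. This is only one plausible reading, not the thesis's algorithm: it assumes that a new document is compared only with clusters updated within the last `window` time units, and that older clusters are retired from the active list, which is where the time and memory savings would come from. The names `single_pass`, `threshold` and `window`, the term weights and the sample documents are hypothetical; in practice the weights would come from the incremental TF-IDF model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def single_pass(docs, threshold=0.4, window=3):
    """Cluster (timestamp, vector) pairs, given in time order, in one pass."""
    active, closed = [], []
    for index, (ts, vec) in enumerate(docs):
        # Retire clusters that have not been updated within the time window.
        still_active = []
        for c in active:
            (still_active if ts - c["last"] <= window else closed).append(c)
        active = still_active

        # Find the most similar active cluster.
        best, best_sim = None, 0.0
        for c in active:
            sim = cosine(vec, c["centroid"])
            if sim > best_sim:
                best, best_sim = c, sim

        if best is not None and best_sim >= threshold:
            # Merge the document into the cluster (centroid kept as a sum vector).
            for t, w in vec.items():
                best["centroid"][t] = best["centroid"].get(t, 0.0) + w
            best["members"].append(index)
            best["last"] = ts
        else:
            active.append({"centroid": dict(vec), "members": [index], "last": ts})
    return closed + active


# Hypothetical documents: (timestamp, term weights).
docs = [
    (0, {"quake": 1.0, "rescue": 1.0}),
    (1, {"quake": 1.0, "aid": 1.0}),
    (2, {"price": 1.0, "housing": 1.0}),
    (10, {"quake": 1.0, "rescue": 1.0}),
]
clusters = single_pass(docs)
members = [c["members"] for c in clusters]
```

With these inputs the first two documents share a cluster, while the last document starts a new one even though its terms match the first, because the earlier cluster has already fallen outside the window. Whether such a late document should reopen an old topic is a design choice the abstract does not settle.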
Keywords/Search Tags: Internet public opinion, Web mining, social network analysis, prototype system