
Classification And Recognition Of Network Pseudo-public Opinion Based On Machine Learning

Posted on: 2021-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Xiao
Full Text: PDF
GTID: 2428330614454487
Subject: Applied statistics
Abstract/Summary:
Today, people increasingly rely on the mobile internet and smartphones, and their responses to online information have become faster and more direct. The network has become the main carrier of public opinion, giving rise to network public opinion. However, everything has two sides: network pseudo-public opinion has also appeared, and it disrupts people's lives, business operations, and government policy decisions. To reduce its harmful influence, improve the experience of Internet users, and optimize the network environment, we need not only to identify pseudo-public opinion quickly within a complicated network public opinion environment but also to classify it, so that the relevant departments can "counter with proper measures". This paper takes the propagation data of popular Internet events as its research object and uses machine learning classification and clustering models to achieve these objectives. The specific research contents are as follows:

(1) Introduce the background and significance of research on the classification and identification of network pseudo-public opinion, review the current state of academic research on online pseudo-public opinion, and summarize its shortcomings, including indicators that are not timely, reliance on a single model, and unclear application effects. The research direction of this paper is then put forward.

(2) Collect real-time propagation trend data for network public opinion events by time period and channel, that is, count the number of transmissions by high-impact users in each channel in each hour, and then draw the propagation trend graphs of real public opinion events and pseudo-public opinion events respectively. The first type of indicators is obtained by selecting or quantifying the original data according to the differences between the two kinds of events. It includes the effective propagation time, the number of hot discussions, the amount of propagation per unit time, the proportion of sub-channels, and the degree of dispersion of communication channels. The second type of indicators describes information source characteristics, including source channels and the user influence index. Two machine learning models, Logistic Regression and SVM, are then selected as candidate models for identifying network pseudo-public opinion, and the better model is determined through model training and a comparison of their results.

(3) After network pseudo-public opinion has been identified efficiently, classify the pseudo-public opinion events. To subdivide network pseudo-public opinion, a second index system describing the characteristics of public opinion events is constructed. It mainly contains macro indicators such as the influence index, duration, public opinion field ratio, peak propagation speed, and the number of propagation channels. Based on the new data, cluster analysis yields three categories. The clustering results are combined with actual communication performance to summarize the characteristics of each category, and targeted suggestions are finally put forward for precise governance by the relevant departments.

Through quantitative analysis of network public opinion data, the thesis provides a reference for the relevant departments in improving the prevention and control of network pseudo-public opinion, and it has both guiding and practical significance.
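The first type of indicators in part (2) is computed from hourly, per-channel transmission counts. The abstract does not give the formulas, so the following is only a minimal sketch under stated assumptions: effective propagation time is taken as the number of hours with any transmission, a "hot discussion" as an hour whose total reaches a hypothetical `hot_threshold`, and channel dispersion as the normalized Shannon entropy of the channel shares. The function name, the threshold, and the channel names are illustrative and are not taken from the thesis.

```python
from collections import defaultdict
from math import log


def propagation_indicators(hourly_counts, hot_threshold=100):
    """Compute illustrative propagation indicators for one event.

    hourly_counts: list with one dict per hour, mapping a channel name
    to the number of transmissions by high-impact users in that hour.
    All definitions below are assumptions, not the thesis's formulas.
    """
    totals = [sum(hour.values()) for hour in hourly_counts]
    grand_total = sum(totals)

    # Assumed: effective propagation time = hours with any transmission.
    effective_hours = sum(1 for t in totals if t > 0)

    # Assumed: a hot discussion = an hour whose total reaches the threshold.
    hot_hours = sum(1 for t in totals if t >= hot_threshold)

    # Amount of propagation per unit (effective) time.
    rate = grand_total / effective_hours if effective_hours else 0.0

    # Proportion of each sub-channel in the total propagation.
    per_channel = defaultdict(int)
    for hour in hourly_counts:
        for channel, count in hour.items():
            per_channel[channel] += count
    shares = {}
    if grand_total:
        shares = {ch: n / grand_total for ch, n in per_channel.items()}

    # Assumed: dispersion = Shannon entropy of the shares, scaled to [0, 1].
    entropy = -sum(p * log(p) for p in shares.values() if p > 0)
    dispersion = entropy / log(len(shares)) if len(shares) > 1 else 0.0

    return {
        "effective_hours": effective_hours,
        "hot_hours": hot_hours,
        "rate": rate,
        "shares": shares,
        "dispersion": dispersion,
    }


if __name__ == "__main__":
    # Hypothetical three-hour event spread over two channels.
    event = [
        {"weibo": 30, "news": 10},
        {"weibo": 0, "news": 0},
        {"weibo": 50, "news": 10},
    ]
    print(propagation_indicators(event, hot_threshold=50))
```

A dispersion near 0 means propagation is concentrated in one channel, and a value near 1 means it is spread evenly. Feature vectors built this way, together with the source-characteristic indicators, could then be passed to the Logistic Regression and SVM candidates for comparison.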
Keywords/Search Tags:network pseudo-public opinion, machine learning, public opinion control, public opinion index, cluster analysis