With the rise of big data, the explosive growth of information has made online public opinion more complex and poses new challenges for public opinion governance. Topic state is an important dimension of topic evolution analysis. From germination to decline, a topic passes through a series of distinct states. Correctly identifying the current state of a topic and predicting its development trend is of great significance for governments and public opinion supervision departments that need to make scientifically grounded early-warning decisions. This paper took topic state as its main research object, incorporated netizen sentiment into the topic state identification indicators, identified the state of topics that are still evolving, and described the trend of the topic state more accurately by predicting multiple topic indicators. First, the paper introduced a method for dividing a topic into state phases and constructed topic lifecycle state indicators. Qualitative and quantitative methods were combined to divide the topic lifecycle into four stages: germination, growth, maturity, and decline. Characteristic variables highly correlated with the topic state were explored, and three-dimensional state indicators were constructed, comprising novelty, attention, and sentiment. Second, topic state identification and trend prediction methods based on GMM-HMM were proposed. Drawing on the principles and strengths of the hidden Markov model (HMM) and the Gaussian mixture model (GMM), the topic lifecycle state was taken as the hidden state of the HMM, the topic state indicators were taken as the observed variables of the HMM, and the probability distribution of the observed variables under each topic state was modeled by a GMM. On this basis, a Gaussian mixture hidden Markov model was constructed and trained to identify the state and predict the trend of an evolving topic. Finally, experiments on topic state identification and trend prediction were conducted. Taking microblog topics as an example, the key data of each topic were extracted and preprocessed, and the key time nodes of each lifecycle stage were obtained with the topic state division method. The proposed identification and prediction methods were then applied and compared with other methods. In addition, experiments were run on the state stages produced by different division methods in order to compare the validity of those division methods. The results show that, when topic states were divided using the Gompertz curve, the F1 score and accuracy of topic state identification with GMM-HMM were both higher than 87%, and the MAPE of trend prediction was lower than 3.5%, showing clear advantages over the Gaussian HMM and the BP neural network.
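The modeling idea summarized above (lifecycle stages as hidden states, the three indicators as observations, and a GMM as the emission distribution of each state) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration only: the parameter values are invented rather than trained, the left-to-right transition structure and the diagonal covariances are simplifications, and Viterbi decoding is used as one plausible way to read off the most likely state sequence. It is not the paper's implementation.

```python
import math

STATES = ["germination", "growth", "maturity", "decline"]


def safe_log(p):
    """Log of a probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, components):
    """Log density of a Gaussian mixture; components are (weight, mean, var)."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def viterbi(observations, log_start, log_trans, emissions):
    """Most likely hidden-state index sequence under GMM emissions."""
    n = len(log_start)
    score = [log_start[s] + log_gmm(observations[0], emissions[s]) for s in range(n)]
    back = []
    for x in observations[1:]:
        prev, score, pointers = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_gmm(x, emissions[s]))
        back.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):  # follow back-pointers from the last step
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical, hand-set parameters (NOT taken from the paper).
# Each observation is (novelty, attention, sentiment), scaled to [0, 1].
VAR = [0.01, 0.01, 0.01]
EMISSIONS = [
    [(0.6, [0.9, 0.1, 0.5], VAR), (0.4, [0.8, 0.2, 0.5], VAR)],  # germination
    [(0.5, [0.6, 0.5, 0.6], VAR), (0.5, [0.5, 0.6, 0.6], VAR)],  # growth
    [(0.5, [0.3, 0.9, 0.5], VAR), (0.5, [0.3, 0.8, 0.5], VAR)],  # maturity
    [(0.5, [0.1, 0.2, 0.4], VAR), (0.5, [0.1, 0.1, 0.3], VAR)],  # decline
]
LOG_START = [safe_log(p) for p in [0.7, 0.1, 0.1, 0.1]]
# Assumed left-to-right structure: a topic either stays or moves one stage on.
LOG_TRANS = [
    [safe_log(p) for p in row]
    for row in [
        [0.7, 0.3, 0.0, 0.0],
        [0.0, 0.7, 0.3, 0.0],
        [0.0, 0.0, 0.7, 0.3],
        [0.0, 0.0, 0.0, 1.0],
    ]
]
OBSERVATIONS = [
    [0.88, 0.12, 0.5],
    [0.85, 0.18, 0.5],
    [0.60, 0.50, 0.6],
    [0.58, 0.55, 0.6],
    [0.30, 0.90, 0.5],
    [0.12, 0.20, 0.4],
]

path = viterbi(OBSERVATIONS, LOG_START, LOG_TRANS, EMISSIONS)
print([STATES[s] for s in path])
```

In a trained model, the start probabilities, transition matrix, and mixture parameters would be estimated from labeled or unlabeled topic sequences instead of being set by hand, and the last decoded state would be taken as the current state of the evolving topic.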