| Hot words analysis is an important natural language processing technology. This thesis aims to mine hot words from hot events and to analyze the emotional trend of those who spread them. The work can be applied to topic monitoring, word-of-mouth analysis, public opinion analysis, and other fields. Existing hot words extraction methods suffer from ambiguity, inconsistent standards, and very large text corpora, which lead to incomplete selection of hot word candidates and slow selection; in addition, existing evaluation methods are constrained by current and historical word frequency, so hot words are not clearly distinguished. To address these problems, this thesis carries out hot words extraction and emotion analysis for Weibo hot topics. The main work includes: (1) A word extraction method based on a cohesion attention mechanism (CAM) is proposed to solve the problem of incomplete selection of hot word candidates and to achieve fast and accurate word extraction. The character is taken as the basic semantic unit of the BERT pre-trained model, and the strong representation ability of BERT-encoded character vectors is combined with a rich corpus to obtain character vector sequences. Then, based on attention and mutual information, the cohesion attention mechanism splices the character vectors into word vectors, so that the words in the microblog text are extracted as the candidate set for hot words extraction. (2) A hot words extraction method based on a dynamic co-word network (D-Co-net) and CAM is proposed, which solves the problem of unclear hot words recognition and achieves accurate hot words extraction. This thesis analyzes the temporal and spatial characteristics of hot words propagation, introduces a quantitative word popularity index, and obtains the out-of-circle coefficient and the mutation coefficient by building a co-word network, in order to remove low-frequency words, merge synonyms, and extract hot words propagation
characteristics. Combining the mutation coefficient and the out-of-circle coefficient, the heat of each word is calculated with the word heat formula, and the words with high heat values are selected from the candidate set as hot words. (3) A method for analyzing the emotional trend of hot words based on a bidirectional gated recurrent unit and a three-layer feedforward neural network (BiGRU-3Net) is proposed. Through bidirectional encoding it captures the contextual semantic information and emotional features of microblog text in depth, solves the problem of their shallow representation, and achieves accurate analysis of the emotional trend of hot words. A simple text emotional polarity classification model is formed by combining a bidirectional recurrent neural network with a three-layer feedforward neural network to judge emotional orientation (negative, neutral, positive). The time factor is then taken into account: the text data are segmented into time slices, and the emotional polarity of the text in each time period is judged with the BiGRU-3Net model to show the emotional trend of hot words in full. The experimental results show that the proposed hot words extraction and emotion analysis methods for microblog hot topics improve the accuracy of hot words extraction, comprehensively analyze how the emotional trend changes as hot words spread, effectively capture the hot words of hot topics and their emotional trend changes, and can provide comprehensive and accurate monitoring for numerous hot topics. Figure [22] Table [14] Reference [80]... |
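The abstract says that CAM uses mutual information, together with attention, to decide which adjacent units should be joined into a word. The sketch below illustrates only the mutual-information idea, using plain corpus statistics (pointwise mutual information between neighbouring characters). It is not the thesis's CAM: there is no BERT encoder and no attention, and the function names `pmi_scores` and `segment` and the `threshold` parameter are assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_scores(texts):
    """Return pointwise mutual information for every adjacent character pair.

    Illustrative stand-in for a cohesion score; not the thesis's CAM.
    """
    unigrams = Counter()
    bigrams = Counter()
    for text in texts:
        unigrams.update(text)                 # count single characters
        bigrams.update(zip(text, text[1:]))   # count adjacent pairs
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return scores


def segment(text, scores, threshold):
    """Join neighbouring characters whose PMI is at least `threshold`.

    `threshold` is a hypothetical tuning parameter; unseen pairs are never joined.
    """
    if not text:
        return []
    words = [text[0]]
    for a, b in zip(text, text[1:]):
        if scores.get((a, b), float("-inf")) >= threshold:
            words[-1] += b      # cohesive pair: extend the current word
        else:
            words.append(b)     # weak pair: start a new word
    return words
```

Calling `pmi_scores` on a list of microblog strings and then `segment` on a new string yields a candidate word list. Pairs that co-occur more often than their individual frequencies predict are merged, which is the intuition behind using cohesion to build candidate words.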
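The time-slice step in contribution (3) can be sketched as follows. The example assumes each post has already been assigned one of the three polarity labels (in the thesis this is the role of BiGRU-3Net; here the labels are simply given as input). The function name `sentiment_trend` and its parameters are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

LABELS = ("negative", "neutral", "positive")


def sentiment_trend(posts, start, slice_length):
    """Group (timestamp, label) pairs into time slices and return label shares.

    posts        -- iterable of (datetime, label) pairs, label in LABELS
    start        -- datetime at which the first slice begins
    slice_length -- timedelta giving the width of each slice
    Returns a time-ordered list of (slice_start, {label: share}); slices
    that contain no posts are skipped.
    """
    counts = defaultdict(Counter)
    for timestamp, label in posts:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        if timestamp < start:
            continue  # ignore posts before the observation window
        index = (timestamp - start) // slice_length  # integer slice number
        counts[index][label] += 1
    trend = []
    for index in sorted(counts):
        total = sum(counts[index].values())
        shares = {label: counts[index][label] / total for label in LABELS}
        trend.append((start + index * slice_length, shares))
    return trend
```

Plotting the three shares against `slice_start` gives one simple view of how the emotional trend around a hot word changes over time. Using shares rather than raw counts keeps slices with different post volumes comparable.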