| With the development and popularization of the Internet, it has increasingly become a public platform where people express their opinions and exercise supervision through public opinion. Microblogging on Weibo, a very large social platform for online information exchange in China, has become a focal point for the expression of public opinion online. The large volume of comment data contains the public's emotions and subjective views toward the objects being commented on. This paper takes the comment data of hot topics on Weibo as its research object. It applies a sentiment tendency analysis method based on a sentiment dictionary for online comments, combined with sentiment visualization, to analyze the distribution of sentiment tendency in online comment data and to explore the public's sentiment tendency. The specific research contents are as follows: (1) Design a crawler program for online comment data. Weibo identity authentication is handled by simulating login, and the comment data are filtered and extracted with a keyword matching function. A web crawler, BeautifulSoup, regular expressions, and related techniques are combined to collect user information and comment information, providing data support for the subsequent research. (2) Build a sentiment dictionary suited to the comment data. The Internet neologisms provided by the Baidu and Sogou input methods are selected as candidate words for a dictionary of Internet buzzwords. The candidate words are screened against a microblog corpus to obtain the buzzwords, and their sentiment polarity is determined by the pointwise mutual information (PMI) algorithm. The resulting buzzword dictionary is then integrated with existing open-source dictionary resources to form the comment sentiment dictionary. (3) Expand the comment sentiment dictionary used to calculate sentiment orientation. Three expansion methods are used, based on a thesaurus, emoticons, and syntactic rules, to improve the coverage of the sentiment dictionary. In the experiments, the sentiment dictionaries before and after expansion are both used to calculate sentiment tendency. (4) Implement sentiment analysis and visualization of online comments. The comment data are preprocessed by word segmentation and stop-word removal, and positive and negative sentiment words are identified. The weights of modifiers such as degree adverbs, negation words, and emoticons are then taken into account to obtain a sentiment tendency value for the whole comment, from which its sentiment tendency is determined. Statistical and data visualization methods are used to study comment word clouds, regional heat, and age groups, and to analyze the characteristics of public sentiment. |
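
The PMI-based polarity step in (2) and the weighted scoring step in (4) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names (`so_pmi`, `score_comment`, `classify`), the toy lexicons, the seed words, and all weights are hypothetical. Tokens are assumed to be already segmented and stop-word filtered. Treating a zero co-occurrence count as a PMI of 0 and letting modifiers multiply the next sentiment word are simplifying assumptions, not rules taken from the source.

```python
import math

def so_pmi(word, pos_seeds, neg_seeds, docs):
    """Semantic orientation of `word`: summed PMI with positive seeds
    minus summed PMI with negative seeds (one common variant, assumed here)."""
    doc_sets = [set(d) for d in docs]
    n = len(doc_sets)

    def df(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    def pmi(a, b):
        joint = df(a, b)
        if joint == 0:
            return 0.0  # assumption: no co-occurrence contributes nothing
        return math.log2(joint * n / (df(a) * df(b)))

    return (sum(pmi(word, s) for s in pos_seeds)
            - sum(pmi(word, s) for s in neg_seeds))

# Hypothetical toy resources; a real dictionary would be far larger.
LEXICON = {"good": 1.0, "love": 1.0, "bad": -1.0, "awful": -1.0}
DEGREE = {"very": 2.0, "slightly": 0.5}
NEGATION = {"not", "never"}
EMOTICON = {"[smile]": 1.0, "[angry]": -1.0}

def score_comment(tokens):
    """Sum sentiment-word scores, each scaled by the degree adverbs and
    negation words seen since the previous sentiment word."""
    score = 0.0
    weight = 1.0  # pending modifier weight for the next sentiment word
    for tok in tokens:
        if tok in NEGATION:
            weight *= -1.0
        elif tok in DEGREE:
            weight *= DEGREE[tok]
        elif tok in LEXICON:
            score += weight * LEXICON[tok]
            weight = 1.0  # modifiers are consumed by this sentiment word
        elif tok in EMOTICON:
            score += EMOTICON[tok]  # emoticons add their own score directly
    return score

def classify(score):
    """Map a numeric tendency value to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

if __name__ == "__main__":
    corpus = [["yyds", "good"], ["yyds", "good"], ["meh", "bad"], ["ok"]]
    print(so_pmi("yyds", ["good"], ["bad"], corpus))
    print(classify(score_comment(["not", "very", "good"])))
```

In this sketch a candidate buzzword with a positive `so_pmi` value would be added to the positive side of the dictionary and one with a negative value to the negative side. A threshold around zero for ambiguous words is a design choice left open here.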