
Research On Chinese Emotional Dictionary Construction Method Based On Micro-blog Emoji

Posted on: 2019-06-14  Degree: Master  Type: Thesis
Country: China  Candidate: Y F Jia  Full Text: PDF
GTID: 2348330569478259  Subject: Software engineering
Abstract/Summary:
Micro-blog is one of the most popular social media platforms, where users publish, exchange, and disseminate information. With the continuous development of the mobile Internet, the number of micro-blog users has risen year by year. The resulting mass of data plays a vital guiding role in the emergence and spread of public opinion events, and provides important data support for public opinion monitoring and text processing. Micro-blog is a product of its time and differs from traditional text. Micro-blog news, user status updates, and user comments are limited to 140 characters, and a post can contain text, images, hyperlinks, and other data formats. When analyzing and processing micro-blogs, these other formats therefore cannot be ignored. Emojis are a new kind of network language widely used on modern social platforms. They appear to varying degrees in micro-blog posts, and some posts consist entirely of consecutive emojis. Emojis often stand in for words as a vivid tool for expressing emotion and carry rich emotional information, so they play a vital role in analyzing the sentiment orientation of micro-blogs.

The thesis proposes an emoji-based method for analyzing the sentiment orientation of Chinese micro-blog text. Micro-blog data were collected through the Sina micro-blog public API, and the texts were preprocessed. Seed emojis were selected as conceptual features and divided into five emotion categories: happiness, affection, anger, sadness, and disgust. By calculating the mutual information between the seed emojis and a large number of micro-blog texts, the micro-blog texts are classified both by sentiment polarity (positive or negative) and by emotion. On the basis of the annotated corpus, the extracted emotion words are annotated and added to an existing sentiment dictionary. After screening, integration, and the addition of a large number of modern web words, a new emotional dictionary is generated. The dictionary contains web emotion words, traditional emotion words, and common micro-blog vocabulary. It is intended to provide corpus support for research on micro-blogs and other social network texts.

The thesis takes the construction of a text sentiment lexicon as its goal. It categorizes emotion words by annotating micro-blog text, using the mutual information between the emotion words and the micro-blog texts as the classification criterion. The emotion words are labeled with the five emotion categories (happiness, affection, anger, sadness, and disgust), so that the emotional dictionary is constructed automatically. In the process of building the dictionary, both sentiment classification and emotion classification of micro-blog text are realized. A series of comparative experiments shows that the sentiment classification method improves classification accuracy. For automatic dictionary construction, the precision, recall, and F value of the sentiment dictionary exceed 80% for all five emotions. For text sentiment and emotion classification, the lexicon was compared with the sentiment vocabulary of Dalian University of Technology, HowNet, and other general-purpose sentiment dictionaries. The experimental results show that the sentiment lexicon generated by this method performs well in evaluation and covers micro-blog data well.
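
The mutual-information step described above can be illustrated with a minimal sketch. The abstract does not give the exact formula or the seed emoji list, so this example assumes pointwise mutual information computed over post-level co-occurrence, and the seed emojis, the function name `label_words_by_pmi`, and the toy corpus are all hypothetical. It is not the thesis's implementation.

```python
import math
from collections import Counter

# Hypothetical seed emojis mapped to the five categories named in the abstract.
SEED_EMOJIS = {
    "[哈哈]": "happiness",
    "[爱你]": "affection",
    "[怒]": "anger",
    "[泪]": "sadness",
    "[吐]": "disgust",
}

def label_words_by_pmi(posts):
    """Assign each word the emotion category with which it has the highest PMI.

    posts: list of (tokens, emojis) pairs, one pair per micro-blog post.
    Returns {word: (category, pmi)}. Assumed formula (post-level counts):
    PMI(w, c) = log2( P(w, c) / (P(w) * P(c)) ).
    """
    n = len(posts)
    word_count = Counter()   # posts containing the word
    cat_count = Counter()    # posts containing a seed emoji of the category
    joint_count = Counter()  # posts containing both
    for tokens, emojis in posts:
        words = set(tokens)
        cats = {SEED_EMOJIS[e] for e in emojis if e in SEED_EMOJIS}
        word_count.update(words)
        cat_count.update(cats)
        joint_count.update((w, c) for w in words for c in cats)

    labels = {}
    for (w, c), joint in joint_count.items():
        pmi = math.log2(joint * n / (word_count[w] * cat_count[c]))
        # Keep the category with the strictly highest PMI seen so far.
        if w not in labels or pmi > labels[w][1]:
            labels[w] = (c, pmi)
    return labels

# Toy corpus (assumed data): already segmented tokens plus the emojis in each post.
posts = [
    (["今天", "开心"], ["[哈哈]"]),
    (["开心", "极了"], ["[哈哈]"]),
    (["今天", "难过"], ["[泪]"]),
    (["难过", "极了"], ["[泪]"]),
]
labels = label_words_by_pmi(posts)
for word, (category, pmi) in labels.items():
    print(word, category, round(pmi, 3))
```

In this sketch, words that co-occur mostly with one category's seed emojis receive a high PMI for that category, while words spread evenly across categories score near zero. A real pipeline would likely add a PMI threshold and a minimum frequency before admitting a word to the dictionary.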
Keywords/Search Tags: emotional dictionary, micro-blog, emoji, sentiment analysis