
Study On Emotion Classification Of Chinese Text Oriented To Micro-blog Reviews

Posted on: 2019-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Gu
GTID: 2348330542498985
Subject: Statistics
Abstract/Summary:
Micro-blog, as an important communication tool, is now used in all aspects of society. Many celebrities have opened micro-blog accounts to communicate with their fans, and a highly popular celebrity has a huge fan base, which produces huge amounts of information in the micro-blog comments, mostly in the form of text. Efficiently mining the emotional information hidden in micro-blog comments is therefore of great value and significance. There is already a large body of research on user sentiment in social media, and it has produced many results. However, most studies are limited to the user reviews themselves and do not consider the content that is being reviewed. This thesis, working from the content of micro-blog reviews, analyzes the impact of the emotional orientation of a celebrity's micro-blog post on the sentiment tendency of the comments on it. The main work is as follows.

First, the comments are segmented into words and tagged, and a subset of the resulting words is used to manually build a basic emotional dictionary, a degree adverb dictionary and a negative word dictionary suited to the analysis of celebrity micro-blog reviews. The basic emotional dictionary consists of 1852 positive emotional words and 1499 negative emotional words. The degree adverb dictionary contains 259 words, and the negative word dictionary contains 125 words. In addition, a set of rules for scoring the emotion of micro-blog reviews is designed. Starting from the first word of the first comment: if a positive (negative) emotional word appears and the word before it is neither a degree adverb nor a negative word, one point is added to (subtracted from) the comment's score. If a positive (negative) emotional word appears and the word before it is a degree adverb, the degree adverb's value is added to (subtracted from) the score. If a positive (negative) emotional word appears, the word before it is a negative word, and the word before that negative word is not a degree adverb, one point is subtracted from (added to) the score. If a positive (negative) emotional word appears, the word before it is a negative word, and the word before that negative word is a degree adverb, the degree adverb's value is subtracted from (added to) the score. The samples are scored using these dictionaries and rules, and the effectiveness of the emotional scoring is verified by calculating the recall rate, accuracy rate and F-measure.

Then, the micro-blog emotion classification result of the dictionary method is used as the supervision label in place of manual annotation. A feature matrix is built with the basic emotional words as features. Each element of the matrix is the count of the n-th feature in the m-th sample, that is, the frequency of the n-th emotional word in the m-th sample. The chi-square (CHI) statistic is used to reduce the dimensionality of the feature matrix: a CHI value is calculated for each feature with respect to each category, the maximum over categories is taken as the feature's final CHI value, and the 1000 features with the highest final CHI values are selected. The samples are classified with the decision tree method, and the recall rate, accuracy rate and F-measure are calculated.

Finally, the emotion of each blogger's post is scored manually. With the dictionary-based micro-blog emotion classification results again used as the supervision label, a feature matrix is built from the blogger's emotion score and the 1000 basic emotional words with the highest final CHI values. The decision tree method is used for classification, and the recall rate, accuracy rate and F-measure are calculated. A comparison of the experimental results shows that classifier performance improves significantly after the bloggers' scores are added, which indicates that the emotional tendency of the blogger has a certain effect on the sentiment tendency of micro-blog comments.
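The four scoring rules in the abstract can be sketched in a few lines of Python. This is only an illustrative reading of the rules, not the thesis's own implementation: the function name `score_comment`, the toy English lexicons, and the adverb weight of 2.0 are all hypothetical, and the sketch assumes that the three dictionaries are disjoint and that a comment has already been segmented into a list of words.

```python
# Hypothetical toy lexicons; the thesis uses manually built Chinese
# dictionaries (1852 positive / 1499 negative words, 259 degree adverbs,
# 125 negative words) that are not reproduced here.
POLARITY = {"good": 1, "bad": -1}   # +1 for positive, -1 for negative words
DEGREE = {"very": 2.0}              # degree adverb -> assumed weight
NEGATIONS = {"not"}                 # negative (negation) words


def score_comment(tokens, polarity=POLARITY, degree=DEGREE, negations=NEGATIONS):
    """Score one segmented comment with the four rules from the abstract."""
    score = 0.0
    for i, word in enumerate(tokens):
        sign = polarity.get(word)
        if sign is None:
            continue  # not an emotional word, nothing to score
        prev = tokens[i - 1] if i >= 1 else None
        prev2 = tokens[i - 2] if i >= 2 else None
        if prev in negations:
            # Rules 3 and 4: a negation flips the polarity; the weight is the
            # degree adverb's value if one precedes the negation, else 1.
            weight = degree.get(prev2, 1.0)
            score -= sign * weight
        else:
            # Rules 1 and 2: the weight is the degree adverb's value if one
            # directly precedes the emotional word, else 1.
            weight = degree.get(prev, 1.0)
            score += sign * weight
    return score


if __name__ == "__main__":
    for comment in (["good"], ["very", "good"], ["not", "good"], ["very", "not", "good"]):
        print(comment, score_comment(comment))
```

Under this reading, the sign of the final score gives the comment's emotional orientation. How ties at zero and adverbs placed between the negation and the emotional word are handled is not stated in the abstract and is left out of the sketch.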
Keywords/Search Tags:micro-blog, emotional dictionary, machine learning, feature selection, decision tree