Emotion Analysis Of Chinese Microblogs Using A Lexicon-based Approach

Posted on: 2015-08-26
Degree: Master
Type: Thesis
Country: China
Candidate: M H Pan
Full Text: PDF
GTID: 2298330422980968
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, micro-blogs have attracted more and more attention and have become an important platform for people to express personal emotions and feelings. Micro-blogs have therefore become an important resource for opinion mining and sentiment analysis. Automatic analysis of the emotional content of micro-blogs plays an important role in capturing popular feelings and adjusting personal mood. In this paper, a lexicon-based approach is proposed to identify six emotions in micro-blog text: joy, sadness, anger, fear, disgust and surprise.

Firstly, we perform an extensive analysis of current Chinese emotion lexicons to understand their roles in analyzing micro-blog text. The experimental results show that the lexicon is a crucial resource in emotion analysis. The results also reveal limitations of current Chinese emotion lexicons. The characteristics of emotion in micro-blogs are identified and used to build two new emotion lexicons. The emoticon lexicon EmoDic is created using mutual information. The Chinese emotion lexicon SixDic is obtained using a voting strategy in which emotion annotation, mutual information, and part-of-speech tags are considered.

Secondly, by studying the characteristics of current Chinese emotion lexicons and of emotion expression in micro-blogs, a lexicon-based approach to identifying six emotions in micro-blog text is proposed. The experimental results using the two constructed lexicons show that the coverage of SixDic is 65.8% and its accuracy is 64%, which is 12% higher than DUTIR. EmoDic achieves a higher recall than human-picked emoticons. The system achieves its best accuracy (74.1%) and coverage (80.4%) when higher weights are assigned to EmoDic and negation rules are applied.

Finally, we compare the performance of different emotion features (unigrams, the Chinese emotion lexicon, emoticons, negations and punctuation) using an SVM. The results show that emotion lexicons perform better than unigrams in identifying six emotions in micro-blogs. When the four features SixDic, EmoDic, negations and punctuation are combined, the best accuracy is 61.7%.
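
The lexicon-based step described in the abstract (word lexicon plus a more heavily weighted emoticon lexicon, with negation rules) might be sketched roughly as follows. This is an illustrative sketch only, not the thesis implementation: the toy lexicon entries are English stand-ins, the names `WORD_LEXICON`, `EMOTICON_LEXICON`, `NEGATIONS`, `classify` and `emoticon_weight` are hypothetical, and the negation rule used here (ignore an emotion word that directly follows a negation) is an assumption, since the abstract does not spell out the actual rules or weights.

```python
from collections import defaultdict

# Hypothetical toy lexicons; the real SixDic and EmoDic are Chinese
# resources whose contents are not given in the abstract.
WORD_LEXICON = {"happy": "joy", "cry": "sadness", "angry": "anger", "scared": "fear"}
EMOTICON_LEXICON = {"[haha]": "joy", "[tears]": "sadness"}
NEGATIONS = {"not", "never"}


def classify(tokens, emoticon_weight=2.0):
    """Return the highest-scoring emotion, or None if no lexicon entry matches.

    A None result corresponds to a post the lexicons do not cover.
    """
    scores = defaultdict(float)
    negated = False  # True only for the token right after a negation word
    for tok in tokens:
        if tok in NEGATIONS:
            negated = True
            continue
        if tok in EMOTICON_LEXICON:
            # Emoticons count more than words (assumed weight).
            scores[EMOTICON_LEXICON[tok]] += emoticon_weight
        elif tok in WORD_LEXICON and not negated:
            # Assumed negation rule: a negated emotion word is skipped.
            scores[WORD_LEXICON[tok]] += 1.0
        negated = False
    if not scores:
        return None
    return max(scores, key=scores.get)
```

For example, `classify(["happy", "[tears]"])` favours the emoticon's emotion under the default weight, while lowering `emoticon_weight` below 1.0 lets the word lexicon win. Posts that return `None` are the ones that would count against coverage.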
Keywords/Search Tags:micro-blog, emotion analysis, emotion lexicon, emoticon