Emotion Recognition And Classification Research For Chinese Microblog Text

Posted on: 2015-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: W Z Peng
Full Text: PDF
GTID: 2268330428967680
Subject: Computer technology
Abstract/Summary:
As one of the most popular social media platforms, Microblog is characterized by rapid information spread, a large volume of information, and nonstandard word usage. It has become one of the most important platforms for information exchange. Emotion recognition and classification of Microblog text is currently a popular and difficult research topic in natural language processing. Such research has practical value: it helps enterprises understand their users' feedback on products and services, and it supports access to and monitoring of public opinion.

This thesis is a preliminary study of emotion recognition and classification for Chinese Microblog text. Its main contents are as follows:

I. We summarize the structural features of Microblog text and sort into categories the expressions frequently used in Microblog text, as well as those composed of punctuation marks that circulate on the Internet. These expressions form an expression emotion library. We also collect the repeated punctuation marks that often appear in Microblog text and form them into a punctuation emotion library, which can be used in emotion recognition and classification. From these two libraries we construct a Chinese Microblog emotion library.

II. We extract emotion features from the experimental corpus using four methods: word-occurrence frequency statistics, cross entropy, TF-IDF, and variance. The experimental results show that the variance method combined with TF-IDF achieved the best result.

III. For emotion recognition and classification of Microblog text, we first determine whether a text is subjective or objective, identifying subjective sentences with Naive Bayes and Support Vector Machine classifiers. The experimental results show that Naive Bayes performed better in this case. We then apply fine-grained classification to the subjective texts. In the experiment, we compare a 1-v-1 support vector machine with a 1-v-r support vector machine and find that the 1-v-1 method is more effective than the 1-v-r method.

IV. Based on the approaches above, we developed a prototype system and verified the feasibility and effectiveness of the proposed approaches through a series of experiments on openly available data sets.
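The abstract does not say how the variance method and TF-IDF are combined in the feature-extraction step (item II). The following is only an illustrative sketch under one possible reading: each term's average TF-IDF weight is multiplied by the variance of its relative frequency across emotion classes, so that terms spread evenly over all classes score zero. The function name `score_features`, the multiplicative combination, and the toy English tokens are assumptions made for illustration, not the thesis's method. The input is assumed to be already segmented into words.

```python
import math
from collections import Counter, defaultdict


def score_features(docs, labels):
    """Score terms by mean TF-IDF times between-class frequency variance.

    docs   -- list of token lists (assumed already word-segmented)
    labels -- one emotion label per document
    The multiplicative combination is an assumption for illustration.
    """
    # Drop empty documents so the length divisions below are safe.
    pairs = [(d, l) for d, l in zip(docs, labels) if d]
    n_docs = len(pairs)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc, _ in pairs for term in set(doc))

    # Sum of TF-IDF weights of each term over all documents.
    tfidf_sum = defaultdict(float)
    for doc, _ in pairs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            tfidf_sum[term] += tf * idf

    # Token counts per class, used for relative term frequencies.
    class_counts = defaultdict(Counter)
    class_totals = Counter()
    for doc, label in pairs:
        class_counts[label].update(doc)
        class_totals[label] += len(doc)
    classes = sorted(class_totals)

    scores = {}
    for term in df:
        freqs = [class_counts[c][term] / class_totals[c] for c in classes]
        mean = sum(freqs) / len(freqs)
        variance = sum((f - mean) ** 2 for f in freqs) / len(freqs)
        scores[term] = (tfidf_sum[term] / n_docs) * variance
    return scores


# Toy corpus (hypothetical data, English tokens for readability).
docs = [["happy", "today"], ["happy", "great"],
        ["sad", "today"], ["sad", "awful"]]
labels = ["joy", "joy", "sadness", "sadness"]
scores = score_features(docs, labels)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy corpus, "today" occurs equally often in both classes and receives a score of zero, while "happy" and "sad", each concentrated in one class, rank highest. Terms with the highest scores would then be kept as features for the classifiers described in item III.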
Keywords/Search Tags: Microblog content analysis, Subjective sentence recognition, Emotion classification