
Research On Emotional Information Extraction Method For Chinese Microblogging

Posted on: 2016-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Li
Full Text: PDF
GTID: 2208330452470734
Subject: Computer application technology
Abstract/Summary:
The Internet has spread widely and has become the main means of obtaining and sharing information. As a new interactive platform, micro-blogging has become part of daily life for many people, and studies of Chinese micro-blogs have attracted much attention. Micro-blog sentiment analysis is an important topic among these studies. Emotion information extraction, the basic task of micro-blog sentiment analysis, is attracting a growing number of researchers and is gradually becoming an active research area.

The purpose of Chinese micro-blog emotion information extraction is to transform unstructured micro-blog text into structured emotion information units. These units can be used directly to analyze user comments, or as supplementary knowledge for text emotion classification. An emotion information unit includes an opinion target, an opinion word, a polarity, and a holder. Because micro-blog expression is informal, some posts are structurally incomplete and some contain redundant or unknown words, so earlier text opinion mining methods are not suitable. In this paper, the earlier approaches are improved according to the features of micro-blogs. The main work is as follows.

(1) Building the opinion target candidate set. First, the micro-blog text is preprocessed according to its characteristics. Second, noun phrases are identified by syntactic analysis and unnecessary phrases are filtered out. Finally, the opinion target candidate set, consisting of nouns and noun phrases, is built, and experiments are performed to analyze it.

(2) Filtering the Chinese micro-blog candidate set. Three different filtering approaches are adopted. First, the set is filtered with an SVM model that uses semantic role information, minimum distance, and word frequency as features. Second, different scores are assigned to semantic role information, minimum distance, and word frequency, an overall score is calculated from them, and the candidate set is filtered according to that overall score. Third, a CRF is used to filter the candidate set, with emotion words and semantic role information as features.

(3) Determining opinion target polarity. This work is rule-based. If there are emotion words near the opinion target, the polarity of the opinion target is determined by those emotion words. Otherwise, the opinion target is assigned the emotion polarity of the whole sentence. The sentence-level polarity is calculated with a naive Bayes model.

(4) Implementing a Chinese micro-blog emotion information extraction system based on the preceding work. The system supports building the opinion target candidate set, filtering the candidate set, determining opinion target polarity, and analyzing related data. It can be used in practical emotion information extraction tasks.
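
The score-based filtering in (2) and the polarity rule in (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract gives no weights, thresholds, or window sizes, so `WEIGHTS`, `THRESHOLD`, `window`, the `Candidate` fields, and the reading of "minimum distance" as the distance to the nearest emotion word are all assumptions made for illustration. The sentence-level polarity, which the abstract obtains from a naive Bayes model, is passed in here as a plain number.

```python
from dataclasses import dataclass

# Hypothetical weights and threshold; the abstract does not state the real values.
WEIGHTS = {"semantic_role": 0.5, "distance": 0.3, "frequency": 0.2}
THRESHOLD = 0.5


@dataclass
class Candidate:
    text: str
    position: int            # token index of the candidate in the sentence
    has_semantic_role: bool  # assumed: True if a semantic role labeler tagged it
    frequency: int           # occurrences of the candidate in the corpus


def overall_score(cand, emotion_positions, max_frequency):
    """Combine the three features into a single score in [0, 1]."""
    role = 1.0 if cand.has_semantic_role else 0.0
    if emotion_positions:
        # Assumed meaning of "minimum distance": tokens to the nearest emotion word.
        min_dist = min(abs(cand.position - p) for p in emotion_positions)
        distance = 1.0 / (1.0 + min_dist)  # closer emotion word gives a higher score
    else:
        distance = 0.0
    frequency = cand.frequency / max_frequency if max_frequency else 0.0
    return (WEIGHTS["semantic_role"] * role
            + WEIGHTS["distance"] * distance
            + WEIGHTS["frequency"] * frequency)


def filter_candidates(cands, emotion_positions):
    """Keep candidates whose overall score reaches the threshold."""
    max_freq = max((c.frequency for c in cands), default=0)
    return [c for c in cands
            if overall_score(c, emotion_positions, max_freq) >= THRESHOLD]


def target_polarity(target_pos, emotion_words, sentence_polarity, window=3):
    """Rule from (3): use a nearby emotion word if one exists, otherwise
    fall back to the sentence-level polarity.

    emotion_words maps token index -> polarity (+1 or -1).
    """
    nearby = [(abs(target_pos - pos), pol)
              for pos, pol in emotion_words.items()
              if abs(target_pos - pos) <= window]
    if nearby:
        return min(nearby, key=lambda item: item[0])[1]  # nearest emotion word
    return sentence_polarity
```

With these assumed values, a frequent candidate that carries a semantic role and sits next to an emotion word is kept, while a rare candidate far from any emotion word is dropped.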
Keywords/Search Tags:Chinese micro blog, emotion information extraction, opinion target, syntactic analyzing, support vector machine, conditional random field