The Research Of Microblogging Short Text Oriented Sentiment Analysis

Posted on: 2014-09-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N Liu
GTID: 1268330425467554
Subject: Computer software and theory
Abstract/Summary:
With the development of the Internet, Web 2.0 has greatly increased user participation. By analyzing the sentiment expressed in user-generated content, such as online product reviews and web documents, we can capture customers' opinions on a given product or event. The microblog, a new form of social media, is already widely used by the public. It has grown faster than initially expected, and the volume of data exchanged daily is enormous, so it opens new research areas for NLP and provides a large number of comment texts in new forms. However, traditional text analysis mostly focuses on extracting core content and themes from standard texts, whereas microblog posts are short texts with intense emotion and a single topic, and a new approach is needed to analyze them. We therefore adapt traditional text sentiment analysis models to the short text of microblogs.

This dissertation studies three key problems: classifying texts as subjective or objective, identifying the sentiment polarity of subjective texts, and classifying texts into multiple emotion classes. The main research and results are summarized as follows.

Firstly, a method is presented that combines Multi-GRAM features and Multi-POS features to classify microblog short texts as subjective or objective. In this method, multiple classifiers and ensemble learning are combined to establish the Vote-AdaBoost combination classification model. Through iterative updating of the classifiers, the model builds a suitable combination of voting classifiers and effectively improves the accuracy of recognizing objective and subjective microblog short texts.

Secondly, a method is presented for extracting opinion sentences from microblog short texts that contain sentimental elements and identifying their polarity. In this method, words and other features are combined into sentimental elements. The values of single sentimental elements and composite sentimental elements are calculated from a sentiment dictionary and a sentiment-based training corpus, respectively. An improved sentiment analysis model that uses HowNet similarity is then built. The model acquires sentiment seed words based on the shortest path between key words. Optimizing the sentiment seed words makes the sentiment values of the sentimental elements more accurate, which in turn improves the accuracy of sentiment polarity classification for microblog short texts.

Thirdly, a fine-grained sentiment analysis method for extracting multiple emotion classes is presented. It addresses the limitation of traditional classification methods, which handle only binary classes. The new method combines TF-IDF with a variance statistic, an improvement that makes it suitable for multi-class sentiment classification. A fine-grained sentiment analysis process is then built. The first step of the process is polarity identification, which serves as the coarse-grained analysis. Multi-class sentiment classification is then applied as the fine-grained analysis. Compared with traditional feature extraction methods, this method yields more accurate results. It was used to participate in the NLPCC 2013 evaluation, and the good ranking of the submitted data supports its effectiveness.
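The abstract does not give the formula that combines TF-IDF with the variance statistic, so the following is only a minimal sketch of one plausible reading, not the dissertation's method. It assumes that a term is useful for multi-class emotion classification when it is rare across documents (high IDF) and unevenly distributed across emotion classes (high variance of its per-class relative frequency). The function name `score_terms`, the smoothed IDF, and the product of the two factors are all assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import pvariance


def score_terms(docs, labels):
    """Score terms for multi-class feature selection (illustrative only).

    docs   -- list of token lists, one per short text
    labels -- emotion class label for each text
    Returns a dict mapping term -> IDF * variance of per-class frequency.
    The combination rule is an assumption, not taken from the source.
    """
    classes = sorted(set(labels))
    n_docs = len(docs)

    # Document frequency of each term, used for the IDF factor.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    # Token counts per class, used for the per-class relative frequency.
    class_counts = {c: Counter() for c in classes}
    class_totals = Counter()
    for tokens, label in zip(docs, labels):
        class_counts[label].update(tokens)
        class_totals[label] += len(tokens)

    scores = {}
    for term, df in doc_freq.items():
        idf = math.log(n_docs / df) + 1.0
        # Relative frequency of the term inside each class
        # (guard against a class that contains only empty texts).
        rel = [class_counts[c][term] / max(class_totals[c], 1) for c in classes]
        # A term spread evenly over all classes gets variance 0.
        scores[term] = idf * pvariance(rel)
    return scores


if __name__ == "__main__":
    docs = [["happy", "day"], ["happy", "joy"], ["sad", "day"], ["angry", "day"]]
    labels = ["joy", "joy", "sad", "anger"]
    for term, score in sorted(score_terms(docs, labels).items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{score:.4f}")
```

Under these assumptions, a term such as "day" that appears in every class scores lower than a term such as "sad" that appears in only one, which is the behavior a multi-class feature selector needs and a binary-oriented statistic does not directly provide.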
Keywords/Search Tags:Natural Language Processing, Text Sentiment Analysis, Microblogging Short Text, Machine Learning