Research On Feature Generation Methods For Text Sentiment Classification

Posted on: 2017-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z Zhou
Full Text: PDF
GTID: 2308330485971120
Subject: Computer technology
Abstract/Summary:
Text sentiment classification is an important task in text data mining that aims to detect an author's sentiment from the text they write. With the rapid development of the Internet, users post reviews and opinions on a wide range of online social platforms. If these reviews and opinions can be collected and classified to summarize their authors' sentiment, they can provide useful information in domains such as business, politics, and medicine.

In traditional text sentiment classification, one of the main difficulties is the huge number and sparse distribution of features. An efficient feature selection method is therefore important: a suitable feature subset can reduce the time cost and improve classification performance at the same time. In addition, the sentiment polarity of a text is usually related to the semantic meanings of its words, so a detailed analysis of word semantics can be expected to improve the classification result.

This paper focuses on generating useful features for text sentiment classification and using them to improve classification accuracy. To address the problems above, we propose two feature generation methods.

First, we propose a feature selection method based on an improved Particle Swarm Optimization (PSO) algorithm. Compared with traditional PSO-based feature selection methods, the proposed method modifies the velocity update formula to make it better suited to feature selection problems.
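For orientation, the baseline that PSO-based feature selection starts from can be sketched as follows. This is the textbook binary PSO with a sigmoid transfer function, not the modified velocity update proposed here, which the abstract does not spell out. The toy fitness function, the set of "informative" feature indices, and all parameter values are assumptions made purely for illustration; in a wrapper setting the fitness would typically be a classifier's validation accuracy.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Textbook binary PSO: each particle is a 0/1 mask over the features."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]

    # Personal bests and the global best found so far.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # The sigmoid turns the velocity into P(bit = 1).
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical toy problem: features 0, 3 and 5 are useful, the rest are noise.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    """Reward useful features and penalize subset size (illustrative only)."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_fit = binary_pso(toy_fitness, n_features=8)
print(best_mask, best_fit)
```

The size penalty in `toy_fitness` plays the role that a filter-style score or a subset-size term might play in a combined objective; how the thesis actually weighs these parts is not stated in the abstract.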
In addition, the proposed method combines the ideas of the wrapper and filter approaches, two common families of feature selection methods, and takes the semantic meaning of feature words into account to improve its performance in text sentiment classification. Experimental results show that the proposed method generates better feature subsets in both traditional classification problems and sentiment classification problems.

Second, we design two methods that automatically generate sentiment phrases from the testing dataset to form a sentiment lexicon for text sentiment classification. Compared with traditional sentiment lexicons, the generated lexicon focuses more on the topic domain to which the dataset belongs, so we call it a domain-specific sentiment lexicon. Experimental results show that the domain-specific sentiment lexicon achieves better results than a traditional lexicon on text datasets from the corresponding topic domain. We also combine the lexicon-based classification method with supervised learning to further improve classification accuracy.

Lastly, text sentiment classification usually aims to classify a whole document into one sentiment category, which is called document-level sentiment classification. This approach runs into difficulty when a document discusses several aspects of its subject and expresses different opinions on them, which motivated the development of aspect-level sentiment classification. Aspect-level sentiment classification first detects the aspects mentioned in the document and then assigns each a sentiment label, which is more reasonable than labelling the document as a whole. In the last part of the paper, we propose several ways of labelling the aspects of the terms in the domain-specific sentiment lexicon.
The lexicon enriched with aspect information is then used for aspect-level sentiment classification, and experimental results show that our methods work well on this task.
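How an aspect-labelled lexicon can drive aspect-level classification may be easier to see in a minimal sketch. The lexicon entries, the whitespace tokenizer, and the sum-of-polarities scoring rule below are all assumptions chosen for illustration; the automatically generated lexicon and the aspect-labelling methods of the thesis are not reproduced here.

```python
from collections import defaultdict

# Hypothetical hand-written lexicon: term -> (aspect, polarity score).
ASPECT_LEXICON = {
    "delicious": ("food", 1.0),
    "bland": ("food", -1.0),
    "friendly": ("service", 1.0),
    "slow": ("service", -1.0),
    "overpriced": ("price", -1.0),
}


def aspect_sentiment(text, lexicon=ASPECT_LEXICON):
    """Sum lexicon polarities per aspect and map each sum to a label."""
    scores = defaultdict(float)
    for token in text.lower().split():
        token = token.strip(".,;:!?\"'()")  # drop surrounding punctuation
        if token in lexicon:
            aspect, polarity = lexicon[token]
            scores[aspect] += polarity
    return {
        aspect: "positive" if s > 0 else "negative" if s < 0 else "neutral"
        for aspect, s in scores.items()
    }


result = aspect_sentiment("The pasta was delicious, but the waiter was slow.")
print(result)
```

A document-level classifier would have to collapse this review into a single label, whereas the per-aspect scores keep the positive opinion on food and the negative opinion on service apart.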
Keywords/Search Tags: Text sentiment classification, Feature selection, Particle swarm optimization, Sentiment lexicon, Aspect-level text sentiment classification