Study On Chinese Text Sentiment Classification

Posted on:2015-01-25

Degree:Master

Type:Thesis

Country:China

Candidate:L S Hua

Full Text:PDF

GTID:2298330422472587

Subject:Computer system architecture

Abstract/Summary:

With the rise of microblogging and e-commerce in recent years, the number of users and online comments has grown explosively. These comments contain judgments and analysis of products, hot events, and so on. They are of great value for improving products and for government monitoring of public opinion, and text sentiment classification has become an active research topic.

Text sentiment classification is a binary classification task that determines whether a text is positive or negative. Because emotional expression is complex, this dissertation discusses in detail which parts of speech carry more emotional information and contribute more to classification. We also improve a cross-domain sentiment classification method that combines learning-based and lexicon-based techniques. Our main work and contributions include:

① We investigate the influence of stop words on text sentiment classification, where the stop-word lists are built from words of different parts of speech. We run detailed experiments and analysis on the lexicon-based and learning-based techniques, using seven kinds of stop-word lists and corpora from three domains. For the lexicon-based method, a stop-word list that excludes adjectives, adverbs, and verbs generally gives better results, while the stop words used in traditional topic classification have little or no effect on sentiment classification. For the learning-based method, adjectives, adverbs, verbs, and nouns are the most important, and using no stop words at all gives the best result.

② We improve a cross-domain sentiment classification method that combines learning-based and lexicon-based techniques. Approaches to Chinese text sentiment classification are generally based on either sentiment knowledge or feature selection. The former needs no labeled text and is simple to implement, but has low accuracy. The latter has high accuracy, but needs a large amount of labeled text, which makes it poorly suited to cross-domain sentiment classification. Tan et al. proposed a novel scheme for sentiment classification that combines the lexicon-based and learning-based techniques; it needs no labeled text yet achieves good results. In this dissertation we use a sentiment lexicon constructed with the PMI (Pointwise Mutual Information) algorithm to replace the corresponding part of the original algorithm. The results show that this gives better accuracy. We then analyze in detail the factors that affect the result and the parameters of the algorithm.
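The abstract does not say how the PMI-based sentiment lexicon is built, so the following is only a minimal sketch of one common variant (semantic orientation from PMI against seed words), paired with a simple lexicon-based classifier that accepts a stop-word list. The function names, the seed words, the document-level co-occurrence window, the add-one smoothing, and the toy corpus are all assumptions made for illustration, not the dissertation's method. English tokens stand in for Chinese words that have already been segmented.

```python
import math
from collections import Counter


def build_pmi_lexicon(docs, pos_seeds, neg_seeds, smoothing=1.0):
    """Score each word by SO-PMI = PMI(word, pos seeds) - PMI(word, neg seeds).

    docs: list of token lists (already segmented).
    Co-occurrence is counted at document level (an assumption).
    With document counts, the difference of the two PMI terms simplifies to
    log2( c(word, pos) * c(neg) / (c(word, neg) * c(pos)) ).
    """
    pos_seeds, neg_seeds = set(pos_seeds), set(neg_seeds)
    vocab = set()
    co_pos, co_neg = Counter(), Counter()
    n_pos = n_neg = 0

    for doc in docs:
        tokens = set(doc)
        has_pos = bool(tokens & pos_seeds)
        has_neg = bool(tokens & neg_seeds)
        n_pos += has_pos
        n_neg += has_neg
        for word in tokens:
            vocab.add(word)
            if has_pos:
                co_pos[word] += 1
            if has_neg:
                co_neg[word] += 1

    lexicon = {}
    for word in vocab:
        if word in pos_seeds or word in neg_seeds:
            continue  # seeds define the poles; they are not scored here
        num = (co_pos[word] + smoothing) * (n_neg + smoothing)
        den = (co_neg[word] + smoothing) * (n_pos + smoothing)
        lexicon[word] = math.log2(num / den)
    return lexicon


def classify(tokens, lexicon, stop_words=frozenset()):
    """Lexicon-based decision: sum the scores of all tokens not in stop_words."""
    score = sum(lexicon.get(t, 0.0) for t in tokens if t not in stop_words)
    return "positive" if score > 0 else "negative"


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["good", "love", "screen"],
    ["good", "satisfied"],
    ["bad", "awful", "screen"],
    ["bad", "slow"],
]
lexicon = build_pmi_lexicon(docs, pos_seeds={"good"}, neg_seeds={"bad"})
label = classify(["love", "screen"], lexicon)
```

In this toy corpus, "love" only co-occurs with the positive seed and gets a positive score, "awful" only co-occurs with the negative seed and gets a negative score, and "screen" appears equally with both and scores zero. Passing a part-of-speech-specific set as `stop_words` is one way to reproduce the kind of stop-word comparison the abstract describes for the lexicon-based method.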

Keywords/Search Tags:

Text Sentiment Classification, Stop Words, Sentiment Lexicon, Naive Bayesian, Support Vector Machine

Related items

1	Research On Sentiment Analysis Of Microblog Text Based On Recognition Of Sentiment New Words
2	Text Sentiment Classification Of Hotel Field And The Research On Sentiment Element Extraction
3	Research On Feature Generation Methods For Text Sentiment Classification
4	Sentiment Classification By Combining Lexicon-based And Machine Learning Methods
5	Research On Web Text Sentiment Analysis Method
6	Research On Text Sentiment Classification Based On Deep Neural Network
7	Sentiment Classification With Bilingual Text
8	The Key Technologies’ Research And Implementation About Information Acquisition And Emotion Classification On New Social Network Media
9	Sentiment Lexicon Network For Short Texts Sentiment Classification
10	Research On Sentiment Classification Model Based On Web Comments