
Chinese Sentiment Orientation Classification Based On Emotional Word Set

Posted on: 2018-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: L X Wang
Full Text: PDF
GTID: 2358330518960498
Subject: Communication and Information Engineering
Abstract/Summary:
Sentiment classification generally refers to the automatic classification of text polarity, such as positive, negative, or neutral. In the era of big data, it is mainly used to investigate public attitudes towards an event, person, or group. In the past, such investigations took a great deal of time and had serious limitations. Now the vast amount of information on the Internet lets us gather the views of others more quickly and conveniently, and conclusions derived from such large volumes of information tend to be more reliable. This paper analyzes traditional Chinese sentiment classification based on a sentiment lexicon and conducts an experiment with ICTCLAS segmentation and the HowNet sentiment lexicon. Analysis of the experimental results shows that whichever segmentation tool and sentiment dictionary are used, they introduce some uncertainty; in particular, different sentiment dictionaries differ greatly in the reliability of the resulting analysis and classification. In view of this, the paper proposes the concept of a "sentiment character set", which is not tied to a particular category of use and does not require Chinese word segmentation. We first find a sentiment character set: words that themselves can influence the emotional tendency of vocabulary, or that carry strong emotional tendencies of their own. The paper gives two different versions of the "sentiment character set", selects the version with the better experimental results, and then improves the method for computing emotional values. Because no word vocabulary is used, we compile a commonly used "privative word set" and "degree word set", and add their effect on the sentiment words to the experimental algorithm. The sentiment of a text is computed from the emotion value of each word, with all words treated as completely independent. However, some special phrases, when split, may affect the sentiment of the sentence, so the paper uses the forward maximum matching method to identify them. Finally, the information entropy of words of the same type is reduced by searching for inter-word associations, which improves the accuracy of the experiment. In the end, this accuracy is increased by 20%.
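The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the four small sets below, their weights, and the rule that a negation or degree modifier applies to the next sentiment-bearing unit are all assumptions made for illustration, since the abstract does not list the actual sets or formulas. The sketch shows forward maximum matching protecting a special phrase (不错, "not bad") from being split into a negation plus a negative character, followed by per-unit scoring.

```python
# All sets and weights here are hypothetical toy data, not the thesis's sets.
SENTIMENT = {"好": 1.0, "喜": 1.0, "坏": -1.0, "差": -1.0}  # sentiment character set
NEGATION = {"不", "没"}                                      # "privative word set"
DEGREE = {"很": 1.5, "太": 2.0}                              # "degree word set"
PHRASES = {"不错": 1.0}                                      # special phrases kept whole


def forward_max_match(text, phrases, max_len=4):
    """Scan left to right, taking the longest known phrase at each position.

    Positions where no phrase of length >= 2 matches yield a single character.
    """
    units = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + size] in phrases:
                units.append(text[i:i + size])
                i += size
                break
        else:
            units.append(text[i])
            i += 1
    return units


def score(text):
    """Sum unit values; pending modifiers apply to the next sentiment unit.

    Assumption: a negation flips the sign and a degree word scales the
    magnitude of the next sentiment-bearing unit, then both reset.
    """
    total = 0.0
    sign, weight = 1.0, 1.0
    for unit in forward_max_match(text, PHRASES):
        if unit in NEGATION:
            sign = -sign
        elif unit in DEGREE:
            weight *= DEGREE[unit]
        else:
            value = PHRASES.get(unit, SENTIMENT.get(unit))
            if value is not None:
                total += sign * weight * value
                sign, weight = 1.0, 1.0
    return total


def classify(text):
    """Map the summed score to one of the three polarity labels."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

With these toy sets, `classify("不好")` yields `"negative"` because the negation flips 好, while `classify("很不错")` yields `"positive"` because 不错 is matched as one phrase before 不 can be read as a negation.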
Keywords/Search Tags: Positive, Negative, Sentiment Character Set, Chinese Text Classification