Research On Sentiment Embedding Model Based On The Value Of Sentimental Word Intensity

Posted on: 2020-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: Q Meng
GTID: 2428330575961969
Subject: Computer Science and Technology
Abstract/Summary:
Text sentiment analysis is one of the important challenges in the rapidly developing research field of natural language processing (NLP). It uses a series of methods to analyze, process, summarize, and infer the emotional elements in subjective texts. Chinese sentiment analysis is an important research area within text sentiment analysis, and its approaches fall into two branches: sentiment lexicon-based and machine learning-based. Sentiment lexicon-based methods use a knowledge base and a corpus to construct a dictionary of words with emotional tendencies by exploring the semantic relationships between words. These methods assign each word a binary label (positive, negative) or a ternary label (positive, negative, neutral). The resulting dictionary is therefore too coarse: it cannot distinguish the different intensity levels required by tasks that subdivide emotional intensity, so sentiment lexicon-based methods cannot be used for fine-grained emotional analysis. Machine learning-based approaches use a neural network to convert words into meaningful word vectors and perform sentiment analysis by calculating the cosine similarity between pre-trained word vectors (e.g., Word2vec and GloVe). However, existing context-based word vector training methods can give words with opposite emotional polarity similar vector representations (e.g., the cosine similarity of "gentle" and "unruly" is 0.670398235), which reduces the performance of sentiment analysis. In this paper, a sentiment embedding model based on the intensity extreme value of emotional words is proposed by combining the sentiment dictionary with the word vector space model. We start from the sentiment dictionary and calculate a fine-grained emotional intensity score for each word by combining the dictionary with the cosine similarity of the word vectors. Finally, we use this intensity extreme value to optimize the pre-trained word vectors so that semantically and emotionally similar words move closer together in the embedding space: words with similar emotional polarity values are close to each other, and words with opposite emotional polarity values are far from each other. The experimental results show that the proposed optimization model provides more accurate fine-grained scores for sentiment lexicon-based sentiment analysis tasks. The method also greatly improves the traditional word embedding model: compared with the original word vectors, the probability of emotional antonyms appearing close together in the embedding space is greatly reduced, and a higher sentiment classification accuracy is achieved.
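The pipeline described in the abstract (score each word's intensity from seed dictionary entries and cosine similarity, then adjust the vectors using those scores) can be illustrated with a minimal sketch. This is not the thesis's actual algorithm, which the abstract does not specify. The function names `intensity_score` and `refine`, the weighting rule, the parameter `alpha`, and the toy two-dimensional vectors and seed intensities are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def intensity_score(word, vectors, seeds, k=1):
    """Hypothetical scoring rule: similarity-weighted mean intensity
    of the k seed words nearest to `word` in the embedding space."""
    if word in seeds:
        return seeds[word]
    nearest = sorted(
        ((cosine(vectors[word], vectors[s]), s) for s in seeds),
        reverse=True,
    )[:k]
    total = sum(abs(sim) for sim, _ in nearest)
    if total == 0.0:
        return 0.0
    return sum(sim * seeds[s] for sim, s in nearest) / total

def refine(vectors, scores, alpha=0.5):
    """Hypothetical refinement step: pull each vector toward the weighted
    centroid of words with similar intensity scores. Words whose scores
    differ by 1.0 or more get zero weight and exert no pull."""
    refined = {}
    for w, v in vectors.items():
        weights = {
            o: max(0.0, 1.0 - abs(scores[w] - scores[o]))
            for o in vectors if o != w
        }
        total = sum(weights.values())
        if total == 0.0:
            refined[w] = list(v)
            continue
        centroid = [
            sum(weights[o] * vectors[o][d] for o in weights) / total
            for d in range(len(v))
        ]
        refined[w] = [(1 - alpha) * x + alpha * c for x, c in zip(v, centroid)]
    return refined

# Toy data (assumed): context-based vectors place all four words close together.
vectors = {
    "gentle": [1.0, 0.2],
    "kind": [0.9, 0.4],
    "unruly": [0.8, 0.6],
    "rude": [0.6, 0.9],
}
seeds = {"gentle": 0.8, "rude": -0.8}  # assumed dictionary intensities in [-1, 1]

scores = {w: intensity_score(w, vectors, seeds) for w in vectors}
refined = refine(vectors, scores)

before = cosine(vectors["gentle"], vectors["unruly"])
after = cosine(refined["gentle"], refined["unruly"])
print(f"gentle/unruly cosine before: {before:.3f}, after: {after:.3f}")
```

On this toy data the refinement lowers the similarity between the opposite-polarity pair "gentle" and "unruly" and raises it for the same-polarity pair "gentle" and "kind", which is the qualitative effect the abstract reports. A real implementation would operate on full pre-trained embeddings and a complete sentiment dictionary.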
Keywords/Search Tags:Sentiment Dictionary, Intensity Value, Fine-grained, Word Embedding, Cosine Similarity