
The Research Of Text Sentiment Analysis Technology Based On Linguistic Knowledge

Posted on: 2019-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: J Chen
Full Text: PDF
GTID: 2428330545477039
Subject: Computer application technology
Abstract/Summary:
With today's highly developed internet industry and the continuing improvement of information platforms (such as Taobao, Amazon, and Douban), people can easily express their opinions and sentiments about subjects that interest them. Text sentiment analysis techniques, which automatically mine sentiment orientation from such texts, therefore have appealing prospects in a wide range of applications, such as obtaining market feedback for businesses, improving advertising effectiveness for enterprises, and monitoring public opinion for government departments. Although a great deal of in-depth work has been carried out on the topic, some areas remain to be studied and improved.

First, for sentiment lexicon construction, existing methods are based on distributed word embeddings. They can automatically construct high-coverage, accurate sentiment lexicons for specific domains without complex feature engineering. However, words with opposite sentiment orientations and semantics often appear in similar contexts, which makes their embeddings hard to distinguish after training and reduces the performance of the lexicon. Moreover, word embedding based methods use seed words with sentiment labels as the training set, so the output lexicon suffers when the seed words are of poor quality.

Second, for document-level sentiment classification, attention based neural network methods judge the importance of words and sentences according to their relevance to the classification, and they achieve performance improvements by assigning higher attention weights to important words and sentences. However, existing local context attention mechanisms have not yet made full use of linguistic information that may help determine the document category, while existing models based on linguistic knowledge are mostly bag-of-words models, which cannot handle long text sequences.

To address these issues, the main contents of this thesis are as follows:

1. To improve the ability of word embeddings to distinguish sentiment classes, a word embedding learning model based on sentiment and semantic contrast information is proposed. By integrating this information into the original word embedding, the similarity between words with the same sentiment orientation is strengthened, and the similarity between words with opposite sentiment orientations is weakened. We define this integrated output as the sentiment and semantic contrast word embedding (SSCWE), and compare it with other word embeddings to demonstrate its validity.

2. To improve the ability of the sentiment lexicon to distinguish sentiment classes, a method for sentiment lexicon construction based on extending seed words is proposed. Taking into account coverage, sentiment intensity, and the ability to distinguish the sentiment classes of words in the corpus, the seed words are extended based on SSCWE. The final lexicon is obtained by training on the extended seed words and SSCWE, and experimental results on three real-world datasets show the effectiveness of this method.

3. To improve the accuracy of document sentiment classification, a document-level sentiment classification model based on a linguistic knowledge attention mechanism (LKA) is proposed. First, a hierarchical network is built using Bidirectional Long Short-Term Memory (BLSTM) to capture semantic information in long sentences and long texts. Second, the linguistic knowledge attention mechanism is built on the hidden states of the word-level and sentence-level BLSTM, SCLex, negation words, and degree words. Finally, experimental results on three real-world datasets show the effectiveness of this model.
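
The goal stated in item 1 (raise the similarity of words with the same sentiment orientation, lower it for words with opposite orientations) can be illustrated with a toy sketch. This is not the SSCWE model, whose details the abstract does not give; it is a simple hand-written update rule under stated assumptions. The names `cosine` and `contrast_step`, the two-dimensional vectors, the polarity labels, and the step size `lr` are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrast_step(emb, labels, lr=0.1):
    """One illustrative update (an assumption, not the thesis's method):
    move each word toward the mean of the other same-polarity words and
    away from the mean of the opposite-polarity words."""
    new_emb = {}
    for word, vec in emb.items():
        same = [emb[o] for o in emb if o != word and labels[o] == labels[word]]
        opposite = [emb[o] for o in emb if labels[o] != labels[word]]
        updated = list(vec)
        for group, sign in ((same, 1.0), (opposite, -1.0)):
            if not group:
                continue
            mean = [sum(col) / len(group) for col in zip(*group)]
            updated = [x + sign * lr * (m - v0)
                       for x, m, v0 in zip(updated, mean, vec)]
        new_emb[word] = updated
    return new_emb

# Toy embeddings: "good" and "bad" start out close together, mimicking
# words of opposite polarity that occur in similar contexts.
emb = {
    "good": [1.0, 0.2],
    "great": [0.9, 0.4],
    "bad": [1.0, 0.1],
    "awful": [0.8, 0.0],
}
labels = {"good": 1, "great": 1, "bad": -1, "awful": -1}

refined = emb
for _ in range(5):
    refined = contrast_step(refined, labels)

print("good/great before:", cosine(emb["good"], emb["great"]))
print("good/great after: ", cosine(refined["good"], refined["great"]))
print("good/bad before:  ", cosine(emb["good"], emb["bad"]))
print("good/bad after:   ", cosine(refined["good"], refined["bad"]))
```

On this toy data the same-polarity pair becomes more similar and the opposite-polarity pair less similar, which is the contrast effect the abstract describes. A learned model would obtain this from a training objective over a corpus and seed words instead of a fixed update rule.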
Keywords/Search Tags:sentiment lexicon, word embedding, document sentiment classification, attention mechanism, linguistic knowledge