Research On Chinese Text Sentiment Analysis Algorithm Based On ELMo And Bi-SAN

Posted on: 2022-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z Li
GTID: 2518306521964359
Subject: Software engineering
Abstract/Summary:
Sentiment analysis is a natural language processing task that belongs to the category of opinion mining. It involves cleaning and processing text data, extracting relevant features, and determining the emotional tendency the text contains, which provides a basis for decision-making. It is widely used in public opinion analysis, economic development, legal system construction, recommendation systems, and other areas. Many researchers have made progress on sentiment analysis, but some challenges remain. For example, static word embedding methods are biased by polysemy; the traditional convolutional and recurrent structures of deep learning networks cannot process the whole text at once, which leads to inadequate feature extraction; and most existing sentiment analysis algorithms use only a single word vector as input, so the extracted features are limited and the model depends too heavily on the word vector. To address these problems, the main research contents of this thesis are as follows.

1. Deep learning algorithms often use static word embedding techniques such as word2vec to extract the sentiment features of text, and these techniques cannot handle polysemy. To address this, a sentiment analysis algorithm based on an improved ELMo language model and Bi-LSTM is proposed. First, the ELMo language model generates word vectors that integrate word meaning, syntax, and semantics. Second, Bi-LSTM extracts features in both directions of the text, which effectively improves the accuracy of Chinese text sentiment analysis.

2. CNN can only extract local features, while Bi-LSTM is more time-consuming and extracts features incompletely because it processes the text word by word. To address this, a bidirectional self-attention network (Bi-SAN) combined with relative position encoding is proposed for sentiment analysis. Because any two words are connected directly, the network can capture long-range dependencies in the text and speeds up computation. Each word can extract features from all words in its context and learn the features that are most critical for judging sentiment tendency. At the same time, relative position encoding compensates for the self-attention mechanism's inability to learn sequence order. The experimental results show that the proposed algorithm is effective and usable.

3. Existing deep learning algorithms use only a single word vector as input and rely too heavily on it during feature extraction. To address this, the thesis introduces part-of-speech and sentiment lexicon prior knowledge to enrich the text features. A self-attention mechanism encodes the part-of-speech and sentiment lexicon features, and a gating mechanism fuses them with the word vector features from research content 2 to highlight sentiment features and reduce the impact of noise. The experimental results show that this approach achieves good results on two types of data sets.
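The abstract does not give the network's equations, so the following is only a minimal illustrative sketch of two of the ideas it names: direction-masked self-attention with a relative position bias, and a gate that fuses two feature vectors. Everything in it is an assumption made for illustration and not the thesis's actual design: the function names (directional_self_attention, bi_san, gated_fusion), the use of raw token vectors as queries, keys, and values, the scalar bias per clipped offset, and the scalar gate.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def directional_self_attention(x, rel_bias, direction, max_dist=2):
    """Single-head self-attention over token vectors x (a list of lists).

    Assumptions: x serves as query, key, and value (no learned projections);
    rel_bias maps a clipped relative offset (j - i) to a scalar added to the
    score; "forward" lets token i see positions <= i, "backward" positions >= i.
    """
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        scores, visible = [], []
        for j in range(n):
            if direction == "forward" and j > i:
                continue
            if direction == "backward" and j < i:
                continue
            dot = sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            offset = max(-max_dist, min(max_dist, j - i))
            scores.append(dot + rel_bias[offset])
            visible.append(j)
        weights = softmax(scores)
        out.append([sum(w * x[j][k] for w, j in zip(weights, visible))
                    for k in range(d)])
    return out


def bi_san(x, rel_bias, max_dist=2):
    """Concatenate forward and backward attention outputs for each token."""
    fwd = directional_self_attention(x, rel_bias, "forward", max_dist)
    bwd = directional_self_attention(x, rel_bias, "backward", max_dist)
    return [f + b for f, b in zip(fwd, bwd)]


def gated_fusion(h_word, h_prior, gate_weights, gate_bias=0.0):
    """Fuse two equal-length vectors with a scalar sigmoid gate (assumed form).

    g = sigmoid(w . [h_word; h_prior] + b); result = g*h_word + (1-g)*h_prior.
    """
    z = sum(w * v for w, v in zip(gate_weights, h_word + h_prior)) + gate_bias
    g = 1.0 / (1.0 + math.exp(-z))
    return [g * a + (1.0 - g) * b for a, b in zip(h_word, h_prior)]


# Toy usage: three 2-dimensional "token" vectors and a zero position bias.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
zero_bias = {k: 0.0 for k in range(-2, 3)}
encoded = bi_san(tokens, zero_bias)  # 3 tokens, each a 4-dimensional vector
```

In this sketch every token attends to every visible token in a single step, which is the property the abstract contrasts with word-by-word recurrent processing; the relative offset bias is what lets the scores depend on word order.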
Keywords/Search Tags:Sentiment Analysis, ELMo Word Vector, Self-Attention Network, Sentiment Lexicon, Feature Fusion