
Research On Text Sentiment Classification Algorithm Based On Bidirectional Long Short-term Memory Network

Posted on: 2022-03-13  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Yu  Full Text: PDF
GTID: 2518306542462944  Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of Internet technology has brought tremendous changes to people's lives. More and more users are accustomed to expressing their views and opinions on the Internet, and offline communication has evolved into online interaction. This shift has produced a huge amount of text data that carries subjective sentiment. Analyzing these texts makes it possible to understand users' personal tendencies, obtain practical feedback on products, and follow trending social topics, all of which have high commercial value. How to classify and organize this growing body of subjective text has therefore become very important. Text sentiment classification is accordingly a major research hotspot in text information mining and an important topic in natural language processing.

Deep learning has advanced text sentiment classification by virtue of its high-level feature representation ability. Deep-learning-based methods can automatically mine the deep features of text, freeing researchers from manual feature engineering, and can represent the text as a set of low-dimensional dense vectors after learning its semantic information. Among the neural network models used for text sentiment classification, the bidirectional long short-term memory network (BiLSTM) has been widely used because it captures context information from both the forward and backward directions. However, existing BiLSTM-based methods ignore the importance of user review habits. In addition, document-level text has a multi-granularity structure with word granularity and sentence granularity, and the functional relationships between words differ from the semantic relationships between sentences, so how to apply different attention mechanisms to different granularities when encoding text remains a challenge. In view of these problems, this paper studies BiLSTM-based text sentiment classification from two aspects: the importance of user review habits and the application of different attention mechanisms. The main work of this paper is as follows:

(1) This paper first introduces the application background and research significance of text sentiment classification. It then surveys the current state of research on commonly used machine learning and deep learning methods, analyzes the advantages and disadvantages of each type of method, and highlights and analyzes the deficiencies of BiLSTM-based text sentiment classification algorithms.

(2) To account for the importance of user review habits in user sentiment analysis, a text sentiment classification algorithm based on user review habits (HUSN) is proposed. Through text preprocessing, the algorithm first divides the reviews posted by the same user into historical reviews and target reviews. It then maps the words in each historical review to word vectors and feeds them to a BiLSTM to obtain a representation of each sentence. The sentence representations are in turn fed to a BiLSTM to obtain a document representation that captures the user review habit reflected in that historical review. While the sentence and document representations are being computed, an attention mechanism focuses on the words and sentences that are rich in sentiment information. Finally, the similarities between the target review's document representation and the document representations of the historical reviews are calculated, the sentiment ratings of the historical reviews with the highest similarity rankings are selected, and these ratings are averaged and fine-tuned to give the predicted sentiment rating of the target review. Experimental results on three public document-level review datasets, IMDB, Yelp2013 and Yelp2014, show that HUSN effectively improves classification performance by using user review habits.

(3) To address the problem of applying different attention mechanisms to different granularities, a text sentiment classification algorithm based on a multi-granularity attention mechanism (MGA-M) is proposed. First, in view of the characteristics of word granularity and the advantages of the basic attention mechanism, the words in the document are mapped to word vectors and fed to a BiLSTM to learn context relationships at the word level, and the basic attention mechanism is used to capture important words and obtain information-rich sentence representations. Then, in view of the characteristics of sentence granularity and the advantages of the self-attention mechanism, self-attention is applied to the sentence representations to capture the semantic relationships between sentences. Finally, the resulting high-level document representation is used as the text feature for classification. The algorithm is highly interpretable, and the experimental results and their analysis show that it achieves the best performance among the compared algorithms.
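The final prediction step of HUSN described in (2) can be illustrated with a small sketch. It assumes the document representations have already been computed (in the thesis, by the hierarchical BiLSTM with attention) and covers only the similarity ranking and rating averaging. The function names, the choice of cosine similarity, the `top_k` parameter and the unweighted mean are all assumptions made for illustration. The abstract does not specify the similarity measure or the fine-tuning step, so the latter is omitted.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(
    target_repr: Sequence[float],
    history: List[Tuple[Sequence[float], float]],
    top_k: int = 3,
) -> float:
    """Average the ratings of the top_k historical reviews most similar to the target.

    `history` holds (document_representation, rating) pairs for one user.
    Cosine similarity and a plain mean are assumptions of this sketch,
    not details confirmed by the abstract.
    """
    if not history:
        raise ValueError("history must contain at least one review")
    # Rank historical reviews by similarity to the target, most similar first.
    ranked = sorted(
        history,
        key=lambda item: cosine_similarity(target_repr, item[0]),
        reverse=True,
    )
    top = ranked[:top_k]
    return sum(rating for _, rating in top) / len(top)


# Toy data: 2-dimensional stand-ins for document representations, with star ratings.
history = [
    ([1.0, 0.0], 5.0),
    ([0.9, 0.1], 4.0),
    ([0.0, 1.0], 1.0),
]
target = [1.0, 0.05]
print(predict_rating(target, history, top_k=2))  # prints 4.5
```

In this toy example the two historical reviews closest to the target have ratings 5 and 4, so the prediction is their mean. A learned weighting of the selected ratings would be a natural place for the fine-tuning step the abstract mentions.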
Keywords/Search Tags: text sentiment classification, BiLSTM, user habits, attention mechanism, multi-granularity