
Research On Text Emotion Classification Based On BERT Embedding

Posted on: 2021-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: J H Wang
Full Text: PDF
GTID: 2518306032479194
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the advent of the information age, more and more people are keen to express their opinions on social networks, which has led to an exponential increase in the volume of global information. How to extract people's emotional tendencies from this information has become an urgent problem. Driven by this environment, sentiment analysis technology has emerged; it is widely used in product evaluation, public opinion analysis, recommendation systems, and other fields, and has high research and application value. The key to sentiment analysis is the construction of the sentiment classification model. The traditional method is based on a sentiment dictionary; it relies heavily on how the dictionary is constructed, and the resulting model generalizes poorly. Subsequently, machine learning algorithms such as Naive Bayes and Support Vector Machines were used to construct classifiers. Although these achieved some results, they cannot extract deep-level semantic information and are difficult to train. The development of deep learning has brought new opportunities to sentiment analysis. This paper draws on deep learning algorithms to address the problems in current sentiment classification models. The research content is as follows:

(1) From the perspective of word vectors. The most widely used word representations at present are Word2Vec and GloVe vectors. Although both perform well, they share the disadvantage of being unable to resolve polysemy. This paper therefore adopts the BERT pre-trained language model: during pre-training, BERT randomly masks part of the words and learns to predict the masked words from their context. In addition, the word (token) vector is combined with a word position vector and a sentence-pair (segment) vector, and this combination serves as the input representation of the embedding layer, which resolves the polysemy problem (a usage sketch follows this abstract). Comparative word-vector experiments then verify that using BERT word vectors improves the classification performance of the model.

(2) To address the shortcoming that RNNs cannot be trained in parallel, and to further improve model performance, this paper proposes a bidirectional sliced recurrent neural network model based on multi-head self-attention. The model borrows the characteristics of sliced recurrent neural networks: by cutting the original sequence, each sliced subsequence can be trained in parallel, reducing the training time of the model. At the same time, a multi-head self-attention mechanism is introduced, which can learn hidden information in different representation subspaces and capture the connections between words in a sequence (see the second sketch below). Comparative experiments show that, at a comparable classification accuracy, the proposed model significantly improves training speed, which verifies its effectiveness.
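To make the BERT word-vector step in (1) concrete, the following is a minimal sketch of obtaining contextual BERT vectors. It assumes the HuggingFace transformers library and the bert-base-chinese checkpoint, neither of which is named in the thesis; the example sentences are illustrative only.

```python
# Minimal sketch: contextual BERT word vectors (assumes HuggingFace
# `transformers` and the `bert-base-chinese` checkpoint -- both are
# assumptions, not the thesis's stated setup).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")
model.eval()

# The same word receives different vectors in different contexts,
# e.g. "苹果" as the company vs. the fruit (polysemy resolved).
sentences = ["苹果发布了新款手机。", "我喜欢吃苹果。"]
inputs = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# (batch, seq_len, 768): contextual vectors, built internally from the
# sum of token, position, and segment (sentence-pair) embeddings.
print(outputs.last_hidden_state.shape)
```

These per-token vectors (rather than static Word2Vec/GloVe lookups) would then feed the classification model described in (2).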
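The second sketch illustrates the slicing-plus-attention idea from (2), assuming PyTorch. The layer sizes, slice count, and mean pooling are illustrative assumptions, not the thesis's actual configuration, and the slicing here is single-level where sliced RNNs can be applied recursively; the point shown is that folding slices into the batch dimension lets the recurrent layer process them in parallel.

```python
# Illustrative sketch of a bidirectional sliced RNN with multi-head
# self-attention (dimensions and slice count are assumptions).
import torch
import torch.nn as nn

class SlicedBiRNNAttention(nn.Module):
    def __init__(self, dim=768, hidden=128, heads=8, slices=4):
        super().__init__()
        self.slices = slices
        self.rnn = nn.GRU(dim, hidden, bidirectional=True, batch_first=True)
        self.attn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)

    def forward(self, x):                  # x: (batch, seq_len, dim)
        b, t, d = x.shape                  # seq_len must be divisible by slices
        # Cut the sequence into subsequences; folding them into the batch
        # dimension lets the RNN process all slices in parallel.
        x = x.reshape(b * self.slices, t // self.slices, d)
        out, _ = self.rnn(x)               # each slice handled independently
        out = out.reshape(b, t, -1)        # reassemble the full sequence
        # Multi-head self-attention relates words across slice boundaries,
        # attending in several representation subspaces at once.
        out, _ = self.attn(out, out, out)
        return out.mean(dim=1)             # pooled vector for classification

x = torch.randn(2, 128, 768)               # e.g. BERT vectors from the sketch above
print(SlicedBiRNNAttention()(x).shape)      # torch.Size([2, 256])
```

The bidirectional GRU supplies the "two-way" reading of each slice, while the attention layer recovers the long-range dependencies that slicing would otherwise cut.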
Keywords/Search Tags: Sentiment Analysis, BERT Word Vector, Sliced Recurrent Neural Network, Multi-Head Self-Attention Mechanism