
Research On Sentiment Analysis Based On BERT-BiLSTM Adversarial Training

Posted on: 2022-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: X F Yu
Full Text: PDF
GTID: 2518306548966869
Subject: Master of Engineering
Abstract/Summary:
As one of the four core technologies of artificial intelligence, natural language processing (NLP) has seen unprecedented development and application, and its subfield of sentiment analysis has attracted broad research interest. With the progress of deep learning in NLP, many models such as BERT can now be applied to sentiment analysis. In this thesis, COVID-19 comment data from the Kaggle competition platform is analyzed with a BERT-BiLSTM model trained adversarially. The specific research work is as follows.

(1) The output feature vector of BERT is improved. The BERT model consists of 12 encoder layers with the same structure but different parameters, and each layer extracts its own level of text information. When the traditional BERT model is applied to downstream tasks, it usually uses only the information extracted by the last encoder layer as the output feature vector. This thesis improves the output feature vector by applying an attention operation to the feature vectors extracted by the last several encoder layers and taking the result as the output. The improved output feature vector is then used as the input to the BiLSTM.

(2) The training method is improved. To improve the robustness and generalization ability of a model, data augmentation is commonly used to add interference, but this method is static: once the data is fed into the model for training, it no longer changes. This thesis instead proposes adding perturbations dynamically. A perturbation is computed from the gradient and added to the input text vector (the embedding) to produce adversarial samples, which are then used for adversarial training. Two ways of adding the perturbation are considered: the fast gradient method (FGM), which applies the perturbation within the allowed range in a single step, and projected gradient descent (PGD), which takes a small step at a time and iterates multiple times, projecting back into the allowed range.

(3) Two experiments are carried out: a multi-model adversarial-training comparison and a multi-parameter comparison. In the first experiment, several deep learning models are compared under non-adversarial training, FGM, and PGD; then the text features extracted by the last encoder layers of BERT are combined with an attention operation and fed into the BiLSTM for training under the same three training methods. The second experiment trains the BERT-BiLSTM model with different batch sizes and learning rates to select the most appropriate parameters. Experiment 1 shows that the best result is obtained when the model is BERT-BiLSTM with the attention operation over the last three encoder layers and the adversarial training method is PGD, reaching a classification accuracy of 88.89%. Experiment 2 shows that the BERT-BiLSTM model achieves good results with a batch size of 32 and a learning rate of 5×10⁻⁵.
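The layer-attention idea in (1) can be sketched as follows. This is a minimal PyTorch sketch, not the thesis implementation: the learnable per-layer softmax weighting is one plausible reading of the "attention operation" over the last three encoder layers, and random tensors stand in for real BERT hidden states (which would come from a BERT model configured to return all layer outputs).

```python
import torch
import torch.nn as nn

class LayerAttentionBiLSTM(nn.Module):
    """Sketch: attend over the last k BERT encoder layers to build the
    output feature vector, then feed the result to a BiLSTM classifier.
    The weighting scheme and dimensions are illustrative assumptions."""

    def __init__(self, hidden=768, k=3, lstm_hidden=256, num_classes=2):
        super().__init__()
        self.k = k
        # one learnable score per selected layer; softmax turns the
        # scores into attention weights over the k layers
        self.layer_scores = nn.Parameter(torch.zeros(k))
        self.bilstm = nn.LSTM(hidden, lstm_hidden, batch_first=True,
                              bidirectional=True)
        self.fc = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, hidden_states):
        # hidden_states: list of per-layer tensors, each (B, T, H)
        last_k = torch.stack(hidden_states[-self.k:], dim=0)   # (k, B, T, H)
        weights = torch.softmax(self.layer_scores, dim=0)      # (k,)
        fused = (weights.view(-1, 1, 1, 1) * last_k).sum(dim=0)  # (B, T, H)
        out, _ = self.bilstm(fused)        # (B, T, 2 * lstm_hidden)
        return self.fc(out[:, -1, :])      # classify from last time step

# dummy stand-in for BERT's 12 encoder outputs (plus embedding layer)
states = [torch.randn(4, 16, 768) for _ in range(13)]
model = LayerAttentionBiLSTM()
print(model(states).shape)  # torch.Size([4, 2])
```

In a real pipeline the `states` list would be replaced by the hidden-state tuple returned by BERT, and the logits would feed a cross-entropy loss.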
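The FGM procedure in (2) can be sketched as a small wrapper that, after the clean backward pass, pushes the embedding weights one step along the normalized gradient, runs a second forward/backward pass on the perturbed model, and restores the original embeddings before the optimizer step. The tiny classifier, parameter names, and epsilon value below are illustrative assumptions, not the thesis code.

```python
import torch
import torch.nn as nn

class FGM:
    """Fast Gradient Method sketch: perturb embedding weights along the
    gradient direction (one step), then restore them after the
    adversarial backward pass."""

    def __init__(self, model, emb_name="embedding", epsilon=1.0):
        self.model = model
        self.emb_name = emb_name  # substring identifying embedding params
        self.epsilon = epsilon
        self.backup = {}

    def attack(self):
        for name, param in self.model.named_parameters():
            if param.requires_grad and self.emb_name in name and param.grad is not None:
                self.backup[name] = param.data.clone()   # save originals
                norm = torch.norm(param.grad)
                if norm != 0:
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]           # undo perturbation
        self.backup = {}

class TinyClassifier(nn.Module):
    """Toy stand-in for BERT-BiLSTM, kept small so the sketch runs."""
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(100, 8)
        self.fc = nn.Linear(8, 2)
    def forward(self, x):
        return self.fc(self.embedding(x).mean(dim=1))

model = TinyClassifier()
fgm = FGM(model, emb_name="embedding")
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
x = torch.randint(0, 100, (4, 5))
y = torch.randint(0, 2, (4,))

loss_fn(model(x), y).backward()   # gradients for the clean batch
fgm.attack()                      # perturb embeddings along the gradient
loss_fn(model(x), y).backward()   # accumulate adversarial gradients
fgm.restore()                     # put the original embeddings back
opt.step()
opt.zero_grad()
```

PGD follows the same attack/restore pattern but repeats a smaller step several times, projecting the accumulated perturbation back into an epsilon-ball after each step.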
Keywords/Search Tags:Transformer, BERT, adversarial training, sentiment analysis