
Chinese Text Sentiment Analysis Method Based On Text Data Enhancement And ELECTRA Language Model

Posted on: 2023-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: H B Yu
Full Text: PDF
GTID: 2568306815968509
Subject: Computer Science and Technology
Abstract/Summary:
Sentiment analysis of Chinese text is one of the important foundations of data mining. It aims to automatically determine the attitude of opinion holders toward a given topic in a text. Sentiment analysis of Chinese online review text can be applied to public opinion monitoring, topic supervision, word-of-mouth analysis, and other scenarios. Two problems arise in this setting. First, Chinese online reviews are flexible in expression and semantically complex, which makes it difficult to extract highly discriminative sentiment features. Second, the corpus suffers from class imbalance: negative samples far outnumber positive samples, which biases model training. To address these two problems, this thesis proposes a sentiment classification model based on text data enhancement and the ELECTRA language model. The main research work is as follows:

(1) To address the difficulty of extracting highly discriminative sentiment features from Chinese online review text, this thesis proposes the EaBiLSTM model, which strengthens the sentiment feature extraction process. Following the currently popular transfer learning approach, the model reinforces the learning of sentiment features in both the embedding layer and the training layer. First, in the embedding layer, text features are extracted by the ELECTRA model. Then, in the training layer, an attention mechanism and a BiLSTM extract sentiment features and capture the relevant semantic relationships. Finally, a Softmax classifier performs classification in the classification layer. The experiments compare the characteristics of the ELECTRA pre-trained language model with those of the BERT model, and show that the EaBiLSTM model can enhance the extraction of sentiment features from Chinese online reviews.

(2) To address the imbalance of model training in the imbalanced scenario, this thesis proposes the EDA-EaBiLSTM model, built on the EaBiLSTM model. The model introduces more prior information into training through text data enhancement. First, for the imbalanced corpus, part of the data is augmented with the EDA text data enhancement technique to balance the corpus (the first introduction of prior information). Then the combined model (based on ELECTRA) is trained iteratively on the enhanced corpus to extract sentiment features (the second introduction of prior information). Finally, classification is performed through a fully connected layer and a Softmax classifier. Compared with methods that use only model tuning or only text expansion, the experiments show that introducing prior information twice yields a larger gain in F1, and so better addresses the problem of model training imbalance. In addition, the experiments compare the effects of the full enhancement strategy and the partial enhancement strategy when combined with different models. The mean F1 value is used as the evaluation criterion to study the "replacement cost" of generated text relative to real text in training.

The main innovations and contributions of this thesis are as follows. For the overly free-form text of online reviews, this thesis proposes a new method that strengthens sentiment feature extraction in the embedding layer and the training layer. This method can improve the accuracy of the sentiment classification model. To address model training imbalance, this thesis proposes the EDA-EaBiLSTM model under the idea of "introducing prior information twice". This thesis also explores the application patterns of pre-trained language models in transfer learning, and uses EDA as an entry point for a more in-depth study of text data enhancement.

Figure [25] Table [11] Reference [84]...
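The partial enhancement step in (2), where only part of the imbalanced corpus is augmented with EDA until the classes are balanced, might look roughly like the sketch below. It is an illustration under stated assumptions, not the thesis's implementation. It assumes the text is already segmented into word tokens and that a synonym table is available. The `SYNONYMS` dictionary, the function names, and the toy corpus are all hypothetical. The four operations (synonym replacement, random insertion, random swap, random deletion) are the standard EDA (Easy Data Augmentation) operations. The abstract does not state the operation mix or rates used in the thesis.

```python
import random
from collections import Counter

# Hypothetical toy synonym table; a real system would use a Chinese thesaurus.
SYNONYMS = {"好": ["棒", "不错"], "喜欢": ["喜爱"], "差": ["糟糕"]}


def synonym_replace(tokens, rng):
    """Replace one randomly chosen token that has an entry in SYNONYMS."""
    candidates = [i for i, t in enumerate(tokens) if t in SYNONYMS]
    if not candidates:
        return list(tokens)
    i = rng.choice(candidates)
    out = list(tokens)
    out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insert(tokens, rng):
    """Insert a synonym of a randomly chosen token at a random position."""
    candidates = [t for t in tokens if t in SYNONYMS]
    if not candidates:
        return list(tokens)
    word = rng.choice(SYNONYMS[rng.choice(candidates)])
    out = list(tokens)
    out.insert(rng.randrange(len(out) + 1), word)
    return out


def random_swap(tokens, rng):
    """Swap the tokens at two randomly chosen positions."""
    out = list(tokens)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, rng, p=0.1):
    """Drop each token with probability p, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() > p]
    return kept or [rng.choice(tokens)]


def eda(tokens, rng):
    """Apply one randomly chosen EDA operation to a token list."""
    op = rng.choice([synonym_replace, random_insert, random_swap, random_delete])
    return op(tokens, rng)


def balance_with_eda(samples, seed=0):
    """Augment only the smaller classes until each matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(label for _, label in samples)
    target = max(counts.values())
    balanced = list(samples)  # real samples are kept unchanged
    for label, n in counts.items():
        pool = [tokens for tokens, lab in samples if lab == label]
        for _ in range(target - n):
            balanced.append((eda(rng.choice(pool), rng), label))
    return balanced


# Toy imbalanced corpus: negative reviews outnumber positive ones.
corpus = [
    (["物流", "太", "差"], "neg"),
    (["质量", "差"], "neg"),
    (["客服", "态度", "差"], "neg"),
    (["很", "喜欢", "这个", "手机"], "pos"),
]
balanced = balance_with_eda(corpus)
print(Counter(label for _, label in balanced))
```

Only the under-represented class receives generated samples here, and the real samples are left untouched. That is one reading of "partial enhancement"; a full enhancement strategy would instead augment every class.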
Keywords/Search Tags: ELECTRA pre-trained model, text sentiment classification, text data enhancement, attention mechanism, BiLSTM