
Research And System Construction Of Cross-domain Sentiment Classification Method Based On Generative Adversarial Networks

Posted on: 2022-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Huang
GTID: 2518306311953649
Subject: Software engineering
Abstract/Summary:
With the arrival of Web 3.0, more and more users are active on the Internet. Analyzing the sentiment polarity of the texts they publish (e.g., on Weibo, Taobao, and blogs) helps companies understand user preferences, helps governments understand the attitudes of Internet users, and helps individuals make better decisions. Sentiment polarity analysis of text has therefore attracted growing attention from researchers in natural language processing, artificial intelligence, and related fields.

When the text comes from a single domain, a sentiment classifier can be trained with machine learning. However, the text posted by users on the Internet spans many domains, and the data distribution of samples differs from one domain to another. In addition, new domains keep appearing on the Internet; their data are unlabeled, and labeling data is expensive and time-consuming. As a result, a classifier trained on a single domain with sufficient data is difficult to apply directly to other domains. Cross-domain sentiment polarity classification methods based on transfer learning theory have emerged to address this problem. Such methods can effectively reduce the difference in data distribution between domains, play an important role in training a sentiment classifier that applies to other domains, and are of great research and application value.

The differences in data distribution between domains stem from sparsity, polysemy, feature divergence, and polarity divergence. To reduce these differences and realize transfer between domains, this thesis mainly uses feature-representation transfer. The main research work and innovations of the thesis are as follows:

(1) This thesis proposes a cross-domain sentiment classification method based on BERT word embedding. The method uses a modified mutual information method to select domain-independent features and uses spectral clustering for feature alignment. In both steps, it takes contextual information into account through BERT word embeddings, which reduces the impact of sentiment words with polarity divergence. It then augments the source domain features with the features learned by feature alignment and generates unified clusters to reduce the distribution difference between the source domain and the target domain. Experimental results show that the proposed method is effective for cross-domain sentiment classification. Compared with the WEEF model and the DANN model, the average accuracy of the proposed method increases by 2.59% and 1.46%, respectively, and the F1-score increases by 2.03% and 1.02%.

(2) This thesis also proposes a cross-domain sentiment classification model based on Generative Adversarial Networks and an improved text convolutional neural network. The model uses BERT to represent the source and target domain texts, and uses the improved text convolutional neural network to further process the word vector matrix produced by BERT and extract the key information in the text. The source domain features produced by BERT and the improved text convolutional neural network are fed to the sentiment polarity classifier, which is trained to minimize the sentiment polarity classification loss. The features produced by BERT and the improved text convolutional neural network are also fed to the domain classifier so as to maximize the domain classifier's loss. A gradient reversal layer makes the features produced by the improved text convolutional neural network domain-independent while keeping the sentiment categories distinguishable, which reduces the difference in data distribution between domains and realizes transfer between them. Experimental results show that the proposed model can effectively classify sentiment polarity in a new domain. Compared with mainstream models, the proposed model improves the accuracy of cross-domain sentiment classification.

(3) To improve the application value of the cross-domain sentiment classification method, a cross-domain sentiment classification system based on the improved BCN model is constructed. The system can label the sentiment polarity of data in an existing domain, and it can also label the sentiment polarity of data in a new domain with the help of data from an existing domain. To improve the maintainability and scalability of the system, a three-step re-architecture method is used to analyze the system's architecture, which further promotes the application value of the cross-domain sentiment classification method.

For cross-domain text sentiment classification, the WE-BERT and improved BCN models proposed in this thesis reduce the difference in data distribution between domains and improve classification performance. In addition, the cross-domain sentiment classification system built on the improved BCN model makes it convenient for users to analyze the sentiment polarity of text data and improves the application value of the cross-domain sentiment classification method.
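
The adversarial step in contribution (2) relies on a gradient reversal (inversion) layer between the feature extractor and the domain classifier. The following is a minimal sketch of that idea only, not the thesis's implementation: the scalar "feature extractor" `w * x`, the logistic domain classifier, and the names `GradientReversal` and `domain_step` are all hypothetical stand-ins for the BERT and text-CNN components, and the gradients are written out by hand so that the sign flip is visible.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class GradientReversal:
    """Identity in the forward pass; multiplies the gradient by -lam going back."""

    def __init__(self, lam: float = 1.0) -> None:
        self.lam = lam  # assumed trade-off weight, not taken from the thesis

    def forward(self, x: float) -> float:
        return x

    def backward(self, grad_output: float) -> float:
        return -self.lam * grad_output


def domain_step(w, u, x, d, grl, lr=0.1):
    """One manual SGD step on the domain branch for a single example.

    w: toy feature-extractor weight, u: toy domain-classifier weight,
    x: input value, d: domain label (0 = source, 1 = target).
    """
    feature = w * x                       # toy feature extractor
    h = grl.forward(feature)              # unchanged on the way forward
    p = sigmoid(u * h)                    # domain classifier prediction
    loss = -(d * math.log(p) + (1 - d) * math.log(1 - p))

    dloss_dlogit = p - d                  # gradient of cross-entropy w.r.t. the logit
    grad_u = dloss_dlogit * h             # domain classifier descends on its loss
    grad_h = dloss_dlogit * u
    grad_feature = grl.backward(grad_h)   # sign flipped for the feature extractor
    grad_w = grad_feature * x

    return w - lr * grad_w, u - lr * grad_u, loss


grl = GradientReversal(lam=0.5)
new_w, new_u, loss = domain_step(w=1.0, u=1.0, x=1.0, d=1, grl=grl)
```

Because of the flipped sign, the same update that trains the domain classifier to separate the domains pushes the feature extractor toward features the domain classifier cannot separate. In a full model, the sentiment classifier's gradient would reach the feature extractor unchanged.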
Keywords/Search Tags:Text sentiment polarity classification, Cross-domain, Data distribution divergence, Feature alignment, Domain adversarial