The Research On Domain Adaptation Method Based On Autoencoders

Posted on: 2022-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: H Ding
Full Text: PDF
GTID: 2518306560954809
Subject: Computer technology
Abstract/Summary:
Because domain adaptation methods can effectively address two problems of traditional machine learning, namely the need to obtain a large number of labels and the need to retrain models, they have become a hot research topic in recent years. Domain adaptation methods train a high-confidence classifier for the target domain using latent information from the source domain. Due to their robust feature representation capabilities, autoencoders have become one of the most widely used models in domain adaptation tasks, and they have produced excellent results. Although domain adaptation methods based on autoencoders generalize well, they still have some limitations in practical applications. For example, marginalized Denoising Autoencoders (mDA) can be used to learn new feature spaces, but they use only a linear function to obtain the latent feature representation of the data, which makes it difficult to capture the nonlinear relationship between the source domain and target domain data. In addition, current domain adaptation methods perform unsatisfactorily when the distribution discrepancy between the source domain and the target domain is too large, which remains a significant challenge. This paper focuses on the classification of text data using the autoencoder model:

(1) mDA uses a linear function for training and relies only on a nonlinear mapping during feature extraction to obtain the data's nonlinear relationship, so the captured features do not accurately represent the data's nonlinearity. This paper proposes Nonlinear cross-domain Feature learning based Dual Constraints (NFDC). When learning the feature representation of the data, this approach uses a kernel function to capture the nonlinear relationship between the source domain and the target domain data. First, we introduce Maximum Mean Discrepancy (MMD), which measures the distance between the source domain and the target domain during training, to further reduce the distribution discrepancy between domains. Second, we apply Manifold Regularization (MR), a widely used regularization method, to preserve the geometric structure of the data, so that the data remain reasonably invariant in spatial location after feature mapping. Finally, the final classifier is constructed from the feature space created by the feature representation and used to classify the target domain. In domain adaptation tasks, the experimental results indicate that this approach outperforms the baseline algorithm.

(2) Autoencoder Representation learning guided by Co-training (ARCT) is proposed to address the problem that a wide distribution discrepancy between the source domain and the target domain leads to unsatisfactory domain adaptation performance. To obtain domain-invariant features and domain-specific features, this approach employs Central Mean Discrepancy (CMD). Two different classifiers are obtained from these two kinds of features, and the two classifiers then undergo Co-training. Co-training generates pseudo-labels for the target domain, and the pseudo-labels are then used to construct a new classifier that classifies the target domain. In domain adaptation tasks, the experimental results indicate that this approach outperforms the baseline algorithm.
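
To make the MMD constraint mentioned in (1) more concrete, the following is a minimal sketch and not the thesis's implementation. The abstract does not say which kernel is used, so a Gaussian (RBF) kernel with a hypothetical bandwidth parameter `gamma` is assumed here, together with the biased empirical estimate of squared MMD. The function names and the toy data are illustrative only.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(A, B, gamma):
    """Average kernel value over all pairs (a, b) with a in A and b in B."""
    total = sum(rbf_kernel(a, b, gamma) for a in A for b in B)
    return total / (len(A) * len(B))


def mmd_squared(source, target, gamma=1.0):
    """Biased empirical estimate of squared MMD between two samples.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 * E[k(s, t)]
    """
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


# Toy 2-D "features": one target sample near the source, one far from it.
source = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
near_target = [[0.1, 0.1], [0.0, 0.2], [0.2, 0.1]]
far_target = [[3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]

print("MMD^2 (near):", mmd_squared(source, near_target))
print("MMD^2 (far): ", mmd_squared(source, far_target))
```

In this sketch the value for the distant target sample is larger than for the nearby one, which is the behaviour a training objective would exploit: adding such a term to the loss penalizes feature mappings under which the source and target samples look different.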
Keywords/Search Tags: Domain Adaptation, Autoencoder, Feature representation