
Text Classification Based On Deep Transfer Learning

Posted on: 2020-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Xi
GTID: 2428330599975631
Subject: Computer Science and Technology
Abstract/Summary:
Text classification based on deep learning is a supervised learning task that relies heavily on labeled data. High-quality, large-scale labeled datasets are often difficult to obtain, and the cost of manual annotation is high. To address this problem, this thesis studies the application of transfer learning, reducing dependence on labeled data in the target domain by exploiting unlabeled data and labeled data from related domains. Improvements to an LSTM text classification model are proposed for both task-based and domain-based transfer learning. The specific research work is as follows:

(1) The LSTM text classification model and transfer learning techniques are introduced. The structure of the seq2seq autoencoder is analyzed, along with the advantage of using it as the source task in a deep transfer learning method. Cross-domain transfer learning methods are studied as well.

(2) Because the seq2seq autoencoder is insufficient at capturing representational features, adversarial perturbations are added to the model's embedding layer, which forces the autoencoder to genuinely reconstruct the text in an unsupervised setting rather than simply copy its input. In addition, a Bi-LSTM network is applied in the encoder, further improving the ability to capture semantic features. The resulting autoencoder is named AdvSA. Experimental results show that when AdvSA is used as the pretrained model for the LSTM classifier, classification accuracy reaches 92.98% on the IMDB dataset and 82.57% on the Rotten Tomatoes dataset.

(3) To further reduce the text classification model's dependence on labeled data, the AM-AdpLSTM text classification model, based on cross-domain transfer learning, is proposed. The model learns patterns in related domains and transfers them to the target domain. By adding an adaptive layer to the model, the transfer loss between the source and target domains is reduced, so the model need not be rebuilt when the data distribution changes. In addition, an attention mechanism establishes a filtering mechanism between domains, so that the model's attention on the source domain is concentrated on those partitions with higher similarity to the target domain. On the Rotten Tomatoes dataset, AM-AdpLSTM improves classification accuracy by about 7 percentage points over the LSTM baseline, and the gap between AM-AdpLSTM and LSTM widens as the amount of labeled data decreases.
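The adversarial perturbation applied to the embedding layer in AdvSA can be illustrated with a minimal numpy sketch. The thesis does not give its exact formulation, so this follows the common FGSM-style scheme used in adversarial training for text: perturb the embeddings by an epsilon-scaled, gradient-direction step that increases the reconstruction loss. The function name and `epsilon` value are illustrative assumptions, and the gradient is assumed to come from backpropagation through the autoencoder.

```python
import numpy as np

def adversarial_perturbation(embeddings, grad, epsilon=0.02):
    """FGSM-style perturbation on an embedding matrix (hypothetical sketch).

    embeddings: (tokens, dim) embedding matrix
    grad:       gradient of the reconstruction loss w.r.t. the embeddings
                (assumed to be supplied by backprop; not computed here)
    Returns the embeddings moved a step of L2-norm `epsilon` along the
    direction that increases the loss, making trivial copying harder.
    """
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        # no gradient signal: leave the embeddings unchanged
        return embeddings
    r_adv = epsilon * grad / norm      # normalized adversarial step
    return embeddings + r_adv
```

During training, the autoencoder would be asked to reconstruct the original text from these perturbed embeddings, which is what prevents it from merely duplicating the input.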
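The inter-domain filtering mechanism in AM-AdpLSTM can likewise be sketched. The thesis does not specify the similarity measure, so this sketch assumes cosine similarity between each source-domain partition representation and a target-domain representation, normalized by a softmax into attention weights; all names here are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def domain_attention(source_parts, target_repr):
    """Attention-based filtering between domains (illustrative sketch).

    source_parts: list of vectors, one representation per source partition
    target_repr:  vector representing the target domain
    Returns the attention weights and the weighted source summary, so
    partitions more similar to the target domain contribute more.
    """
    sims = np.array([
        p @ target_repr / (np.linalg.norm(p) * np.linalg.norm(target_repr))
        for p in source_parts
    ])
    weights = softmax(sims)                      # sums to 1
    summary = weights @ np.stack(source_parts)   # attended source summary
    return weights, summary
```

In the full model, such a summary would feed the adaptive layer, concentrating transfer on the most target-like portions of the source domain.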
Keywords/Search Tags:text classification, transfer learning, autoencoder, adversarial training, attention mechanism