Research On Text Style Transfer Based On Delete-Retrieve-Generate Framework

Posted on:2023-04-12

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Fei

GTID:2568306836476614

Subject:Computer technology

Abstract/Summary:

With the continuous development of information technology, a large amount of text data has appeared on Internet platforms such as e-commerce websites and social media, and its volume is growing exponentially. In recent years, Text Style Transfer (TST) has attracted extensive interest from researchers and has become one of the most popular topics in natural language processing. The task aims to change a specific style attribute of a text (e.g., sentiment, tense, gender) by editing or generation while preserving the text content. TST research faces several challenges: 1) parallel corpora are scarce, which rules out the robust sequence-to-sequence models that rely heavily on parallel training data; 2) content and style are often entangled in text, so it is difficult to separate their representations in the latent space; 3) because TST is a relatively new research topic, there are currently no unified evaluation metrics. To address these issues, this thesis carries out the following work.

First, this thesis systematically reviews existing TST methods. According to whether the training data are parallel, existing methods are divided into three categories: supervised-learning-based methods, unsupervised-learning-based methods, and semi-supervised-learning-based methods. Among them, unsupervised-learning-based methods are the mainstream and can be further divided into implicit and explicit methods. Implicit methods aim to learn and distinguish latent representations of text content and text style, and include disentanglement strategies, back-translation strategies, and pseudo-parallel corpus strategies. Explicit methods aim to separate text content and text style in the text itself, and include deletion strategies based on word frequency, deletion strategies based on the attention mechanism, and deletion strategies that combine word frequency with the attention mechanism.

Second, to better separate text content from text style, this thesis proposes an attribute word deletion strategy based on probability variation, built on a delete-retrieve-generate (DRG) TST framework. First, attribute words are located and deleted according to the change in the sentence style probability predicted by a style classifier, which leaves the text content. Then, sentences similar to the text content are retrieved from the target corpus by computing their cosine distance. Finally, the target attribute words are extracted from the retrieved sentence and combined with the text content to generate the target sentence. The experimental results show that the method outperforms the baselines in content preservation and also performs well on transfer accuracy and text fluency.

Third, to guide the model toward more fluent target sentences, this thesis combines the DRG framework with reinforcement learning, introducing a fluency reward in addition to the existing transfer accuracy reward and content preservation reward. The method consists of three modules: a neutralization module, a generation module, and a reward module. The neutralization module explicitly filters out attribute words to extract the text content, the generation module combines target attribute words with the text content to generate the target sentence, and the reward module further optimizes the neutralization and generation modules. The experimental results show that the method achieves higher fluency than the baselines while balancing transfer accuracy and content preservation well.
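The delete and retrieve steps of the probability-variation strategy described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the thesis's implementation: the toy lexicon `TOY_WEIGHTS` and `prob_positive` stand in for a trained style classifier, the `threshold` value is arbitrary, and bag-of-words cosine similarity stands in for whatever sentence representation the thesis actually uses. Maximizing cosine similarity here is equivalent to minimizing cosine distance.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for a trained style classifier.
TOY_WEIGHTS = {"great": 2.0, "delicious": 2.0, "terrible": -2.0, "bland": -2.0}


def prob_positive(tokens):
    """Toy classifier: logistic function over the summed word weights."""
    score = sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def delete_attribute_words(tokens, style_is_positive, threshold=0.2):
    """Split tokens into (content, attribute words).

    A token counts as an attribute word when removing it lowers the
    classifier's probability of the sentence's own style by more than
    `threshold` (an assumed, untuned value).
    """
    def p_style(ts):
        p = prob_positive(ts)
        return p if style_is_positive else 1.0 - p

    base = p_style(tokens)
    content, attributes = [], []
    for i, tok in enumerate(tokens):
        without = tokens[:i] + tokens[i + 1:]
        if base - p_style(without) > threshold:
            attributes.append(tok)
        else:
            content.append(tok)
    return content, attributes


def cosine_similarity(a, b):
    """Cosine similarity between two token lists as bag-of-words vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def retrieve(content, target_corpus, target_is_positive):
    """Return the target sentence whose content is closest, plus its attribute words."""
    def similarity(sentence):
        target_content, _ = delete_attribute_words(sentence, target_is_positive)
        return cosine_similarity(content, target_content)

    best = max(target_corpus, key=similarity)
    _, target_attributes = delete_attribute_words(best, target_is_positive)
    return best, target_attributes


source = "the food was terrible".split()
content, removed = delete_attribute_words(source, style_is_positive=False)

corpus = ["the food was delicious".split(), "great service and staff".split()]
best, target_attributes = retrieve(content, corpus, target_is_positive=True)
```

In this toy run, `content` keeps the style-neutral words of the source sentence, and `target_attributes` holds the words a generation step would merge with that content. The generation step itself, and the reinforcement learning rewards, are outside the scope of this sketch.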

Keywords/Search Tags:

Text Style Transfer, Natural Language Processing, Style Classifier, Reinforcement Learning

Related items

1	Research On Text Style Transfer Based On Seq2seq Framework
2	Text Style Analysis For WeChat Articles
3	Research And Implementation Of Image Style Transfer Algorithm Based On Deep Learning
4	Research On Feature Space Backdoor Attack Methods For Natural Language Processing Models
5	Research On Non-parallel Text Style Transfer Based On Deep Learning
6	Research On Text Style Transfer With Deep Learning
7	Research On Style Transfer Technology Based On Deep Learning
8	Text Style Transfer Based On Prompt Learning
9	Research On The Style Transfer Of Landscape Paintings Based On Deep Learning
10	Chinese Text Style Transfer Based On Cross-Alignment