
Research On Non-parallel Text Style Transfer Based On Deep Learning

Posted on: 2022-03-16    Degree: Master    Type: Thesis
Country: China    Candidate: M X Hu    Full Text: PDF
GTID: 2518306335997699    Subject: Automation Technology
Abstract/Summary:
Natural language processing (NLP) is one of the main research directions in artificial intelligence and has attracted extensive attention from researchers. In recent years, NLP has made great breakthroughs in machine translation, information retrieval, text classification, and text generation. Among these, text style transfer, which controls a specific attribute of generated text, has been widely discussed and studied since 2017 and has become a research hotspot in text generation and NLP. Currently, parallel style-transfer corpora are quite limited, while non-parallel corpora cannot provide sentence pairs that share the same content but differ in style, so a style mapping cannot be trained directly from aligned pairs. Moreover, most existing work on non-parallel text style transfer focuses on improving the accuracy of style transfer while neglecting the preservation of sentence content, and the content of longer sentences is especially easy to lose during transfer. This thesis makes two contributions to address these problems.

1. To address the loss of original sentence content, we first train a content discriminator that uses bag-of-words features as content representation vectors to extract the content vectors of both the original and the transferred sentences; we then build a generative adversarial network with gradient penalty (WGAN-gp) that forces the content information contained in these two vectors to stay consistent (a minimal sketch of this mechanism follows the abstract). In our experiments, both automatic and manual evaluations show that the resulting framework, TWPst, which contains the WGAN-gp-based content preservation model, is more effective than current mainstream frameworks.

2. To address the low feature-learning efficiency on non-parallel and limited corpora, the second framework, DAAst, introduces domain adaptation learning to explore whether data from a source domain (another domain) benefits feature learning in the target domain. On top of the encoding process, we add an adversarial network that excludes unwanted original style information from the latent (semantic) vectors of sentences (sketched below). When generating the style information of target sentences, DAAst exploits the attention mechanism of a self-attention model, which assigns relevance weights to candidate words with respect to the target style, thereby helping the generator concentrate on generating higher-weighted words. Experiments and analysis show that the second framework clearly outperforms previous frameworks in preserving content information, and even when the training data of the target domain is reduced, DAAst can still effectively retain the original content in the transferred sentences. (The code of the second framework has been published at https://github.com/mingxuan007/text-style-transfer-with-adversarial-network-and-domain-adaptation.)
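To make the content-preservation objective in contribution 1 concrete, the following is a minimal PyTorch sketch of the standard WGAN-gp gradient penalty applied to bag-of-words content vectors. The critic architecture, tensor shapes, and names (ContentCritic, gradient_penalty) are illustrative assumptions, not the thesis's actual code.

```python
import torch
import torch.nn as nn

class ContentCritic(nn.Module):
    """Critic that scores a bag-of-words content vector
    (illustrative architecture, not the network from the thesis)."""
    def __init__(self, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1),
        )

    def forward(self, bow: torch.Tensor) -> torch.Tensor:
        return self.net(bow)

def gradient_penalty(critic, real_bow, fake_bow, lambda_gp=10.0):
    """Standard WGAN-gp penalty: the critic's gradient norm on random
    interpolates between real and generated content vectors is pushed to 1."""
    alpha = torch.rand(real_bow.size(0), 1, device=real_bow.device)
    interp = (alpha * real_bow + (1 - alpha) * fake_bow).requires_grad_(True)
    scores = critic(interp)
    grads = torch.autograd.grad(
        outputs=scores, inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0]
    return lambda_gp * ((grads.norm(2, dim=1) - 1) ** 2).mean()

# Toy usage: bag-of-words vectors of original vs. transferred sentences.
vocab_size = 1000
critic = ContentCritic(vocab_size)
real = torch.rand(8, vocab_size)  # content vectors of original sentences
fake = torch.rand(8, vocab_size)  # content vectors of transferred sentences
critic_loss = (critic(fake).mean() - critic(real).mean()
               + gradient_penalty(critic, real, fake))
critic_loss.backward()
```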
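For contribution 2, one common way to realize an adversarial network that strips the original style from latent vectors is a gradient-reversal layer feeding a style classifier; the thesis may use a different adversarial scheme, so the sketch below (StyleAdversary, GradReverse, all shapes) is an assumption-laden illustration rather than the published implementation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in backward,
    so the encoder is trained to *fool* the style classifier."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

class StyleAdversary(nn.Module):
    """Classifier that tries to recover the original style from the latent
    vector; the reversed gradient pushes the encoder to remove that style
    information from the latent representation."""
    def __init__(self, latent_dim: int, n_styles: int = 2, lamb: float = 1.0):
        super().__init__()
        self.lamb = lamb
        self.clf = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, n_styles)
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.clf(GradReverse.apply(z, self.lamb))

# Toy usage with made-up shapes.
z = torch.randn(8, 64, requires_grad=True)  # encoder latent vectors
labels = torch.randint(0, 2, (8,))          # original style labels
adv = StyleAdversary(latent_dim=64)
loss = nn.functional.cross_entropy(adv(z), labels)
loss.backward()  # encoder receives the reversed (adversarial) gradient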
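Finally, the relevance weighting that DAAst borrows from self-attention is, in essence, standard scaled dot-product attention; a minimal sketch follows, with shapes chosen purely for illustration.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Standard attention weighting; the thesis uses such relevance
    weights to bias generation toward words tied to the target style."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

x = torch.randn(2, 5, 32)  # a batch of 5 token representations, dim 32
out, w = scaled_dot_product_attention(x, x, x)  # self-attention over x
```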
Keywords/Search Tags: text style transfer, Text CNN, WGAN-gp, adversarial network, attention mechanism