
Research On Text Generation Technology For Structured Data

Posted on: 2021-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Chen
GTID: 2428330611999615
Subject: Software engineering
Abstract/Summary:
Text generation for structured data is a frontier research task in natural language generation: given structured data, the goal is to generate text that describes it. With the spread of information technology, Internet data is growing explosively. Writing the corresponding text by hand takes considerable time and manpower, whereas text generation for structured data can substantially improve production capacity and efficiency. However, research on this task remains limited and the results achieved so far are insufficient, so further work has both research value and practical significance. This thesis focuses on text generation for structured data and addresses three sub-topics.

First, for the representation of numerical values, we propose a pretraining module that better distinguishes the representation of numbers from that of text. Drawing on the characteristics of text generation for structured data, we design a pretraining task in which part of the data is randomly masked and the model must generate and evaluate an equation over the data in order to recover the masked values. This task effectively improves the model's ability to capture the common sense, logic, and grammar of the data itself, and shows clear gains in experiments.

Second, to further improve the model's numerical extension and reasoning ability, the data-to-text generation module based on numerical extension and reasoning introduces a multi-task learning mechanism. An equation decoder for inferring numbers is added to the original encoder-decoder model, embedded in the original (text) decoder, and activated by a "reasoning button". During text generation, if the current number needs to be inferred, the reasoning button is triggered, control passes to the equation decoder, and the generated equation is evaluated and its result returned to the text decoder. In addition, reinforcement learning is introduced, with exploration and rewards based on existing trends in the numerical features, which effectively improves the accuracy of the inferred numbers.

Third, to improve the model's ability to recognize and select key data, we propose an adversarial model based on knowledge distillation. In the "teacher network", a label indicating whether each data item appears in the generated text is explicitly added, so that table data originally expressed as triples becomes quadruples. A discriminator is introduced to guide knowledge distillation, and the "student network" learns from the "teacher network" the encoding ability needed to select key data. Through knowledge distillation, the "student network" becomes better at selecting key data, which effectively improves the quality of the generated text.
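The triple-to-quadruple labeling used for the "teacher network" can be illustrated with a minimal sketch. The abstract does not specify the record layout or how the labels are derived, so the `(entity, attribute, value)` layout, the function name `label_records`, and the whole-token matching rule below are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumed record layout; the abstract only says "triple" and "quadruple".
Triple = Tuple[str, str, str]          # (entity, attribute, value)
Quadruple = Tuple[str, str, str, int]  # triple plus a 0/1 "appears in text" label


def label_records(records: List[Triple], reference_text: str) -> List[Quadruple]:
    """Append a 0/1 label: does the record's value occur as a token in the text?

    Whole-token matching is a simplifying assumption, not the thesis's method.
    """
    # Strip surrounding punctuation so "98." matches the value "98".
    tokens = {tok.strip(".,;:!?") for tok in reference_text.split()}
    return [(e, a, v, int(v in tokens)) for e, a, v in records]


# Hypothetical example data.
records = [
    ("Lakers", "points", "102"),
    ("Lakers", "rebounds", "44"),
    ("Celtics", "points", "98"),
]
text = "The Lakers beat the Celtics 102 to 98."
labeled = label_records(records, text)
```

Here the two point totals mentioned in the text receive label 1 and the unmentioned rebound count receives label 0; a teacher encoder could consume the fourth field as an extra input feature.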
Keywords/Search Tags: structured data, pretraining, data reasoning, reinforcement learning, knowledge distillation