Neural Network-based Inference Algorithm For Constrained Text Generation

Posted on: 2022-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: L Chen
GTID: 2518306551470544
Subject: Master of Engineering
Abstract/Summary:
Natural language is a defining trait of humans, and it has been studied throughout the ages. In recent years, the emergence of large-scale pre-trained models such as GPT-2 and BERT has brought unprecedented enthusiasm to research on natural language generation. Generation in practice is constrained: different genres and contexts impose their own constraints, so constrained text generation has become an unavoidable requirement in industry. Deep learning methods usually need large amounts of relevant data, which is tedious to collect and organize, and a small dataset is not enough for a neural network to converge. In addition, large-scale pre-trained models have difficulty generating domain-specific or lexically constrained text. Neural network models with hundreds of millions of parameters remain a black box and are not controllable; they may generate sensitive words or politically oriented statements, which is unacceptable in industrial applications.

An obvious way to address these problems is to use style transfer and fine-tune the model on specific data, which limits the generation to some extent, but this is not sufficient to satisfy strong constraints. In contrast, this thesis focuses on inference: it applies specific inference algorithms at decoding time to satisfy the constraints on the generated text. It achieves good results in two different instances, which shows that it is feasible to approach constrained text generation through inference algorithms. The main contributions are as follows:

1. The first Chinese palindrome poem generation model (CPPGM) is proposed. In the absence of any palindrome dataset, a language model and several Seq2Seq models trained on ordinary verses are combined with two Beam Search algorithms proposed in this thesis to generate Chinese palindrome poems. The generated verses are readable and poetic. In addition, the algorithm applies to other autoregressive models and extends well.

2. A new inference algorithm that combines BERT with Gibbs sampling is proposed for the setting in which the generated text must contain a specific word and must not contain sensitive words. The input is masked at decoding time to obtain the probability of the desired word occurring at each position. By contrast, commonly used methods either rely on prior knowledge or enumerate the cases in which the word appears at each position and then run Beam Search for inference in each case. Compared with these, the proposed method greatly reduces the search space and the time complexity. Experimental results show that the generated text is diverse and coherent.
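The idea behind the second contribution can be illustrated with a minimal sketch. It is not the thesis's implementation: a toy bigram model stands in for BERT's masked-token distribution, and the corpus, the function names (`build_bigram`, `masked_distribution`, `constrained_gibbs`), and the rule of pinning the required word at its most probable position are all assumptions made for illustration. The sketch masks each position in turn, scores the required word there, fixes it at the best position, and then resamples the remaining positions Gibbs-style while excluding banned words.

```python
import random
from collections import defaultdict

# Toy corpus (assumption): stands in for the data a real masked LM was trained on.
CORPUS = [
    "the moon rises over the quiet river",
    "the river flows under the bright moon",
    "a quiet wind moves over the river",
    "the bright moon lights the quiet night",
]
BOS, EOS = "<s>", "</s>"


def build_bigram(corpus):
    """Count bigrams and collect the vocabulary (boundary symbols excluded)."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for line in corpus:
        toks = [BOS] + line.split() + [EOS]
        vocab.update(toks[1:-1])
        for a, b in zip(toks, toks[1:]):
            counts[a][b] += 1
    return counts, sorted(vocab)


def bigram_prob(counts, vocab, a, b):
    """Add-one smoothed P(b | a) over the vocabulary plus the end symbol."""
    total = sum(counts[a].values())
    return (counts[a][b] + 1) / (total + len(vocab) + 1)


def masked_distribution(counts, vocab, tokens, i, banned):
    """Distribution over words for masked position i, given both neighbours.

    A real system would query BERT here; this bigram product is a stand-in.
    Banned words get no probability mass at all.
    """
    left = tokens[i - 1] if i > 0 else BOS
    right = tokens[i + 1] if i + 1 < len(tokens) else EOS
    scores = {
        w: bigram_prob(counts, vocab, left, w) * bigram_prob(counts, vocab, w, right)
        for w in vocab
        if w not in banned
    }
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}


def constrained_gibbs(keyword, length=6, sweeps=20, banned=frozenset(), seed=0):
    """Generate `length` tokens containing `keyword` and no banned word."""
    rng = random.Random(seed)
    counts, vocab = build_bigram(CORPUS)
    allowed = [w for w in vocab if w not in banned]
    tokens = [rng.choice(allowed) for _ in range(length)]

    # Mask each position in turn and score the keyword there; pin it at the best one.
    probs = [
        masked_distribution(counts, vocab, tokens, i, banned).get(keyword, 0.0)
        for i in range(length)
    ]
    pin = max(range(length), key=probs.__getitem__)
    tokens[pin] = keyword

    # Gibbs sweeps: resample every other position from its conditional distribution.
    for _ in range(sweeps):
        for i in range(length):
            if i == pin:
                continue
            dist = masked_distribution(counts, vocab, tokens, i, banned)
            words = list(dist)
            tokens[i] = rng.choices(words, weights=[dist[w] for w in words])[0]
    return tokens


if __name__ == "__main__":
    print(" ".join(constrained_gibbs("moon", banned={"night"})))
```

Because the required word is scored once per position instead of running a separate Beam Search for every possible placement, the sketch reflects why the abstract describes a smaller search space; the actual gain depends on the model and decoding details, which the abstract does not give.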
Keywords/Search Tags: Constrained Text, Textual Inference, Natural Language Generation, Beam Search