Natural Language Generation (NLG) is an important branch of computational linguistics and artificial intelligence. Existing language generation systems suffer from redundant structure and a high degree of human involvement, problems that urgently need to be solved. As the Generative Adversarial Network (GAN) attracts growing attention for sequence generation, and in order to give computers an expressive and writing ability closer to that of humans, goal-directed generation offers a new perspective: unlike the inverse process of linguistic analysis, it improves the controllability of the generated content. In existing policy-gradient algorithms for sequence generation, the reward signal provided by the real environment is poorly approximated; the data generated by the training model itself is treated as the output of an environment model and fed back as a reward in the computation. The discriminator therefore fails to provide a reward tied to the generation goal, and because the generator is trained by reinforcement learning, the language generation model in the adversarial network cannot be trained under the guidance of target words. The Goal-directed Sequence Generative Adversarial Network (G-SeqGAN) proposed in this paper combines the reinforcement learning method with a Generative Adversarial Network, reaching the goal through reward guidance together with a rollout-based computation, which further strengthens the controllability of the adversarial generation task. The model analyzes the environmental elements that are easily overlooked when applying reinforcement learning, summarizes their private state, offers a new analytical perspective on the method, and provides a new update expression for training the model; this new feedback strategy ensures that the feedback given by the environment is better estimated during training, thereby improving the training effect. Comparison experiments are carried out on synthetic data and real-world tasks. The experimental results show the superiority of the model in terms of practicality and robustness: they demonstrate the advantage of word-guided sequence generation and show that the simulated-feedback method outperforms the sequence generation adversarial network (SeqGAN) model.
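The rollout-based reward guidance described above follows the general SeqGAN-style policy-gradient scheme: each partial sequence is completed by Monte Carlo rollouts, the discriminator scores the completions, and the average score is used as the per-step reward for a REINFORCE update. The following is a minimal, self-contained sketch of that loop, not the paper's actual model: the toy generator (a single categorical distribution over tokens), the goal-token reward, and all hyperparameters are illustrative assumptions.

```python
import math
import random

random.seed(0)

VOCAB = 5      # toy vocabulary size (assumption)
SEQ_LEN = 4    # fixed sequence length (assumption)
GOAL = 3       # goal token the reward encourages (assumption)
ROLLOUTS = 16  # Monte Carlo completions per prefix
LR = 0.5

# Toy generator: one categorical distribution over tokens, shared across steps.
logits = [0.0] * VOCAB

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def sample_token():
    p = softmax(logits)
    r, acc = random.random(), 0.0
    for i, pi in enumerate(p):
        acc += pi
        if r <= acc:
            return i
    return VOCAB - 1

def discriminator(seq):
    # Stand-in for the discriminator's reward: 1 if the goal token appears.
    return 1.0 if GOAL in seq else 0.0

def rollout_reward(prefix):
    # Monte Carlo rollout: complete the prefix many times, average the reward.
    total = 0.0
    for _ in range(ROLLOUTS):
        seq = list(prefix)
        while len(seq) < SEQ_LEN:
            seq.append(sample_token())
        total += discriminator(seq)
    return total / ROLLOUTS

def train_step():
    seq = []
    grads = [0.0] * VOCAB
    for _ in range(SEQ_LEN):
        tok = sample_token()
        seq.append(tok)
        r = rollout_reward(seq)  # per-step reward estimated via rollouts
        p = softmax(logits)
        # REINFORCE score function: d log pi(tok) / d logit_k = 1[k==tok] - p_k
        for k in range(VOCAB):
            grads[k] += r * ((1.0 if k == tok else 0.0) - p[k])
    for k in range(VOCAB):
        logits[k] += LR * grads[k] / SEQ_LEN

for _ in range(200):
    train_step()
```

After training, the sampling probability of the goal token should dominate, since prefixes containing it always earn reward 1 from the stand-in discriminator while other prefixes earn less; this mirrors, in miniature, how reward guidance steers generation toward target words.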