Research On Generative Dialogue System Based On Reinforcement Learning

Posted on:2024-03-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y Yan

GTID:2568307061471964

Subject:Communication and Information System

Abstract/Summary:

Deep-learning-based dialogue models have achieved breakthroughs, but problems remain, such as generic replies and a lack of personalized replies. A dialogue system based on reinforcement learning learns an optimal dialogue policy by interacting with the user, which improves the system's performance. On the algorithm side, this thesis improves the REINFORCE algorithm to shorten the long training time of generative dialogue models. On the system side, it targets the reply-quality problems listed above. The specific research contents are summarized as follows:

(1) To increase response diversity, diverse beam search is used as the decoder, and self-critical sequence training is used to reduce the high variance of the policy gradient. Compared with the Advantage Actor-Critic algorithm, the improved REINFORCE algorithm uses only one network during training, which also saves model training time. Several types of filters are designed in the data pre-processing stage so that the corpus can be explored in a variety of ways. The manual and system evaluation results show that the dialogue system performs well in response diversity, and that the improved reinforcement learning algorithm alleviates the generic-response and safe-response problems to some extent.

(2) To address generic replies and time-consuming model training, the thesis proposes improvements to how the dialogue system generates replies. As in (1), diverse beam search serves as the decoder and self-critical sequence training reduces the high variance of the policy gradient. Because the improved REINFORCE algorithm trains only one network, unlike the Advantage Actor-Critic algorithm, it greatly reduces the complexity of the network model. Several types of filters were designed in the data pre-processing stage to diversify and explore the corpus, and the experimental time was recorded.

(3) An attention-based hierarchical recurrent encoder-decoder model is used to address the lack of personalized responses in current dialogue systems. User-specific information in a conversation is often valuable because it relates directly to the content and style of the user's responses, which in turn affect the course of the chat and the user experience. The hierarchical recurrent encoder-decoder network decomposes the conversation into two levels, so that it takes both the long-term context and user-specific information into account. Compared with the current RL model, the RL-AHRED model clearly improves dialogue quality.
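The single-network idea in item (1) can be illustrated with a small sketch. It assumes that the sequence-training baseline named in the abstract corresponds to what is usually called self-critical sequence training: the reward of the policy's own greedy decode is subtracted from the reward of a sampled decode, so no separate critic network is needed. Everything below is hypothetical and is not the thesis's implementation: the policy is a table of per-step logits with no dialogue context, and `toy_reward` is an invented diversity proxy.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits_per_step, rng):
    """Sample one token index per step from the policy."""
    seq = []
    for logits in logits_per_step:
        probs = softmax(logits)
        seq.append(rng.choices(range(len(probs)), weights=probs)[0])
    return seq


def greedy_sequence(logits_per_step):
    """Pick the highest-scoring token at every step."""
    return [max(range(len(logits)), key=logits.__getitem__)
            for logits in logits_per_step]


def toy_reward(seq):
    """Hypothetical reward: fraction of distinct tokens in the reply."""
    return len(set(seq)) / len(seq)


def scst_step(logits_per_step, rng, lr=0.5):
    """One REINFORCE update with a self-critical (greedy) baseline.

    The advantage is reward(sampled) - reward(greedy); both decodes come
    from the same policy, so only one set of parameters is trained.
    """
    sampled = sample_sequence(logits_per_step, rng)
    baseline = toy_reward(greedy_sequence(logits_per_step))
    advantage = toy_reward(sampled) - baseline
    for logits, token in zip(logits_per_step, sampled):
        probs = softmax(logits)
        for k in range(len(logits)):
            # d log p(token) / d logit_k = 1[k == token] - p_k
            grad_logp = (1.0 if k == token else 0.0) - probs[k]
            logits[k] += lr * advantage * grad_logp
    return advantage


if __name__ == "__main__":
    rng = random.Random(0)
    policy = [[0.0, 0.0, 0.0] for _ in range(4)]  # 4 steps, 3-token vocabulary
    for _ in range(200):
        scst_step(policy, rng)
    print(greedy_sequence(policy), toy_reward(greedy_sequence(policy)))
```

An actor-critic method would instead train a second value network to supply the baseline. Here the baseline costs one extra greedy decode per update, which is the trade-off the abstract describes when it compares the improved REINFORCE algorithm with Advantage Actor Critic.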

Keywords/Search Tags:

Dialogue system, Reinforcement learning, Universal responses, Personalized responses, Dialogue strategy

Related items

1	Research And Application Of Self-dialogue In Dialogue Systems Based On Reinforcement Learning
2	Research On Knowledge Driven Human-machine Active Dialogue Strategy
3	Sample Augmentation Based Reinforcement Learning For Dialogue Management
4	Optimizing Of Dialogue Policy In Human-computer Spoken Dialogue System Based On Reinforcement Learning
5	A Research On "Dialogue" In CCTV's Dialogue Program
6	Research On Dialogue Generation With Multi-Information Fusion
7	Research On The Key Technology Of Task-Oriented Dialogue Policies Based On The Deep Reinforcement Learning
8	Research On Task-based Dialogue Strategy Based On Reinforcement Learning
9	Research On Key Technology And Application Of Task-oriented Dialogue System
10	Research And Application Of Dialogue Management In Task-based Dialogue System