Multi-domain Dialog Policy Learning Based On Multi-agent Reinforcement Learning

Posted on:2023-12-23

Degree:Master

Type:Thesis

Country:China

Candidate:L Tang

GTID:2558307154974499

Subject:Computer Science and Technology

Abstract/Summary:

Task-oriented dialogue systems aim to help people complete tasks such as booking air tickets and hotels, and one of their most important modules is the dialog policy. The performance of the dialog policy largely determines the success of a human-machine dialogue system. The goal of the dialog policy is to guide the conversation and help users achieve their goals based on their demands and the dialogue history. Reinforcement learning is currently the main method for learning dialog policies. In this setting, the dialogue is treated as a sequential decision-making process, and the key information in the dialogue history is represented as the dialog state. A reward function evaluates the dialogue and each decision, and the policy that maximizes the expected reward is obtained by exploring the dialog state and action spaces.

Each scenario involved in a dialogue is called a domain. As the number of domains in a dialogue increases, the dialog state space and action space grow sharply, which makes policy exploration difficult for the reinforcement learning model and makes it hard to obtain the best dialog policy within a limited amount of interactive training. Another problem in multi-domain dialog policy learning is sparse dialogue data. In some domains it is difficult to collect a large dialog corpus, so the dialog policy for those domains cannot be fully trained, which ultimately degrades the whole dialog policy model. To study dialog policy in multi-domain scenarios, the main work and contributions of this thesis are as follows.

First, to address the difficulty of policy exploration in the huge dialog state and action spaces of multi-domain dialogue, this thesis proposes using multi-agent reinforcement learning to model the dialog policy in multi-domain scenarios. In reinforcement learning, an agent usually represents a policy model that outputs decision-making actions. The method partitions the original dialog state and action space into a number of smaller state and action spaces, one per domain. For each domain, a dedicated agent policy is trained to make decisions for every turn involving that domain. Experiments on the multi-domain dialogue dataset MultiWOZ show that, compared with the baseline, this method raises the dialog success rate from 55.0% to 67.2%.

Second, after partitioning, some domains have only a small amount of dialogue data, which makes it difficult to fully train the agent policies for those domains. Building on the multi-agent reinforcement learning method above, transfer learning is used to address the sparse-corpus problem in multi-domain dialogue. The method first trains a general policy module on the dialogue corpus of all domains, then modifies the policy network to match the action space of each domain, and finally fine-tunes the policy module with a small amount of data from each domain so that it better adapts to the ontology knowledge of that domain. Experiments on MultiWOZ show that, compared with the previous method, this further improves the dialog success rate from 67.2% to 76.4%.
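The two ideas in the abstract (one agent per domain, each initialised from a general policy) can be illustrated with a small sketch. This is not the thesis implementation: the abstract describes a policy network, whereas the sketch below substitutes a tabular Q-learning agent to stay dependency-free. The domain names, action names, class names, and hyperparameters are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy ontology (assumption): each domain owns a small action set.
DOMAIN_ACTIONS = {
    "hotel": ["hotel-request-area", "hotel-inform-price", "hotel-book"],
    "train": ["train-request-day", "train-inform-time", "train-book"],
}


class DomainAgent:
    """Tabular epsilon-greedy Q-learning agent limited to one action set.

    A tabular stand-in for the policy network mentioned in the abstract.
    """

    def __init__(self, actions, epsilon=0.1, alpha=0.5, gamma=0.9, seed=0):
        self.actions = list(actions)
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma
        self.q = defaultdict(float)  # (state, action) -> estimated return
        self.rng = random.Random(seed)

    def act(self, state):
        # Explore with probability epsilon, otherwise pick the greedy action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, done):
        # Standard one-step Q-learning update.
        best_next = 0.0 if done else max(
            self.q[(next_state, a)] for a in self.actions
        )
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    @classmethod
    def from_general(cls, general, actions, **kwargs):
        """Transfer step: copy the general agent's values for this domain's
        actions only, so the domain agent can then be fine-tuned."""
        agent = cls(actions, **kwargs)
        for (state, action), value in general.q.items():
            if action in agent.actions:
                agent.q[(state, action)] = value
        return agent


class MultiDomainPolicy:
    """Routes each dialog turn to the agent that owns the active domain."""

    def __init__(self, agents):
        self.agents = agents

    def act(self, domain, state):
        return self.agents[domain].act(state)


# General agent over the joint action space, trained on one toy transition.
all_actions = [a for acts in DOMAIN_ACTIONS.values() for a in acts]
general = DomainAgent(all_actions, epsilon=0.0)
general.update("start", "hotel-book", reward=1.0, next_state="end", done=True)

# Per-domain agents initialised from the general agent, then routed by domain.
agents = {
    domain: DomainAgent.from_general(general, acts, epsilon=0.0)
    for domain, acts in DOMAIN_ACTIONS.items()
}
policy = MultiDomainPolicy(agents)
hotel_action = policy.act("hotel", "start")
train_action = policy.act("train", "start")
```

Each domain agent searches only its own three actions rather than the joint set of six, which is the partitioning effect the abstract relies on. In a real system each agent would then be fine-tuned by calling `update` on in-domain dialogues.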

Keywords/Search Tags:

Task-oriented dialogue system, Dialog policy learning, Multi-agent reinforcement learning, Transfer learning

Related items

1	Research And Application Of Dialog Policy Module Based On Multi-Agent Reinforcement Learning
2	Research On The Key Technology Of Task-Oriented Dialogue Policies Based On The Deep Reinforcement Learning
3	Research On Task-oriented Dialogue Policy Based On Deep Reinforcement Learning
4	Building A Chatbot In Healthcare Domain Based On Dialog Policy Learning
5	Research On End-to-End Task-oriented Dialogue System Based On Deep Learning
6	Research On Dialogue Policy Learning In Task-oriented Dialogue System
7	Research On Key Technology And Application Of Task-oriented Dialogue System
8	Dialogue Management In Cognitive Conversational Systems
9	Research On Key Techniques Of Transfer Learning In Task-oriented Dialogue System
10	Sample Augmentation Based Reinforcement Learning For Dialogue Management