The Research Of Multi-turn Conversation Response Retrieval Based On Deep Learning

Posted on: 2022-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: X C Ren
GTID: 2518306524493664
Subject: Master of Engineering

Abstract:
Dialogue systems are an important research branch of natural language processing and have received increasing attention in recent years. The emergence of massive data and the rapid development of deep learning provide important support for modeling dialogue systems. Dialogue systems can generally be divided into task-oriented and retrieval-based systems. A task-oriented dialogue system needs to identify the user's intention from the dialogue and complete a specific task. A retrieval-based dialogue system uses a deep matching model to select the best-matching response candidate from a response template library, given multiple turns of dialogue history. The response selection performance of retrieval-based dialogue systems is easily affected by dataset quality, and current deep matching models still need to improve their semantic understanding of dialogue. To address these problems and improve response selection performance, this thesis studies retrieval-based dialogue systems from the perspectives of data and modeling. The main work and contributions are as follows.

From the perspective of data, an analysis of large-scale multi-turn dialogue datasets shows that they suffer from low sample quality, lack of semantics, and topic switching. This thesis proposes the Multi-turn Conversational Augment Strategy (MCAS) to improve the quality of conversation samples and mitigate the influence of noisy samples. MCAS starts from dialogue samples with many turns, constructs positive samples by truncation, and constructs negative samples by random and cross construction. Experiments based on existing deep matching models show that response selection performance can be improved by working on the data alone, without designing a complex matching model, which also offers a new idea for research in this field.

From the perspective of the model, this thesis introduces pre-trained language models into the retrieval-based dialogue system, aiming to make use of their powerful semantic understanding of massive data. Based on BERT and ELECTRA, this thesis proposes a domain adaptation training and fine-tuning strategy. Domain adaptation training uses multi-turn dialogue corpora to construct next sentence prediction and masked language model tasks, so that the matching model integrates more in-domain knowledge. In the fine-tuning strategy, to capture the conversation information of different speakers, this thesis proposes a new speaker-based segment embedding representation. Experiments on three multi-turn conversation datasets, verified with various evaluation metrics, show that incorporating pre-trained language models can greatly improve response selection performance and demonstrate the feasibility and effectiveness of the proposed scheme.
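
The abstract names truncation for positive samples and random construction for negative samples but gives no algorithmic detail, so the following Python sketch is only one possible reading and not the thesis's actual implementation. It assumes that "truncated construction" means taking each prefix of a long dialogue as a context with the next turn as the correct response, and that "random construction" means drawing a response from other dialogues. The function names, the Sample tuple, and the min_context parameter are hypothetical. Cross construction is omitted because the abstract does not define it.

```python
import random
from typing import List, Tuple

# (context turns, candidate response, label); label 1 = correct, 0 = incorrect.
Sample = Tuple[List[str], str, int]


def truncated_positives(dialogue: List[str], min_context: int = 2) -> List[Sample]:
    """Cut one long dialogue into several positive samples (assumed reading).

    Every prefix of at least `min_context` turns becomes a context, and the
    turn that immediately follows the prefix is its positive response.
    """
    return [
        (dialogue[:k], dialogue[k], 1)
        for k in range(min_context, len(dialogue))
    ]


def random_negatives(
    positives: List[Sample],
    other_dialogues: List[List[str]],
    rng: random.Random,
) -> List[Sample]:
    """Pair each positive context with a response drawn from other dialogues."""
    pool = [turn for dialogue in other_dialogues for turn in dialogue]
    negatives: List[Sample] = []
    for context, response, _ in positives:
        # Skip turns that equal the true response or already occur in the context.
        candidates = [t for t in pool if t != response and t not in context]
        if candidates:
            negatives.append((context, rng.choice(candidates), 0))
    return negatives
```

Under these assumptions, a four-turn dialogue with min_context=2 yields two positive samples, and random_negatives adds one negative per positive whenever the other dialogues contain a usable turn. Passing a seeded random.Random instance keeps the sampling reproducible.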
Keywords: retrieval-based dialogue system, response selection, multi-turn conversational augment strategy, pre-trained language model