Multi-turn Retrieval-based Conversation Models Based On Large Candidate Corpora

Posted on:2021-02-11

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Lu

Full Text:PDF

GTID:2428330623967814

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid progress of artificial intelligence, a new generation of end-to-end chatbots has been widely deployed in practical scenarios such as entertainment chatbots, personal assistants, and intelligent customer service, making them one of the most promising technologies in the field. Unlike traditional human-computer interaction methods, intelligent dialogue systems can not only understand natural language and return meaningful responses, but also complete a task through a series of dialogue turns.

Generally speaking, end-to-end chatbots fall into two categories: retrieval-based chatbots and generation-based chatbots. Generation-based chatbots use natural language generation to produce a new response conditioned on the conversation history. Such systems are expected to overcome the limitation of pre-built responses, but they suffer from a lack of fluency and tend to generate safe responses. In contrast, retrieval-based dialogue systems mainly employ information retrieval techniques to score a set of pre-defined response candidates and return the most appropriate one, which yields a fluent and meaningful response in most cases. However, a low-quality candidate corpus may make the returned responses less reasonable, and an insufficient number of response candidates may significantly reduce their diversity.

To tackle these problems, this thesis studies multi-turn retrieval-based chatbots in the large-scale candidate set scenario. First, it proposes a spatio-temporal matching (STM) network and studies its performance and effectiveness with a large number of candidates. It also analyzes the interpretability of the spatio-temporal features and their relative merits. Through comparative experiments and visual analysis, the thesis shows that a multi-turn retrieval dialogue model based on spatio-temporal matching features can achieve better performance with lower time complexity in a large-scale candidate set scenario.

In addition, the thesis focuses on the semantic understanding ability of end-to-end dialogue models and introduces a pre-trained language model into retrieval-based chatbots. It then proposes a speaker segmentation strategy and a multi-turn dialogue augmentation method to improve the performance of pre-trained dialogue retrieval models. By splitting the dialogue into per-speaker utterances, introducing speaker-related embeddings, and applying specific data augmentation methods, the pre-trained dialogue retrieval models can better model the consistency and logical coherence of multi-turn dialogues. Comparative experimental results show that our methods surpass a large number of baseline models and achieve better performance on larger-scale candidate sets.
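The abstract does not spell out the input format behind the speaker segmentation strategy. As a rough illustration only, the sketch below shows one way a multi-turn context and a candidate response could be flattened into a token sequence with a parallel list of speaker ids, which a pre-trained model could then map to speaker-related embeddings. The function name `build_speaker_input`, the `[CLS]`/`[SEP]` markers, whitespace tokenization, and the id scheme are all assumptions made for this example, not details taken from the thesis.

```python
from typing import List, Tuple

# Hypothetical special tokens, chosen for illustration only.
CLS, SEP = "[CLS]", "[SEP]"


def build_speaker_input(
    context: List[Tuple[str, str]],
    response: str,
    response_speaker: str,
) -> Tuple[List[str], List[int]]:
    """Flatten a dialogue into tokens plus a parallel list of speaker ids.

    context: (speaker, utterance) pairs in chronological order.
    response: the candidate reply to be scored against the context.
    response_speaker: the speaker the candidate reply is attributed to.

    Speaker ids start at 1 in order of first appearance; 0 is reserved
    for the leading [CLS] token. This scheme is an assumption.
    """
    speaker_to_id = {}
    tokens = [CLS]
    speaker_ids = [0]
    for speaker, utterance in context + [(response_speaker, response)]:
        # Assign the next free id the first time a speaker is seen.
        sid = speaker_to_id.setdefault(speaker, len(speaker_to_id) + 1)
        # Whitespace split stands in for a real subword tokenizer.
        words = utterance.split() + [SEP]
        tokens.extend(words)
        speaker_ids.extend([sid] * len(words))
    return tokens, speaker_ids


if __name__ == "__main__":
    dialogue = [("A", "hi there"), ("B", "hello"), ("A", "how are you")]
    toks, sids = build_speaker_input(dialogue, "fine thanks", "B")
    for tok, sid in zip(toks, sids):
        print(sid, tok)
```

In a real model, each speaker id would index a learned embedding table whose output is added to the token embeddings, in the same way segment embeddings are commonly used. Whether the thesis does exactly this cannot be confirmed from the abstract.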

Keywords/Search Tags:

retrieval-based chatbots, large-scale candidate set, spatio-temporal matching(STM) features, pre-trained dialogue retrieval models

Related items

1	Study On Multi-turn Response Selection For Retrieval Chatbots Based On Deep Learning
2	Research On Retrieval-based Multi-turn Dialogue With Deep Context Modeling
3	Research On Spatio-temporal Index Model And Retrieval Methods Based On HBase
4	The Design And Implementation Of Dialogue Robot Based On Knowledge Graph
5	Research On Sequential Pattern Mining And Its Parallelization Method For Large-Scale Spatio-Temporal Trajectory Data
6	Research On Retrieval-based Dialogue System Based On Knowledge Fusion Matching Network
7	Research Of Video Spatio-temporal Feature Extraction And Retrieval Algorithm Based On Deep Learning
8	Large Scale Video Retrieval And Feedback With Multi-level Content Representation
9	The Research For 3D Model Geometry Shape Similarity Matching
10	Research On Technology Of Content-Based Large-Scale Image Retrieval