
Research On Reply Quality Evaluation Methods Of Open-domain Dialogue Systems

Posted on: 2022-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Feng
Full Text: PDF
GTID: 2518306572965689
Subject: Computer Science and Technology

Abstract/Summary:
In recent years, with the continuous advancement of computer-related technologies, human-machine dialogue systems have reached a new level of development. As an important part of the human-machine dialogue system, the open-domain dialogue system has received increasing attention. The open-domain dialogue system originated from the Turing test; its interaction style is closer to human-human conversation in real life and usually gives people a natural and cordial communication experience. As one of the important directions in the field of artificial intelligence, the development of open-domain dialogue systems is essential for building a future society of human-machine integration. However, academia is still unable to evaluate open-domain dialogue systems reasonably and objectively, and this situation has severely hindered their development. In this paper, we explore evaluation methods for open-domain dialogue systems from two aspects and apply the proposed methods to the Multi-Bot Conversation task.

Aiming at the "one-to-many" problem that has always existed in open-domain dialogue systems, this paper proposes a simulated manual evaluation method that uses multiple references for dialogue evaluation. For the utilization of multiple references, three fusion methods of different granularities are proposed, and the corresponding modeling is carried out on a representation model (ADEM-based) and an interactive model (BERT-based), respectively. Experiments show that the interactive model performs better than the representation model. More specifically, the Char-Level fusion method based on BERT achieves the best results, while the Score-Level fusion method offers the most prominent advantages.

Aiming at the problem of robustness, which has been largely ignored for open-domain dialogue systems, this paper proposes a set of robustness evaluation methods. The robustness of a model is evaluated by constructing a test set of adversarial samples; during construction, Chinese and English corpora are selected and adversarial perturbations are applied to them. At the same time, two evaluation metrics, an absolute index and a relative index, are proposed. Finally, several models are tested, and the experimental results are analyzed and summarized.

Finally, we consider how to improve evaluation quality at the level of the evaluation mode, and we apply the above two methods to the Multi-Bot Conversation task. This not only creates a more consistent semantic environment for the evaluation work, but also dynamically tests a model's ability to carry on a simulated dialogue. It also addresses the shortcoming that the Multi-Bot Conversation task usually relies on manual evaluation. According to the particularities of the Multi-Bot Conversation task, we propose an evaluation method based on multiple topics, and we add a perturbation strategy for the multi-turn dialogue task to evaluate robustness. The effectiveness of the above methods is demonstrated on the dataset.
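The abstract names Score-Level fusion of multiple references but does not detail it in this summary. A minimal sketch follows, under the assumption that the interactive (BERT-based) model scores the candidate reply against each reference independently and the per-reference scores are then aggregated (e.g., by taking the maximum); the function names and the toy scorer are hypothetical, not the thesis implementation.

```python
# Hypothetical sketch of Score-Level multi-reference fusion (not the thesis code).
# Assumption: a BERT-based interactive scorer rates (context, reply, reference)
# triples one reference at a time; the scores are fused afterwards.
from typing import Callable, List


def score_level_fusion(
    context: str,
    reply: str,
    references: List[str],
    score_fn: Callable[[str, str, str], float],
    aggregate: Callable[[List[float]], float] = max,
) -> float:
    """Score the reply once per reference, then fuse at the score level."""
    per_reference_scores = [score_fn(context, reply, ref) for ref in references]
    return aggregate(per_reference_scores)


if __name__ == "__main__":
    # Toy word-overlap scorer standing in for the BERT-based interactive model.
    def toy_score(context: str, reply: str, reference: str) -> float:
        reply_words, ref_words = set(reply.split()), set(reference.split())
        return len(reply_words & ref_words) / max(len(ref_words), 1)

    refs = ["I like coffee in the morning", "Tea is my favourite drink"]
    print(score_level_fusion("What do you drink?", "I drink coffee every morning", refs, toy_score))
```

Fusing at the score level (rather than at the character or sentence level) keeps the scorer unchanged and only changes how the multiple references are combined, which is one plausible reading of why this variant is singled out for its practical advantages.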
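The abstract also proposes an absolute index and a relative index for robustness evaluation without defining them here. As a hedged illustration only: assuming the absolute index is the evaluator's mean score on the adversarial test set and the relative index is the proportional drop from the clean test set to the perturbed one, a sketch could look as follows. Both definitions are assumptions for illustration, not the thesis formulas.

```python
# Hypothetical illustration of the two robustness metrics named in the abstract;
# the exact definitions of "absolute index" and "relative index" are assumed here.
from typing import List


def absolute_index(adversarial_scores: List[float]) -> float:
    """Assumed: the evaluator's mean score on the adversarial test set alone."""
    return sum(adversarial_scores) / len(adversarial_scores)


def relative_index(clean_scores: List[float], adversarial_scores: List[float]) -> float:
    """Assumed: the relative drop in mean score once perturbations are applied."""
    clean_mean = sum(clean_scores) / len(clean_scores)
    adv_mean = sum(adversarial_scores) / len(adversarial_scores)
    return (clean_mean - adv_mean) / clean_mean


if __name__ == "__main__":
    clean = [0.82, 0.75, 0.90]       # scores on original (clean) samples
    perturbed = [0.70, 0.61, 0.78]   # scores on adversarially perturbed samples
    print(f"absolute index: {absolute_index(perturbed):.3f}")
    print(f"relative index: {relative_index(clean, perturbed):.3f}")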
Keywords/Search Tags: open-domain dialogue system, simulated manual evaluation, multi-reference, robustness, adversarial samples, multi-bot conversation