
Zero Resource Online Update And Security For Knowledge-grounded Dialogue System

Posted on: 2023-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: J C Lin
GTID: 2568307046492994
Subject: Cyberspace security
Abstract/Summary:
As an important part of human-computer interaction, dialogue systems are expected to find wide application across many fields in the near future. By introducing additional knowledge sources, dialogue systems can generate responses that are more consistent with objective facts. Systems that take this approach are called knowledge-grounded dialogue systems, and most current ones are built on large-scale pre-trained models, owing to the success of large-scale pre-training in natural language processing. In practice, the dialogue ability of a knowledge-grounded dialogue system needs to keep pace with developments in the real world, and current systems still struggle to meet this goal. Because a knowledge-grounded dialogue system requires each dialogue pair in the training data to be strictly aligned with its corresponding knowledge, training data cannot be collected automatically from social media. At present, knowledge-grounded dialogue data must be annotated manually, at a large cost in time and money. To alleviate this problem, this thesis proposes a zero-resource online update method for knowledge-grounded dialogue systems that reduces the cost of the data needed for online updates. Specifically, it generates pseudo data to ease the difficulties that annotation cost imposes on online updates. The method distinguishes two cases according to how much dialogue data is missing: a missing overall dialogue corpus and a missing contextual dialogue corpus. Template-based, generative-modeling, and data-retention approaches are used to cope with a missing overall dialogue corpus. To cope with missing context, the thesis also proposes a pseudo data generation method based on retrieving non-conversational text corpora from the Internet. Experiments on the KdConv dataset, together with manual evaluation, show that the proposed method is feasible and generalizable.

In addition, large-scale pre-trained models have been shown to carry security risks such as leaking training samples because of overfitting, which can expose private information on the Internet. This thesis therefore proposes an analysis scheme, based on the characteristics of knowledge-grounded dialogue systems, for assessing two security risks in such systems: privacy leakage caused by overfitting and manipulation by attackers. The scheme comprises four steps: prompt design, decoding strategy, filtering and ranking, and data verification. The security characteristics of knowledge-grounded dialogue systems are tested under this scheme across a range of parameter configurations.
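The template-based pseudo data generation mentioned in the abstract can be pictured with a small sketch. The abstract does not give the thesis's actual templates or data format, so everything below is an assumption made for illustration: the `Triple` type, the `TEMPLATES` table, the fallback template, and the `make_pseudo_turn` helper are all hypothetical. The sketch only shows the general idea of turning a knowledge triple into a dialogue turn that is aligned with that triple by construction, which is the alignment property that makes manual annotation expensive.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A knowledge triple (hypothetical format, not taken from the thesis)."""
    head: str
    relation: str
    tail: str


# Hypothetical question/answer templates keyed by relation name.
TEMPLATES = {
    "director": ("Who directed {head}?", "{head} was directed by {tail}."),
    "release date": ("When was {head} released?", "{head} was released on {tail}."),
}

# Generic fallback used when a relation has no dedicated template.
DEFAULT_TEMPLATE = (
    "What is the {relation} of {head}?",
    "The {relation} of {head} is {tail}.",
)


def make_pseudo_turn(triple: Triple) -> dict:
    """Build one pseudo dialogue turn that stays aligned with its triple."""
    question_tpl, answer_tpl = TEMPLATES.get(triple.relation, DEFAULT_TEMPLATE)
    fields = triple._asdict()
    return {
        "user": question_tpl.format(**fields),
        "system": answer_tpl.format(**fields),
        "knowledge": triple,  # the grounding knowledge for this turn
    }


if __name__ == "__main__":
    turn = make_pseudo_turn(Triple("Inception", "director", "Christopher Nolan"))
    print(turn["user"])
    print(turn["system"])
```

Because each generated turn carries the triple it was built from, no separate alignment step is needed. The generative-modeling and retrieval-based variants named in the abstract would replace the fixed templates with a model or a retrieved passage, and are not sketched here.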
Keywords/Search Tags: Dialogue systems, Knowledge graph, Online updates, Model security, Natural language processing