Human-computer dialogue systems communicate with users through natural language and can be roughly divided into three categories: intelligent question-answering systems, open-domain chat systems, and task-oriented dialogue systems. Unlike the first two, task-oriented dialogue systems aim to help users complete specific tasks, such as restaurant reservations and airline ticket bookings, through multiple rounds of interaction. Dialogue state tracking is a core component of task-oriented dialogue systems: it captures the user's intention expressed in the dialogue context, thereby helping the system better understand the user's needs. However, past studies often ignore the difference in difficulty between slots, making it hard for multi-slot learning to capture the associations between slots. In addition, existing models are often built on large pre-trained models, which harms the real-time performance and deployment efficiency of dialogue systems. Therefore, this thesis proposes the following solutions to these problems:

(1) To address the phenomenon that multi-slot learning in dialogue state tracking biases training toward simple slots, a quantitative definition of slot classification difficulty is given. Based on this, a weighted joint optimization method driven by slot classification difficulty is proposed, so that the model pays more attention to slots that are difficult to classify, improving the accuracy of the dialogue state tracking model.

(2) Based on knowledge distillation, this thesis proposes KD-DST, a lightweight ontology-based dialogue state tracking model. To address the large parameter counts and long inference times of existing dialogue state tracking models, traditional knowledge distillation is introduced into dialogue state tracking, and a knowledge distillation method based on Euclidean distance is proposed. When a large teacher model is available, the pre-trained BERT model is distilled into GRU neural units. While maintaining accuracy comparable to traditional dialogue state tracking models, the method greatly reduces the number of model parameters and the inference time, benefiting the real-time performance and deployment efficiency of the dialogue system.

(3) This thesis also considers the situation where no large teacher model is available. A mutual distillation method is introduced into dialogue state tracking: two student models distill knowledge from each other, achieving an effect comparable to distillation from a teacher model.

(4) To evaluate the weighted joint optimization method based on slot classification difficulty and the KD-DST model, a series of experiments are designed and compared with current mainstream dialogue state tracking models. On the public WoZ 2.0 dataset, the proposed weighted joint optimization method improves the joint goal accuracy of general dialogue state tracking models by 1.1% on average. Meanwhile, the number of model parameters is compressed by a factor of 6.5 and the inference speed is increased by a factor of 5.6. Mutual distillation maintains performance similar to teacher-based knowledge distillation without requiring a large teacher model.
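
The following is a minimal sketch of how a difficulty-weighted joint loss over slots could be formed. The use of per-slot accuracy as a difficulty proxy and the normalization scheme are assumptions for illustration; the thesis defines slot classification difficulty quantitatively and may weight the losses differently.

```python
import torch


def weighted_joint_loss(slot_losses: dict, slot_accuracies: dict) -> torch.Tensor:
    """Combine per-slot losses, up-weighting slots that are hard to classify.

    slot_losses: slot name -> scalar loss tensor for that slot's classifier.
    slot_accuracies: slot name -> running accuracy (difficulty proxy, an assumption).
    """
    # Difficulty proxy: slots with lower accuracy receive larger weights.
    difficulties = {s: 1.0 - acc for s, acc in slot_accuracies.items()}
    total = sum(difficulties.values()) + 1e-8
    weights = {s: d / total for s, d in difficulties.items()}
    return sum(weights[s] * loss for s, loss in slot_losses.items())
```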
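
The sketch below illustrates Euclidean-distance distillation from a BERT teacher into a GRU student at the level described in the abstract. The GRU sizes, the projection layer into the teacher's hidden space, and the use of token-level hidden states are assumptions, not the thesis's exact KD-DST architecture.

```python
import torch
import torch.nn as nn


class GRUStudent(nn.Module):
    """Lightweight student whose states are matched to BERT teacher states."""

    def __init__(self, vocab_size=30522, emb_dim=256, hidden_dim=384, teacher_dim=768):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Project student states into the teacher's hidden dimension (assumption).
        self.proj = nn.Linear(2 * hidden_dim, teacher_dim)

    def forward(self, input_ids):
        states, _ = self.gru(self.embedding(input_ids))
        return self.proj(states)  # [batch, seq_len, teacher_dim]


def euclidean_distill_loss(student_states, teacher_states):
    """Mean Euclidean distance between student and teacher token representations."""
    return torch.norm(student_states - teacher_states, p=2, dim=-1).mean()
```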
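
For the teacher-free case, the following sketch shows mutual distillation between two student models in the style of deep mutual learning: each student fits the ground-truth labels and also matches the other's soft predictions. The KL-divergence formulation and the loss weighting are assumptions for illustration.

```python
import torch.nn.functional as F


def mutual_distillation_losses(logits_a, logits_b, labels, alpha=0.5):
    """Return the loss for each of two peer students trained together."""
    ce_a = F.cross_entropy(logits_a, labels)
    ce_b = F.cross_entropy(logits_b, labels)
    # Soft targets come from the peer model; detach so gradients stay local.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=-1),
                    F.softmax(logits_b.detach(), dim=-1), reduction="batchmean")
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=-1),
                    F.softmax(logits_a.detach(), dim=-1), reduction="batchmean")
    return ce_a + alpha * kl_a, ce_b + alpha * kl_b
```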