Human-machine dialog aims to let people interact with machines in natural language and obtain automated information services quickly and conveniently. As a key component of a dialog system, Dialog State Tracking (DST) represents and updates the system state as the dialog progresses. Whether the state is updated correctly directly affects action generation and, in turn, the performance of the entire dialog system. DST is therefore the basis of dialog strategy generation, and research on DST is significant for building high-quality human-machine dialog systems. Traditional DST is based on the outputs of Natural Language Understanding (NLU), so NLU errors propagate into DST. Recent DST research therefore takes the word sequence directly as input and models the NLU and DST tasks jointly, and it has made considerable progress. However, some important issues remain, among which data sparseness and unknown slot values have a major impact. This paper focuses on these two issues and, based on a comprehensive analysis of existing research, carries out the following work.

To address data sparseness, a cascaded neural network (CaNN) model is proposed. The model has a two-layer structure: the bottom layer uses a Long Short-Term Memory (LSTM) network or a Convolutional Neural Network (CNN) to obtain a low-dimensional sentence representation, and the upper layer uses an LSTM to integrate dialog history over these sentence representations for DST. Experimental results on an open dataset show that our combined models outperform existing ones, with improvements of 1.0%, 1.8%, 4.5%, and 5.2% over state-of-the-art models in terms of "joint" performance. The dialog states learned by the CaNN models aggregate more tightly than those learned by other models that take n-grams of the word sequence as input. This suggests that the proposed CaNN models
effectively alleviate the problem of data sparseness.

To address unknown slot values, two models are proposed on the basis of the CaNN model. The first is the cascaded neural network with unknown slot value detector (USVD-CaNN) model, which adds an unknown slot value classifier to the CaNN model and trains this classifier on pseudo samples so that the model can handle unknown slot values. Experimental results show that the USVD-CaNN model improves performance by 6.11%, 6.56%, and 13.27% on the DSTC3 (Dialog State Tracking Challenge), DSTC2-food, and WOZ-food (Wizard of Oz) datasets, respectively, over the corresponding models. The USVD-CaNN model effectively alleviates the unknown slot value problem while maintaining performance on known slot values, and thus greatly improves the accuracy of DST with unknown slot values. However, it has three drawbacks: the extra classifier for detecting unknown slot values makes the approach indirect; the pseudo samples reduce the size of the training set, which affects performance on known slot values; and the pseudo samples are limited by the size of the training set, which makes the approach inflexible. The second is the cascaded neural network with unknown class (UC-CaNN) model, which adds an unknown slot value class to the model itself, so that detection of unknown slot values is built into the model and no separate detector is needed. In addition, we use the shared context to construct negative samples for known slot values by negative sampling. The constructed negative samples are used together with the original training set to tune the model parameters, which activates the UC-CaNN model's latent ability to detect unknown slot values and enables it to handle them. Experimental results show that the UC-CaNN model achieves good
performance, with improvements of 5.36%, 5.29%, and 3.95% on the DSTC3, DSTC2-food, and WOZ-food datasets, respectively, over the corresponding models. In particular, the ability to handle unknown slot values greatly improves the final DST performance. Compared with the USVD-CaNN model, this method addresses unknown slot values in DST more directly and is computationally simpler. Moreover, analysis of the experimental results shows that negative sampling has less influence on the performance of known slot values and is more flexible.

Finally, the DST models proposed in this paper are applied to a human-machine dialog system for restaurant inquiry, and the DST module of this demonstration system is implemented. A functional demonstration of the dialog system verifies the feasibility of applying the proposed CaNN, USVD-CaNN, and UC-CaNN models in a practical task-oriented human-machine dialog system.
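
To make the two-layer cascade described in this abstract concrete, the following is a minimal, untrained sketch in plain Python. It is not the paper's implementation. The class names (`LSTMCell`, `CascadedTracker`), the layer sizes, the random initialization, the whitespace tokenizer, the toy vocabulary, and the softmax output over a single slot's values are all assumptions made for illustration. Only the overall shape follows the text: a bottom LSTM turns each utterance into a low-dimensional sentence vector, and an upper LSTM integrates those vectors across turns to produce a dialog state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Minimal LSTM over plain Python lists (randomly initialized, untrained)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        width = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.weights = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                   for _ in range(hidden_size)]
            for gate in "ifoc"
        }
        self.biases = {gate: [0.0] * hidden_size for gate in "ifoc"}

    def _gate(self, name, activation, z):
        return [
            activation(sum(w * v for w, v in zip(row, z)) + b)
            for row, b in zip(self.weights[name], self.biases[name])
        ]

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        i = self._gate("i", sigmoid, z)
        f = self._gate("f", sigmoid, z)
        o = self._gate("o", sigmoid, z)
        g = self._gate("c", math.tanh, z)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def run(self, inputs):
        """Return the hidden state after each input vector."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        states = []
        for x in inputs:
            h, c = self.step(x, h, c)
            states.append(h)
        return states


class CascadedTracker:
    """Hypothetical two-layer cascade: sentence encoder below, dialog LSTM above."""

    def __init__(self, vocab, slot_values, embed_size=8, sent_size=6,
                 state_size=5, seed=0):
        rng = random.Random(seed)
        self.word_index = {word: i for i, word in enumerate(vocab)}
        # The extra final row is shared by all out-of-vocabulary words.
        self.embeddings = [[rng.uniform(-0.1, 0.1) for _ in range(embed_size)]
                           for _ in range(len(vocab) + 1)]
        self.sent_size = sent_size
        self.sentence_lstm = LSTMCell(embed_size, sent_size, rng)  # bottom layer
        self.dialog_lstm = LSTMCell(sent_size, state_size, rng)    # upper layer
        self.slot_values = list(slot_values)
        self.output = [[rng.uniform(-0.1, 0.1) for _ in range(state_size)]
                       for _ in self.slot_values]

    def encode_utterance(self, utterance):
        """Bottom layer: map one utterance to a low-dimensional vector."""
        oov = len(self.word_index)
        vectors = [self.embeddings[self.word_index.get(tok, oov)]
                   for tok in utterance.lower().split()]
        if not vectors:
            return [0.0] * self.sent_size
        return self.sentence_lstm.run(vectors)[-1]

    def track(self, dialog):
        """Upper layer: integrate turn vectors, emit one distribution per turn."""
        turn_vectors = [self.encode_utterance(u) for u in dialog]
        distributions = []
        for state in self.dialog_lstm.run(turn_vectors):
            logits = [sum(w * s for w, s in zip(row, state)) for row in self.output]
            peak = max(logits)
            exps = [math.exp(l - peak) for l in logits]
            total = sum(exps)
            distributions.append(
                {v: e / total for v, e in zip(self.slot_values, exps)})
        return distributions


tracker = CascadedTracker(
    vocab=["i", "want", "cheap", "chinese", "food", "please"],
    slot_values=["chinese", "italian", "none"],
)
beliefs = tracker.track(["i want cheap food", "chinese please"])
for turn, belief in enumerate(beliefs, start=1):
    print(turn, belief)
```

Because the weights are random and never trained, the printed distributions carry no meaning; the sketch only shows how the per-turn sentence vectors feed the upper recurrent layer. Under the same assumptions, a CNN sentence encoder could replace `sentence_lstm` without changing the upper layer, which is the substitution the abstract mentions.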