
Short-Spoken Language Intent Classification With Conditional Sequence Generative Adversarial Network

Posted on: 2021-01-02    Degree: Master    Type: Thesis
Country: China    Candidate: X Y Zhou    Full Text: PDF
GTID: 2428330632462711    Subject: Information and Communication Engineering
Abstract/Summary:
Natural language understanding (NLU) is a subtopic of natural language processing (NLP) that aims to automatically convert non-linguistic data into a semantic representation that machines can process directly. A dialogue system is a common application area for NLU; by function, such systems can be divided into chatting, task-based, recommendation-oriented and other types. The core of a task-based dialogue system is intent classification, which is a typical text classification problem. Most traditional classification methods apply to this problem, such as support vector machines (SVM) [1] and maximum entropy [2]. With the rapid development of deep learning, deep belief networks (DBNs) have been applied to intent classification; they consider label transition probabilities but are inflexible because they rely on a fixed window size. Text-CNN captures n-gram feature representations through 1-dimensional convolution; it has a strong ability to extract shallow text features and works well for short-text intent classification. Recurrent neural network (RNN) and long short-term memory (LSTM) models for intent classification can model arbitrarily long dependencies. Furthermore, to resolve ambiguity in multi-turn dialogue, contextual language understanding models have been proposed based on sequence-to-sequence architectures, bi-RNNs and so on. All of these methods require large amounts of labeled training data. However, in most application scenarios of intent classification, labeled data is scarce, which restricts system improvement.

To address this, we propose a conditional sequence generative adversarial network (cSeq-GAN) for short-spoken-language intent classification, which solves the problem of lacking labeled training data. It does not need to generate samples directly; instead, it optimizes the discriminator with the output of the generator. The basic idea is to use a large amount of unlabeled data to strengthen the generator, so that it produces more samples that improve the performance of the classifier.

Besides the lack of labeled data, another constraint on intent recognition is the response time of the system. Complex algorithms can perform well, but slowly, and since intent recognition is only one component of a larger system, there are usually strict requirements on the response time of the whole system. This requires that classification accuracy be guaranteed while matching is achieved quickly. To solve this problem, a fast intent recognition and matching algorithm based on BERT and DSSM is proposed. The experimental results show that both algorithms achieve their goals.
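The abstract does not include an implementation. The following is a minimal Python (PyTorch-style) sketch of the general idea described above: a conditional sequence generator whose outputs feed a Text-CNN-style discriminator that also serves as the intent classifier, with one extra class reserved for generated samples. All class names, dimensions and the training step are illustrative assumptions, not the thesis's actual cSeq-GAN implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    # Hypothetical conditional sequence generator: predicts next-token logits
    # for an utterance, conditioned on an intent label.
    def __init__(self, vocab_size, num_intents, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb_dim)
        self.intent_emb = nn.Embedding(num_intents, emb_dim)
        self.gru = nn.GRU(2 * emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, intents):
        # tokens: (B, T) token ids; intents: (B,) intent ids
        cond = self.intent_emb(intents).unsqueeze(1).expand(-1, tokens.size(1), -1)
        h, _ = self.gru(torch.cat([self.tok_emb(tokens), cond], dim=-1))
        return self.out(h)  # (B, T, vocab_size) next-token logits

class Discriminator(nn.Module):
    # Text-CNN-style discriminator that doubles as the intent classifier:
    # num_intents real classes plus one extra "generated/fake" class.
    def __init__(self, vocab_size, num_intents, emb_dim=64, n_filters=100):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList([nn.Conv1d(emb_dim, n_filters, k) for k in (2, 3, 4)])
        self.fc = nn.Linear(3 * n_filters, num_intents + 1)

    def forward(self, tokens):
        x = self.emb(tokens).transpose(1, 2)  # (B, emb_dim, T) for 1-D convolution
        feats = [F.relu(c(x)).max(dim=2).values for c in self.convs]  # n-gram features
        return self.fc(torch.cat(feats, dim=1))  # (B, num_intents + 1) class logits

def discriminator_step(disc, opt_d, real_tokens, real_intents, fake_tokens, num_intents):
    # Real labeled utterances are pushed toward their true intent; generator
    # samples are pushed toward the extra "fake" class, so the classifier sees
    # more sequences than the labeled set alone provides.
    fake_label = torch.full((fake_tokens.size(0),), num_intents, dtype=torch.long)
    loss = (F.cross_entropy(disc(real_tokens), real_intents)
            + F.cross_entropy(disc(fake_tokens), fake_label))
    opt_d.zero_grad()
    loss.backward()
    opt_d.step()
    return loss.item()

In this sketch the discriminator's first num_intents outputs give the intent prediction at inference time, so adversarial training and intent classification share one network; the generator update (e.g. via policy gradient on discrete tokens, as in SeqGAN-style models) is omitted for brevity.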
Keywords/Search Tags: natural language understanding, neural networks, deep learning, dialogue system, GAN