Research On Question Classification Combination Model Based On Deep Learning

Posted on: 2019-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2428330548972431
Subject: Software engineering
Abstract/Summary:
An automatic question answering system lets users ask questions in natural language and returns accurate answers. With the development of semantic analysis and related technologies such as natural language processing and big data, and driven by practical applications such as smart customer service, automatic question answering has become a research hotspot. An open-domain question answering system usually consists of three major modules: question understanding, information retrieval, and answer generation. Question understanding is the first step, and Question Classification (QC) is a key part of it. Question classification assigns questions to categories according to the type of answer expected. First, it constrains the candidate answers that must be refined and validated; second, it provides information that can be used to decide which answer selection strategy to apply in later processing. The accuracy of question classification therefore directly affects the strategies chosen in the later stages of the system and, ultimately, the accuracy of the extracted answers. For this reason, question classification is a basic task in research on automatic question answering.

At present, the main solutions to the Chinese question classification task still rely on traditional methods such as Naive Bayes and support vector machines, and studies of deep learning methods remain relatively scarce. In this context, this thesis proposes a combination model based on deep learning. The main work is as follows.

For the Chinese question classification task, this thesis designs a combined model based on deep learning. The model takes character-level vectors and word-level vectors as input at the same time and processes them differently: an LSTM handles the word-level vectors, while convolution and pooling handle the character-level vectors. The model is designed this way for two reasons.

First, the character-level and word-level vectors complement each other. Both are common features in deep learning methods, but most models use only one of them. This thesis argues that vectors at different levels play complementary roles. A word-level vector represents a complete morpheme and carries complete semantic information, but because word segmentation is imperfect, segmentation errors can distort or lose meaning. Character-level vectors avoid segmentation errors entirely. However, because Chinese morphemes are not necessarily single characters, character-level vectors break up the integrity of the morpheme. For this reason the model uses both kinds of vector, so that each compensates for the weakness of the other and the overall performance improves.

Second, the two kinds of vector are processed in different ways. The model uses an LSTM to extract word-level features and a CNN to extract character-level features from Chinese questions, then combines the two, using the combined multi-angle features as the feature vector of the original question.

After the model was implemented, a series of comparative experiments was conducted. The baselines included traditional machine learning methods (Naive Bayes, support vector machines) and basic deep learning methods (convolutional neural networks, recurrent neural networks, etc.). Further comparative experiments examined different ways of processing the word vectors and character vectors. The model achieved 93.13% accuracy on the Harbin Institute of Technology corpus. The experimental results support the rationality, soundness and validity of the model design.

This work introduces deep learning into the question classification field with good results. It provides a feasible approach to the question classification task and serves as a reference for further research on question classification.
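The combined model described in the abstract can be illustrated with a short NumPy sketch. This is a minimal, untrained forward pass under stated assumptions, not the thesis implementation: the abstract does not give layer sizes, filter widths, how the LSTM output is summarized, or the number of classes, so the final hidden state, a single filter width of 3, 6 output classes, and all names (`lstm_last_state`, `conv_max_pool`, `classify`) are hypothetical choices made for illustration. The word segmentation of the example question is also written by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_state(inputs, W, U, b):
    """Single-layer LSTM over inputs of shape (T, d_in); returns the final hidden state (H,)."""
    hidden = U.shape[1]                      # U has shape (4H, H)
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x in inputs:
        z = W @ x + U @ h + b                # stacked gate pre-activations, shape (4H,)
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g                    # update cell state
        h = o * np.tanh(c)                   # update hidden state
    return h

def conv_max_pool(inputs, filters, bias):
    """1-D convolution over positions, ReLU, then max-over-time pooling.

    inputs: (T, d), filters: (n_filters, k, d), bias: (n_filters,). Requires T >= k.
    """
    k = filters.shape[1]
    windows = np.stack([inputs[t:t + k] for t in range(len(inputs) - k + 1)])
    feature_maps = np.einsum("tkd,fkd->tf", windows, filters) + bias
    return np.maximum(feature_maps, 0.0).max(axis=0)

def init_params(n_words, n_chars, n_classes, emb=16, hidden=8, n_filters=6, k=3):
    """Random, untrained parameters; every size here is an illustrative assumption."""
    def r(*shape):
        return rng.normal(0.0, 0.1, size=shape)
    return {
        "word_emb": r(n_words, emb), "char_emb": r(n_chars, emb),
        "W": r(4 * hidden, emb), "U": r(4 * hidden, hidden), "b": np.zeros(4 * hidden),
        "filters": r(n_filters, k, emb), "conv_b": np.zeros(n_filters),
        "out_W": r(n_classes, hidden + n_filters), "out_b": np.zeros(n_classes),
    }

def classify(word_ids, char_ids, p):
    """Word-level LSTM features + character-level CNN features -> softmax over classes."""
    word_feat = lstm_last_state(p["word_emb"][word_ids], p["W"], p["U"], p["b"])
    char_feat = conv_max_pool(p["char_emb"][char_ids], p["filters"], p["conv_b"])
    combined = np.concatenate([word_feat, char_feat])   # combined feature vector
    logits = p["out_W"] @ combined + p["out_b"]
    exp = np.exp(logits - logits.max())                 # numerically stable softmax
    return exp / exp.sum()

# Toy question ("What is Yao Ming's height?"), segmented by hand for illustration.
words = ["姚明", "的", "身高", "是", "多少"]
chars = list("".join(words))
word_vocab = {w: i for i, w in enumerate(dict.fromkeys(words))}
char_vocab = {c: i for i, c in enumerate(dict.fromkeys(chars))}

params = init_params(len(word_vocab), len(char_vocab), n_classes=6)
probs = classify([word_vocab[w] for w in words], [char_vocab[c] for c in chars], params)
print(probs.round(3))
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how the two feature extractors see the same question at different granularities and how their outputs are concatenated before classification.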
Keywords/Search Tags:Question Classification, Deep Learning, LSTM, Convolutional Neural Networks, Character-Level Vectors