
Question Classification Method And Its Application In Question Answering System

Posted on: 2019-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
Full Text: PDF
GTID: 2428330545959669
Subject: Software engineering
Abstract/Summary:
Traditional information retrieval systems take keyword combinations as input and ignore the semantic diversity and linguistic structure of questions. A question answering system accepts natural language questions, finds or infers answers from large amounts of heterogeneous data, and improves users' query efficiency. Question answering has therefore become an inevitable trend as information retrieval technology moves toward humanization and intelligence. The purpose of question analysis is to clarify the user's intention and to locate the right answer effectively. Question analysis is thus one of the core techniques of a question answering system, and question classification is an important part of question analysis. After an in-depth study of current research on Chinese question classification and question answering systems, this thesis proposes question classification methods based on the maximum entropy model and on the bidirectional LSTM (Bi-LSTM) network model. The specific research work is as follows:

(1) A question classification method based on the maximum entropy model. The method represents questions using semantic knowledge such as syntactic structure and word vectors, and studies the impact of lexical, syntactic, and word vector features on the accuracy of coarse-grained question classification. Experimental results show that, compared with the other features, the word vector feature performs well on coarse-grained classification, reaching an accuracy of 88.75%.

(2) A question classification method based on Bi-LSTM. The maximum entropy method requires manual extraction of question features, which introduces a degree of subjectivity. The Bi-LSTM method learns the syntactic and semantic features of questions automatically and avoids the interference caused by human factors. The classification model uses words, parts of speech, and word positions as features. The three feature vectors are fused into a single embedding that serves as the model input; the model output then passes through a max pooling layer to obtain the question feature and a softmax layer to obtain the coarse-grained class. Experimental results show an accuracy of 92.38% on coarse-grained classification.

(3) The application of question classification in a knowledge base question answering system. The thesis uses the Ranking SVM algorithm to rank candidate answers using question classification, similarity, edit distance, and co-occurrence features. Experiments are conducted on the NLPCC2016 open-domain knowledge base question answering dataset. The results show that applying question classification to answer ranking in the knowledge base question answering system helps improve the accuracy of answer recognition: precision reaches 74.49%, recall reaches 83.20%, and average F1 reaches 76.13%.
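The classification head described in item (2) can be illustrated with a minimal sketch in plain Python. It is not the thesis implementation. The abstract does not say how the three feature vectors are fused, so concatenation is assumed here. The Bi-LSTM itself is omitted: the per-token hidden states are toy placeholder values, and the function names, weights, and biases are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def fuse_features(word_vec: Sequence[float],
                  pos_vec: Sequence[float],
                  loc_vec: Sequence[float]) -> Vector:
    """Fuse word, part-of-speech, and position vectors for one token.

    Assumption: fusion is plain concatenation; the abstract does not
    specify the fusion operator.
    """
    return list(word_vec) + list(pos_vec) + list(loc_vec)


def max_pool_over_time(hidden_states: Sequence[Sequence[float]]) -> Vector:
    """Take the element-wise maximum over all token positions."""
    if not hidden_states:
        raise ValueError("need at least one token")
    return [max(column) for column in zip(*hidden_states)]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax (subtract the maximum before exp)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(hidden_states: Sequence[Sequence[float]],
             weights: Sequence[Sequence[float]],
             biases: Sequence[float]) -> Vector:
    """Max-pool the hidden states, apply a linear layer, then softmax.

    weights has one row per coarse-grained class (hypothetical values).
    """
    pooled = max_pool_over_time(hidden_states)
    scores = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


if __name__ == "__main__":
    # One fused token embedding: 2-d word, 1-d POS, 1-d position (toy sizes).
    print(fuse_features([0.2, -0.1], [1.0], [0.0]))

    # Placeholder for Bi-LSTM outputs: 3 tokens, 4 hidden dimensions.
    hidden = [[0.1, -0.4, 0.3, 0.0],
              [0.7, 0.2, -0.1, 0.5],
              [-0.2, 0.6, 0.4, 0.1]]
    # Hypothetical linear layer for two coarse-grained classes.
    weights = [[0.5, -0.3, 0.8, 0.1],
               [-0.2, 0.9, 0.0, 0.4]]
    biases = [0.0, 0.1]
    print(classify(hidden, weights, biases))
```

Max pooling turns a variable-length question into a fixed-size feature vector, which lets the softmax layer score questions of any length with the same weights.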
Keywords/Search Tags:Question parsing, Question classification, Maximum Entropy Model, Bi-LSTM, Ranking SVM, Knowledge base question answering system