Research On Document-Based Question Answering

Posted on: 2019-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: Z Zeng
Full Text: PDF
GTID: 2428330548466895
Subject: Computer application technology
Abstract/Summary:
The Document-Based Question Answering (DBQA) task takes a question posed in natural language and, for each question, searches a set of candidate documents in a text library for the most relevant answer and returns it. The difficulty is that the information in the document base is unstructured and the knowledge is not limited to a single field. DBQA is one of the research hotspots in natural language processing and has been widely applied in practice.

The traditional approach first filters candidate answers with information retrieval techniques, then uses machine learning methods to rank the answer documents and return the best answer. In recent years, researchers have applied deep learning to the task. As deep learning has become widely used in natural language processing, more and more scholars use distributed word vectors and deep learning methods to match the relevance of texts, and accuracy on the task has continued to improve.

The work done in this thesis is as follows:

1) Document-based question answering is essentially the task of calculating the correlation between questions and answers. How to represent short texts and fully exploit their semantic features is the key to improving answer accuracy. In recent years, many scholars have combined deep learning models with distributed word representations to propose different types of neural network models for question answering and answer selection.

2) This thesis proposes a convolutional neural network model with a co-occurrence matrix, which adds co-occurrence information about the question and answer texts to a discriminative deep learning model. This approach can provide richer semantic features. The method is evaluated on the NLPCC 2017 DBQA task, and the experimental results show that it improves the evaluation metrics of the document-based question answering system.

In addition, this thesis combines distributed word vectors and character-level vectors to represent question texts and answer texts. Multi-layer convolutional neural networks abstract and combine the semantic features of the short texts, and the abstracted features are then used by a neural network model to score question and answer document pairs. The model was validated on the NLPCC 2017 DBQA Chinese dataset and on Wiki-QA, a standard English document-based question answering dataset. Compared with models that use only word vectors or only character-level vectors, the model's accuracy improved.
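
The abstract does not say how the co-occurrence matrix is built. As a minimal sketch, assuming it is a binary exact-match matrix over tokens (entry (i, j) is 1 when the i-th question token equals the j-th answer token), it might look like the following. The function names `cooccurrence_matrix` and `question_coverage` and the sample sentence pair are hypothetical illustrations, not taken from the thesis or its datasets.

```python
from typing import List


def cooccurrence_matrix(question: List[str], answer: List[str]) -> List[List[int]]:
    """Return M with M[i][j] = 1 when question[i] == answer[j], else 0.

    Assumption: exact token match; the thesis may define co-occurrence differently.
    """
    return [[1 if q_tok == a_tok else 0 for a_tok in answer] for q_tok in question]


def question_coverage(matrix: List[List[int]]) -> float:
    """Fraction of question tokens that occur at least once in the answer."""
    if not matrix:
        return 0.0
    return sum(1 for row in matrix if any(row)) / len(matrix)


# Hypothetical tokenised question/answer pair (not from NLPCC 2017 or Wiki-QA).
question = ["who", "wrote", "hamlet"]
answer = ["hamlet", "was", "written", "by", "shakespeare"]

matrix = cooccurrence_matrix(question, answer)
for token, row in zip(question, matrix):
    print(token, row)
print("coverage:", question_coverage(matrix))
```

In a model of the kind described, a matrix like this could plausibly be supplied to the convolutional network as an additional input alongside the word and character embeddings. How the thesis actually integrates it is not stated in the abstract.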
Keywords/Search Tags: DBQA, deep learning, NLP