
The Design And Implementation Of Question Answering System Based On Retrieval And Answer Generation Hybrid

Posted on: 2020-06-23
Degree: Master
Type: Thesis
Country: China
Candidate: X S Li
Full Text: PDF
GTID: 2428330572496538
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology and the spread of the Internet, anyone can easily obtain huge amounts of information online. However, this mass of data also brings information explosion and information overload: people are lost in an ocean of data, and locating precisely the information they need becomes increasingly difficult. Search engines have alleviated these problems to some extent. However, traditional search engines do not mine deeper semantic information, and users still have to go through the returned results manually, which is time-consuming and laborious. Question answering systems offer a new solution for locating information precisely: users express their information needs in natural language, and the system directly returns accurate and concise answers.

Currently, the models commonly used in question answering systems are mainly retrieval models and answer generation models. Retrieval-based models are simple and highly interpretable. However, because the corpus size is limited, the range of questions they can answer is limited, and they mine the semantics of the question insufficiently. Answer generation models generate answers by mining the deep semantics of the question, so the answers are not constrained by the size of the corpus. However, these models are not interpretable and tend to generate generic, monotonous answers, so the rationality and consistency of the answers cannot be guaranteed.

The technology a question answering system needs depends on the form of its corpus data. The most common forms are free text and question-answer pairs. Therefore, this thesis designs two question answering systems, one for each data form. In addition, to address the problems of the retrieval model and the answer generation model, this thesis proposes a hybrid model based on retrieval and answer generation that combines the advantages of both. The main contributions of this thesis are as follows:

1. Based on practical application requirements, a retrieval-based question answering system for free text is designed and implemented on top of the open-source search engine Solr and a learning-to-rank model. It searches for answers using the question as the query.

2. To combine the advantages of the retrieval model and the answer generation model, a hybrid question answering system for question-answer pairs is designed and implemented, based on the retrieval model and the Seq2Seq model. The retrieval model first indexes the question-answer pairs in the corpus. Given a new question, it retrieves the most similar questions, and the answers to those similar questions are used as candidate answers to the original question. All candidate answers are then reranked in ascending order of score by a Seq2Seq-based reranking model, and the answer with the lowest score is taken as the retrieval model's answer. If the score of this answer is lower than the confidence threshold, the answer is returned directly. Otherwise, the answer is generated directly by the Seq2Seq model.

3. The rationality and effectiveness of the two systems are verified by experiments on a self-built dataset, the InsuranceQA dataset, and the Ubuntu Dialogue Corpus dataset, respectively.
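
The control flow of the hybrid system described in the second contribution can be illustrated with a minimal sketch. It is not the thesis implementation. The names `jaccard`, `retrieve_candidates`, `hybrid_answer`, `toy_score` and `toy_generate` are hypothetical, a token-overlap similarity stands in for the Solr index, and a toy scoring function stands in for the Seq2Seq-based reranking model. The only assumptions taken from the abstract are that lower scores are better, that the lowest-scoring candidate is returned when its score is below the confidence threshold, and that generation is the fallback.

```python
from typing import Callable, List, Tuple

QAPair = Tuple[str, str]  # (question, answer)


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for a real search index."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def retrieve_candidates(question: str, corpus: List[QAPair], top_k: int = 3) -> List[str]:
    """Return the answers attached to the top_k most similar stored questions."""
    ranked = sorted(corpus, key=lambda qa: jaccard(question, qa[0]), reverse=True)
    return [answer for _, answer in ranked[:top_k]]


def hybrid_answer(
    question: str,
    corpus: List[QAPair],
    score_fn: Callable[[str, str], float],  # lower score = better candidate
    generate_fn: Callable[[str], str],      # fallback generator
    threshold: float,
    top_k: int = 3,
) -> Tuple[str, str]:
    """Return (answer, source), where source is 'retrieval' or 'generation'."""
    candidates = retrieve_candidates(question, corpus, top_k)
    if candidates:
        # Rerank in ascending order of score and keep the lowest-scoring answer.
        best = min(candidates, key=lambda ans: score_fn(question, ans))
        if score_fn(question, best) < threshold:
            return best, "retrieval"
    # No candidate is confident enough: fall back to generation.
    return generate_fn(question), "generation"


# Toy stand-ins (assumptions for illustration only).
def toy_score(question: str, answer: str) -> float:
    return 1.0 - jaccard(question, answer)


def toy_generate(question: str) -> str:
    return "I am not sure."


corpus = [
    ("how do i reset my password", "Use the reset link on the login page."),
    ("what is the refund policy", "Refunds are issued within 30 days."),
]
print(hybrid_answer("how do i reset my password", corpus, toy_score, toy_generate, threshold=2.0))
print(hybrid_answer("how do i reset my password", corpus, toy_score, toy_generate, threshold=0.5))
```

With a loose threshold the retrieved answer is returned, and with a strict one the call falls through to the generator. In a real system the threshold would presumably be tuned on held-out data.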
Keywords/Search Tags:Question answering system, IR, Seq2Seq, Learn to rank