Research And Implementation Of Question Answering System Based On Unstructured Text

Posted on:2020-10-25

Degree:Master

Type:Thesis

Country:China

Candidate:Q Liu

GTID:2428330575957078

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of the mobile internet and big data, unstructured web pages and documents in various vertical fields have accumulated quickly. As an advanced form of information retrieval, automatic question answering over unstructured text analyzes the user's real intent and extracts clean, accurate answers from retrieved documents, and it has become a research hotspot in recent years. However, most published work still has several problems: 1) In the Q&A scenario, question length and document length are seriously unbalanced, and the information retrieval module lacks fine-grained semantic similarity matching, which makes it difficult to meet precise retrieval requirements; 2) In the Chinese context, mainstream machine reading comprehension models have not been fully verified, and there is room for improvement in performance; 3) Current automatic question answering technology for large-scale unstructured text is not yet mature, and there are relatively few platforms for a given vertical field.

This thesis focuses on the key technologies of document retrieval and answer extraction in automatic question answering over unstructured text, optimizes the algorithms, and implements a system. The main research work includes:

(1) A semantic similarity matching model (Deep-HAN-Matching) based on a hierarchical attention mechanism is proposed to address the difficulty of semantic similarity matching caused by the length imbalance between query and document in question answering systems. The model uses attention to abstract and extract features layer by layer, from the word level to the sentence level. On the open WikiQA dataset, it substantially outperforms common baseline models.

(2) A machine reading comprehension model (BiDAF-GCN-SelfAtt) based on a gated convolutional neural network and a self-attention mechanism is proposed to address the difficulty BiDAF has with context representation and with fusing interactive matching features when modeling long text. On DuReader, ROUGE-L and BLEU-4 improve by 2.8% and 5.2% respectively compared with the baseline model.

(3) The proposed algorithms are integrated into an automatic question answering system over unstructured text in the field of clinical medicine. Experiments show that the two proposed models apply well to labeled clinical medical datasets. In addition, Top-1 accuracy on the test set of the 2018 Clinical Medical Professional Examination is significantly improved compared with the baseline system.
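For readers unfamiliar with ROUGE-L, one of the two metrics reported on DuReader, the following is a minimal sketch of how a sentence-level ROUGE-L F-score can be computed from the longest common subsequence (LCS) of a candidate answer and a reference answer. It is an illustration only, not the thesis's or DuReader's evaluation script: the function names `lcs_length` and `rouge_l`, the whitespace tokenization, and the default `beta=1.0` are all assumptions made here.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    # prev[j] holds the LCS length of the tokens of `a` seen so far and b[:j].
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: Sequence[str], reference: Sequence[str],
            beta: float = 1.0) -> float:
    """Sentence-level ROUGE-L F-score (beta is an assumed weighting)."""
    if not candidate or not reference:
        return 0.0
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)  # share of candidate tokens in the LCS
    recall = lcs / len(reference)     # share of reference tokens in the LCS
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


# Hypothetical example sentences, tokenized on whitespace.
candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
print(round(rouge_l(candidate, reference), 4))  # prints 0.8333
```

In the example, the LCS is "the cat on the mat" (5 tokens), so precision and recall are both 5/6. Chinese text such as DuReader would need a word or character segmentation step in place of `split()`.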

Keywords/Search Tags:

unstructured text, automatic question answering, reading comprehension, information retrieval, attention mechanism

Related items

1	Automatic Question Answering Method Based On Retrieval And Machine Reading Comprehension
2	Research On Reading Comprehension Method Of Automatic Question Answering For Civil Aviation Customer Service
3	Research On Machine Reading Comprehension Model For Question Answering System
4	Research On Deep Learning-based Multi-document Passage Ranking Methods For Question Answering System
5	Research On Reading Comprehension Style Question And Answering Model Based On Attention Mechanism And Neural Network
6	Research On The Factoid Question Answering Based On Attention Pooling Mechanism And External Knowledge
7	Design And Implementation Of Question Answering System Based On Machine Reading Comprehension
8	Design And Implementation Of After-sales Question And Answer System Based On Machine Reading Comprehension
9	Design And Implementation Of Knowledge Base Question Answering System Based On Reading Comprehension
10	Research On Machine Reading Comprehension And Textual Question Answering