Chinese Question Answering Model Based On Paragraph Selection

Posted on: 2021-01-15  Degree: Master  Type: Thesis
Country: China  Candidate: H Y Liao  Full Text: PDF
GTID: 2428330647457040  Subject: Computer technology
Abstract/Summary:
In the era of big data, web data is growing explosively, and it is difficult to obtain the needed information effectively. Compared with traditional web search engines, a question answering (QA) system lets people query in natural language and returns concise answers efficiently and accurately. Despite the rapid development of QA systems, several problems remain: (1) QA models built on mainstream pre-trained language models are large; (2) in multi-document and long-text retrieval, statistical algorithms lack fine-grained semantic matching, while QA techniques based on the attention mechanism retrieve answers inefficiently; (3) datasets may suffer from uneven sample distribution and a lack of data. To address these problems, this thesis proposes a framework for a Chinese multi-document QA system based on paragraph selection and fine-tunes a lightweight pre-trained language model. In the document retrieval module, a paragraph selection algorithm performs coarse-grained text filtering. In the answer extraction module, text augmentation is used to improve the robustness of the model, and ALBERT is fine-tuned to extract answers. Experimental results on the DuReader dataset developed by Baidu show that the paragraph selection algorithm and the text augmentation algorithm both improve the accuracy of the model. ALBERT-base compresses the model size, reduces the training time, and improves the framework's performance. When 5 paragraphs are selected and ALBERT is fine-tuned, the ROUGE-L score of the proposed model, using the F1 algorithm and the text augmentation algorithm, is 1.09 points higher than that of the baseline model. These results can improve the accuracy of QA systems and expand their application scenarios.
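The coarse-grained paragraph selection step can be illustrated with a minimal sketch. The abstract does not say how the F1 score is computed, so the version below assumes a character-level overlap F1 between the question and each candidate paragraph (characters are a common unit for Chinese text when no word segmenter is used) and keeps the k highest-scoring paragraphs in their original order. The function names `char_f1` and `select_paragraphs` are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def char_f1(question: str, paragraph: str) -> float:
    """Character-level overlap F1 between a question and a paragraph.

    Assumption: overlap is counted on non-whitespace characters; the
    thesis may use a different unit (e.g. segmented words).
    """
    q_counts = Counter(ch for ch in question if not ch.isspace())
    p_counts = Counter(ch for ch in paragraph if not ch.isspace())
    # Multiset intersection: each shared character counts min(q, p) times.
    overlap = sum((q_counts & p_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(p_counts.values())
    recall = overlap / sum(q_counts.values())
    return 2 * precision * recall / (precision + recall)


def select_paragraphs(question: str, paragraphs: list[str], k: int = 5) -> list[str]:
    """Keep the k paragraphs with the highest F1, in document order."""
    ranked = sorted(
        enumerate(paragraphs),
        key=lambda pair: char_f1(question, pair[1]),
        reverse=True,
    )
    kept_indices = sorted(index for index, _ in ranked[:k])
    return [paragraphs[index] for index in kept_indices]
```

With k = 5, this corresponds to the "selecting 5 paragraphs" setting reported above; the selected paragraphs would then be concatenated and passed to the answer extraction model.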
Keywords/Search Tags: Natural Language Processing, Question Answering, Pre-trained Language Model, Paragraph Selection