
Research On Intelligent Question Answering Technology For Long Documents

Posted on: 2024-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liao
Full Text: PDF
GTID: 2568307151460484
Subject: Computer Science and Technology
Abstract/Summary:
With the spread of the Internet and the development of search engines, access to text information has become increasingly convenient. However, the growing volume of text is often rich in content but excessively long, which leaves readers facing redundant information and difficulty finding what they need. Using intelligent question answering technology to retrieve and read long documents allows people to obtain the answers they need more quickly and accurately. This thesis studies the application of intelligent question answering to two successive tasks on long documents: document retrieval and span-extraction machine reading comprehension. The main research work is as follows.

Document retrieval is the task of finding, within a document collection, the documents that may contain the correct answer to a given question. Because the important information in long documents is sparse, traditional retrieval models cannot adequately capture useful matching information, which leads to poor results. To solve this problem, this thesis proposes a retrieval model based on a key sentence extraction mechanism. The model consists of two modules: key sentence retrieval and evaluation. The retrieval module searches the full text for the key sentences relevant to the current question and assembles them into a key short text; the evaluation module matches the question against the short text and scores the result. In this way, the model can capture important information across long distances and so better retrieve documents that are highly relevant to the question. The proposed document retrieval model is analyzed experimentally on the question answering dataset HotpotQA and the text matching dataset TREC Robust 2004, which verifies the effectiveness and superiority of the model.

Span-extraction machine reading comprehension is the task of extracting the correct answer to a question from a specific document. Existing extractive machine reading comprehension methods for long documents generally first segment the document with a sliding window, then predict an answer for each paragraph separately, and finally select the highest-scoring answer as the final answer. This approach is simple, but the answer from the highest-scoring paragraph is not necessarily the best answer for the whole text. To solve this problem, this thesis proposes a machine reading comprehension model based on an answer fusion mechanism. The model consists of a paragraph reading module and a text reading module. First, the paragraph reading module predicts a regional answer for each paragraph; then the text reading module predicts a global answer on the key short text composed of the regional answers; finally, a voting strategy selects the final answer from among all the answers. In this way, the answers from all paragraphs are judged by a uniform standard on the same text, so that the best answer in the whole text can be selected. The proposed machine reading comprehension model is compared and analyzed experimentally on the machine reading comprehension dataset TriviaQA and the multi-turn dialogue dataset QuAC, which verifies the effectiveness and superiority of the model.
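
The control flow of the two mechanisms described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not specify the models, so the token-overlap scorer, the majority-vote rule, the `reader` callable, and every function and parameter name here are assumptions made for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(question, text):
    """Assumed relevance score: fraction of question tokens found in text."""
    q = set(tokenize(question))
    if not q:
        return 0.0
    return len(q & set(tokenize(text))) / len(q)

def key_short_text(question, document, top_k=2):
    """Retrieval module (sketch): keep the top_k most relevant sentences,
    in their original document order, as the key short text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    ranked = sorted(range(len(sentences)),
                    key=lambda i: overlap_score(question, sentences[i]),
                    reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:top_k]))

def score_document(question, document, top_k=2):
    """Evaluation module (sketch): match the question against the short text."""
    return overlap_score(question, key_short_text(question, document, top_k))

def sliding_windows(tokens, size, stride):
    """Split tokens into overlapping windows of at most `size` tokens."""
    windows, start = [], 0
    while True:
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
        start += stride
    return windows

def vote(answers):
    """Assumed voting rule: majority vote, ties go to the earliest answer."""
    if not answers:
        return ""
    counts = Counter(answers)
    return max(answers, key=lambda a: counts[a])

def fuse_answers(question, paragraphs, reader):
    """Answer fusion (sketch). `reader(question, text)` is a hypothetical
    callable that returns an answer string, or "" when it finds none."""
    regional = [reader(question, p) for p in paragraphs]      # paragraph reading
    short_text = " ".join(a for a in regional if a)           # key short text
    global_answer = reader(question, short_text)              # text reading
    candidates = [a for a in regional + [global_answer] if a]
    return vote(candidates)
```

In this sketch, `score_document` would be called once per candidate document to rank a collection, and `fuse_answers` would be called on the paragraphs produced by `sliding_windows`. A trained matching model and a trained reader would replace `overlap_score` and `reader` in any realistic setting.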
Keywords/Search Tags: intelligent question answering, long documents, document retrieval, key short text, machine reading comprehension