A Research On Question Answering Algorithm Based On Complex Structured Text

Posted on: 2022-11-25    Degree: Master    Type: Thesis
Country: China    Candidate: X L Xia    Full Text: PDF
GTID: 2518306764476724    Subject: Automation Technology
Abstract/Summary:
Question answering is a major research focus in natural language processing and has important practical applications. A user poses a question in natural language, and the model returns the correct answer by analyzing the semantics of the question and the content of the text. As data and application scenarios grow more complex, however, existing systems that handle only plain text or only tables can no longer complete the task. The introduction of datasets with complex text structure, together with improved retrieval and reading comprehension algorithms, offers a new direction for such systems.

For complex structured text, most existing question answering systems have three shortcomings: (1) They simplify complex structured text into plain text or pure tables, so the structural information in the structured data cannot be fully used. (2) They still rely on traditional matching methods such as TF-IDF, BM25, and string matching, which adapt poorly to complex settings, for example when texts are semantically similar but worded differently. (3) Complex questions often require multiple hops to reach the answer, but existing systems can hop only in a single, limited direction and cannot reuse previously retrieved information when hopping.

To address these problems, this thesis studies question answering over complex structured text on the Hybrid QA dataset from three directions (datasets, baseline models, and question answering algorithms) and builds question answering algorithms from two perspectives.

First, the thesis proposes a question answering model based on the AIR algorithm. The model processes the complex structured text so that the textual information is preserved while as much of the table's structural information as possible is retained, and it uses the AIR algorithm to make retrieval more accurate and to handle multi-hop questions. The AIR algorithm matches the question against candidate texts semantically using word embeddings, which handles the matching of semantically similar texts well. Its iterative matching procedure retrieves candidate texts over several iterations to build candidate paths and locates the coordinates of the target text, completing the multi-hop task. The thesis then uses a BERT model to encode and rank the candidate paths and select the best one, and a BERT model for reading comprehension to extract the answer.

Second, the thesis proposes a question answering model based on the TAPAS algorithm. By modeling the table directly, this model makes good use of the table's structural information and can compare data across rows and columns. By learning features from the directly encoded table, it obtains the coordinates of the cell containing the target answer directly, avoiding multiple hops over text.

Experiments on the Hybrid QA dataset show that both question answering models achieve good results on complex structured text.
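The embedding-based, iterative matching described for the AIR-based model can be illustrated with a small sketch. This is not the thesis's implementation: the embedding table `EMB`, the 0.7 coverage threshold, the hop limit, and the query-expansion rule (uncovered question terms plus the tokens of the last retrieved passage) are all simplifying assumptions made for illustration, and the function names are hypothetical. The toy data shows two points from the abstract: "director" and "filmmaker" share no characters that a string matcher could use yet still align, and the second hop reuses information from the first retrieved passage.

```python
import math

# Hypothetical toy word embeddings (assumption: real systems would load
# pretrained vectors instead of hand-written ones).
EMB = {
    "director":   (1.0, 0.0, 0.0, 0.0, 0.0),
    "filmmaker":  (0.9, 0.1, 0.0, 0.0, 0.0),
    "born":       (0.0, 1.0, 0.0, 0.0, 0.0),
    "birthplace": (0.0, 0.8, 0.0, 0.6, 0.0),
    "nolan":      (0.0, 0.0, 1.0, 0.0, 0.0),
    "london":     (0.0, 0.0, 0.0, 1.0, 0.0),
    "actor":      (0.2, 0.0, 0.0, 0.0, 1.0),
    "award":      (0.0, 0.2, 0.0, 0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_match(term, passage):
    """Highest similarity between one query term and any passage token."""
    return max(cosine(EMB[term], EMB[tok]) for tok in passage)


def alignment_score(query, passage):
    """Sum of each query term's best soft match in the passage."""
    return sum(best_match(term, passage) for term in query)


def iterative_retrieve(question, passages, threshold=0.7, max_hops=3):
    """Pick one passage per hop until every question term is covered.

    Returns the list of selected passage indices (the candidate path).
    """
    path = []
    uncovered = list(question)  # question terms not yet explained
    query = list(question)      # query used for the current hop
    while uncovered and len(path) < min(max_hops, len(passages)):
        candidates = [i for i in range(len(passages)) if i not in path]
        best = max(candidates, key=lambda i: alignment_score(query, passages[i]))
        path.append(best)
        # A question term counts as covered once some token aligns well.
        uncovered = [t for t in uncovered
                     if best_match(t, passages[best]) < threshold]
        # Assumed expansion rule: carry the retrieved passage into the next hop.
        query = uncovered + passages[best]
    return path


question = ["director", "born"]
passages = [
    ["filmmaker", "nolan"],            # e.g. a table row
    ["nolan", "birthplace", "london"], # e.g. a linked text passage
    ["actor", "award"],                # distractor
]
print(iterative_retrieve(question, passages))
```

In this toy setup the first hop selects the row that mentions the filmmaker, and the second hop uses the name found there to reach the passage about the birthplace. A ranking and reading comprehension stage, which the abstract assigns to BERT, would then operate on the resulting path.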
Keywords/Search Tags: Table Question Answering, Reading Comprehension, Text Matching, Natural Language Processing