
Research On Methods Of Machine Reading Comprehension Of Unstructured Chinese News Text

Posted on: 2022-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: S L Xu
Full Text: PDF
GTID: 2518306569994569
Subject: Computer Science and Technology
Abstract/Summary:
Machine reading comprehension (MRC), a key research direction in artificial intelligence, is widely used in search engines, dialogue systems, and other fields. Its purpose is to let a machine read text as a human does and then answer questions based on its understanding of that text. MRC tasks take various forms. This thesis focuses on span-extraction MRC, in which the program locates the answer at a specific position in the text, given a question and a context. Under the deep learning framework, this kind of task usually comprises four key modules: embedding, feature extraction, context-question interaction, and answer prediction. This thesis focuses on the feature extraction and context-question interaction modules.

In terms of data sets, Chinese resources for specific domains are still lacking. To address this, this thesis constructs a new Chinese data set, News-CMRC. The data set targets reading comprehension of Chinese news and increases the language diversity of MRC resources.

For the feature extraction module, this thesis proposes an intention-based answer screening model. First, a deep learning method performs multi-label classification of the question to obtain the question intention. Then the intention is matched against the named entities in the text to find the text fragments that carry that intention, and the qualifying fragments are selected as candidate answers. The model is tested on the News-CMRC and SQuAD data sets. The experimental results show that the screening process effectively reduces the influence of irrelevant context information on the answers and improves accuracy on some types of questions.

For the context-question interaction module, this thesis proposes a multi-attention model built on the intention screening model. Specifically, the question intention is combined with the question text to form a tagged question, which is then fed into the attention model together with the text information. The model uses BERT as its foundation and uses a multi-attention mechanism and a feed-forward neural network to encode the intention-tagged question and the context. The experimental results show that, compared with untagged questions, the proposed model captures effective information better and improves accuracy on some types of questions.

Based on this research, the proposed models are applied to Chinese news reading comprehension, and a news machine reading comprehension system is designed and implemented. The system uses the News-CMRC data set as training data and is driven by the intention-based answer screening model and the multi-attention model. Given a context, the system extracts a continuous fragment from the text as the answer to the user's question, implementing the complete workflow.
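The screening and tagging steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The cue-word classifier is a toy stand-in for the deep learning multi-label classifier, and the names `INTENT_CUES`, `INTENT_TO_ENTITY_TYPES`, `Entity`, `classify_intentions`, `screen_candidates`, and `tag_question`, as well as the entity labels and the bracketed tag format, are all assumptions made for illustration. Named entities are assumed to be annotated already.

```python
from dataclasses import dataclass

# Hypothetical cue words per intention. The thesis uses a deep learning
# multi-label classifier; this keyword lookup is only a stand-in.
INTENT_CUES = {
    "person": ("谁", "who"),
    "location": ("哪里", "where"),
    "time": ("何时", "什么时候", "when"),
}

# Hypothetical mapping from an intention to the entity labels that can answer it.
INTENT_TO_ENTITY_TYPES = {
    "person": {"PER"},
    "location": {"LOC"},
    "time": {"TIME"},
}


@dataclass(frozen=True)
class Entity:
    text: str   # surface form of the entity
    label: str  # entity type (assumed label set: PER, LOC, TIME)
    start: int  # character offset where the span begins
    end: int    # character offset where the span ends (exclusive)


def classify_intentions(question):
    """Return the set of intentions whose cue words occur in the question."""
    q = question.lower()
    return {
        intent
        for intent, cues in INTENT_CUES.items()
        if any(cue in q for cue in cues)
    }


def screen_candidates(question, entities):
    """Keep only the entities whose type matches a question intention."""
    wanted = set()
    for intent in classify_intentions(question):
        wanted |= INTENT_TO_ENTITY_TYPES[intent]
    if not wanted:
        # No recognised intention: do not screen anything out.
        return list(entities)
    return [e for e in entities if e.label in wanted]


def tag_question(question):
    """Prefix the question with its intentions to form a tagged question."""
    tags = "".join(f"[{intent}]" for intent in sorted(classify_intentions(question)))
    return tags + question


context = "2022年6月13日，张伟在北京发表讲话。"
entities = [
    Entity("2022年6月13日", "TIME", 0, 10),
    Entity("张伟", "PER", 11, 13),
    Entity("北京", "LOC", 14, 16),
]

question = "谁在北京发表讲话？"
candidates = screen_candidates(question, entities)
tagged = tag_question(question)
```

In this sketch a "who" question keeps only the person entity as a candidate span, and the tagged question is what would be paired with the context before encoding. How the thesis actually represents the tag inside the BERT input is not stated in the abstract.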
Keywords/Search Tags: deep learning, natural language understanding, machine reading comprehension, intention recognition, span extraction