
Research On Machine Reading Comprehension Algorithms Based On Deep Learning

Posted on: 2024-07-18 | Degree: Master | Type: Thesis
Country: China | Candidate: C Y Wu | Full Text: PDF
GTID: 2568307073468334 | Subject: Computer Science and Technology
Abstract/Summary:
With the continuous development of artificial intelligence technology, machine reading comprehension has attracted considerable attention from both academia and industry as an emerging research direction. Its central goal is to enable computers to understand natural language text and return accurate answers to given questions. Deep learning-based machine reading comprehension methods have made great breakthroughs in recent years: compared with traditional machine learning methods, deep learning methods typically stack multiple layers of neural networks that extract textual features at different levels. The general architecture for extractive machine reading comprehension contains four functional layers: an embedding representation layer, an encoding layer, an attention layer, and an answer prediction layer. However, existing models tend to model either local or global interactions between texts; failing to attend to both local and global structure easily leads to inadequate semantic understanding and inaccurate answers. To address these problems, this thesis investigates and analyzes existing reading comprehension models and improves them to capture deeper semantic information in text. The specific work is as follows:

1. Theoretical research on machine reading comprehension and deep learning. We study the principles of various neural networks and attention mechanisms and, on that basis, reproduce two representative approaches to extractive machine reading comprehension from recent years: the stochastic answer network model and the SpanBERT pre-trained language model. We examine their network structures and training methods in depth, laying the foundation for the subsequent structural improvements.

2. Research on machine reading comprehension methods that integrate convolution and attention mechanisms. A traditional LSTM encoder loses contextual information because the current input and hidden state are processed independently; to address this, a variant of the LSTM network is introduced in the encoding layer to enhance information perception and interaction across the context and obtain richer semantic representations. Since the attention mechanism alone cannot effectively extract the local structure of the text, which hinders the model's understanding of deep semantics, the attention layer combines dynamic convolutional attention with multiple attention mechanisms to capture textual structure at different scales. In addition, to better handle the span-extraction task, a span-level dynamic convolutional attention mechanism is proposed by extending the dynamic convolutional attention structure. Experiments on the public dataset SQuAD 1.1 show that both the improved model and the extended structure improve the machine's ability to answer questions.

3. Research on machine reading comprehension methods based on pre-trained language models and attention mechanisms. Because traditional word representations cannot adequately express the semantics of words, the pre-trained language model SpanBERT is used in the embedding representation layer to obtain vector representations. To model text structure effectively, the contribution of different attention mechanisms to the model is investigated in the attention layer, where two are purposefully employed: dynamic convolutional attention, in which convolution and attention jointly capture the local and global structure of the text; and a synthetic attention mechanism, which captures deep semantic information while improving training and inference speed. Experimental validation on the public datasets SQuAD 1.1 and SQuAD 2.0 demonstrates that the resulting model effectively improves the machine's ability to answer questions.
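The distinction the abstract draws between local and global structure can be illustrated with a minimal sketch: standard scaled dot-product attention lets every token attend to every other token (global), while masking scores outside a fixed window restricts attention to a local neighborhood, which is the kind of locality that convolution contributes. This is an illustrative toy in NumPy, not the thesis's actual dynamic convolutional attention; the window size, shapes, and function names are assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    # scaled dot-product attention over a single sequence
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        # positions where mask is False receive ~zero attention weight
        scores = np.where(mask, scores, -1e9)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def local_mask(n, window=2):
    # True where |i - j| <= window: each token sees only its neighborhood
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :]) <= window

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))          # 6 tokens, hidden size 4
out_global, w_global = attention(x, x, x)                      # global structure
out_local, w_local = attention(x, x, x, local_mask(6, window=1))  # local structure
```

Combining both views (e.g., summing or gating the two outputs) is one simple way to model local and global interactions at once, in the spirit of the approach the abstract describes.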
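The answer prediction layer of an extractive model typically scores each passage position as a potential answer start and end, then selects the highest-scoring valid span (start before end, length bounded). A minimal sketch of that selection step, assuming the model has already produced per-token start and end logits (the function name and the length bound are illustrative, not from the thesis):

```python
import numpy as np

def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) maximizing start_logits[i] + end_logits[j]
    over valid spans with i <= j < i + max_answer_len."""
    n = len(start_logits)
    best, best_score = (0, 0), -np.inf
    for i in range(n):
        for j in range(i, min(n, i + max_answer_len)):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score, best = score, (i, j)
    return best

# Toy logits for a 4-token passage: start peaks at 1, end peaks at 2,
# so the predicted answer span is tokens 1..2.
start = np.array([0.1, 2.0, 0.3, 0.0])
end = np.array([0.2, 0.1, 1.5, 0.4])
print(best_span(start, end))  # (1, 2)
```

Production systems usually vectorize this search and also handle the "no answer" case (needed for SQuAD 2.0) by comparing the best span score against a null-answer score, but the core span-selection logic is the same.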
Keywords/Search Tags:Machine reading comprehension, Deep learning, Attention mechanism, Pre-trained language models