
Design And Implementation Of Improved Self-attention Based Machine Reading Comprehension

Posted on: 2020-11-06
Degree: Master
Type: Thesis
Country: China
Candidate: L Yao
Full Text: PDF
GTID: 2428330590483067
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the advent of the Web 3.0 era, a large amount of text data has been generated on the Internet. Machine reading comprehension technology based on deep learning plays an important role in retrieving this text data, and it is also a significant part of artificial intelligence. Machine reading comprehension has therefore received the attention of many scholars in recent years. The task is to find the answer to a given question within a given article.

Because the machine reading comprehension task in this thesis faces multi-language input, the Chinese text data must first be segmented into words. We propose a Chinese word segmentation model, Attention-CRF, based on recurrent neural networks and conditional random fields. With the conditional random field layer, the transition probability is included in the cost function, so the segmentation result follows the character-level principle while also taking long-range information into consideration, which ultimately makes the predicted answer more accurate.

In addition, this thesis introduces a machine reading comprehension model based on an improved attention mechanism. Aiming at the deficient text encoding ability of the existing BiDAF model, we use a novel self-attention mechanism to enhance its encoding ability and propose the self-BiDAF model. First, each word in the article and the question is converted into a word embedding, which is then processed by a recurrent neural network to obtain a new vector representation that captures the correlation between words. Based on the vector representations of the article and the question, the model computes article-to-question attention and question-to-article attention, decomposing the machine reading comprehension problem into a word similarity calculation. Finally, a self-attention mechanism is used to self-match the text information and recompute the text embeddings. Experimental results on the SQuAD and DuReader datasets show that the proposed method outperforms state-of-the-art algorithms in terms of both precision and recall.
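The listing contains no source code, so the following is only a minimal, hypothetical PyTorch-style sketch of how a linear-chain CRF cost can include transition probabilities on top of per-character emission scores from a recurrent encoder, in the spirit of the Attention-CRF description above. All function names, tag conventions (B/M/E/S), and tensor shapes are illustrative assumptions, not the author's actual implementation.

    # Hypothetical sketch (not the thesis code): scoring a character-level tag
    # sequence with RNN emission scores plus a learned transition matrix.
    import torch

    def crf_path_score(emissions, tags, transitions):
        # emissions:   (seq_len, num_tags) scores from the recurrent encoder
        # tags:        (seq_len,) gold B/M/E/S tag indices
        # transitions: (num_tags, num_tags) learned transition scores
        score = emissions[0, tags[0]]
        for t in range(1, emissions.size(0)):
            score = score + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
        return score

    def crf_log_partition(emissions, transitions):
        # Forward algorithm: log-sum-exp over all tag paths, needed for the
        # negative log-likelihood cost that the abstract refers to.
        alpha = emissions[0]                               # (num_tags,)
        for t in range(1, emissions.size(0)):
            # previous alpha + transition score + current emission, broadcast
            alpha = torch.logsumexp(alpha.unsqueeze(1) + transitions, dim=0) + emissions[t]
        return torch.logsumexp(alpha, dim=0)

    def crf_nll(emissions, tags, transitions):
        # Cost = log Z - score(gold path): minimizing it raises the probability
        # of the gold segmentation relative to all alternative tag paths.
        return crf_log_partition(emissions, transitions) - crf_path_score(emissions, tags, transitions)

Including the transition term in the cost is what lets the model penalize illegal tag sequences (e.g. "B" followed directly by "S") rather than scoring each character independently.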
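Similarly, a compact sketch of the article-to-question / question-to-article attention and the self-matching step described above could look as follows. The dot-product similarity, the fusion layout, and the tensor shapes are assumptions made for illustration rather than the thesis' exact formulation.

    # Hypothetical sketch of BiDAF-style bidirectional attention followed by a
    # self-matching (self-attention) step over the passage representation.
    import torch
    import torch.nn.functional as F

    def bidaf_attention(context, question):
        # context:  (T, d) contextual embeddings of the article words
        # question: (J, d) contextual embeddings of the question words
        S = context @ question.t()                        # (T, J) word-pair similarity
        c2q = F.softmax(S, dim=1) @ question              # article-to-question: (T, d)
        b = F.softmax(S.max(dim=1).values, dim=0)         # (T,) question-to-article weights
        q2c = (b.unsqueeze(0) @ context).expand(context.size(0), -1)   # tiled to (T, d)
        # fuse the original context with both attention views
        return torch.cat([context, c2q, context * c2q, context * q2c], dim=1)

    def self_match(g):
        # g: (T, 4d) fused representation; attend the passage against itself and
        # recompute each position's embedding from the whole passage.
        A = F.softmax(g @ g.t(), dim=1)                   # (T, T) self-attention weights
        return torch.cat([g, A @ g], dim=1)               # (T, 8d)

The self-matching step is what lets each article position draw on evidence from distant parts of the same passage, which is the encoding deficiency of the original BiDAF that the self-BiDAF model is said to address.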
Keywords/Search Tags:Machine reading comprehension, Self-Attention mechanism, Chinese word segmentation, Conditional random fields