Research and Implementation of Chinese Machine Reading Comprehension Based on Deep Learning

Posted on: 2022-08-15; Degree: Master; Type: Thesis
Country: China; Candidate: H L Chen
GTID: 2518306575967129; Subject: Computer technology
Abstract/Summary:
Machine reading comprehension is an active research topic in natural language processing. Its goal is to enable machines to understand text semantics, reason over text, extract information, and answer related questions. With the continuous development of deep learning and the release of large-scale machine reading comprehension datasets, many strong models have been proposed and refined, advancing the field. However, most existing machine reading comprehension models still have the following problems: traditional word vector generation techniques capture semantic information poorly and are inefficient; when BiLSTM is used to encode and model the question and document, training is costly on tasks with large amounts of data, and the semantics of long-distance context are not encoded well; and the semantic information of the question and the document does not interact and fuse sufficiently, so the model cannot find the parts of the document that help answer the question.

To address these problems, this thesis uses the DuReader2.0 dataset and improves the embedding layer, encoding layer, and modeling layer of the DuReader baseline model BiDAF, constructing a Chinese machine reading comprehension model based on Dilated Compositional Units and the self-attention mechanism. The resulting model has both a low time cost and high accuracy in answering questions. The main research work of this thesis is as follows:

First, the model is improved. To address the problems of existing models, Dilated Compositional Units replace BiLSTM for encoding the question and document, the self-attention mechanism replaces BiLSTM for modeling the semantics of the question and document, and pre-trained word vectors replace randomly initialized word vectors for embedding the question and the document.

Second, to verify the effectiveness of the improved model, this thesis conducts comparative experiments on the large-scale Chinese machine reading comprehension dataset DuReader2.0. The experimental results show that, compared with the DuReader baseline models BiDAF and Match-LSTM, the improved model performs better in both time cost and question-answering accuracy.

Third, Python web technology is used to design and implement a web-based Chinese machine reading comprehension system. Built on the improved model, the system follows the design principles of high cohesion and low coupling, implements its sub-modules with the Django framework, and has been tested from multiple angles.
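The abstract does not give the exact formulation of the self-attention modeling layer. As a rough illustration only, the following Python sketch shows generic scaled dot-product self-attention over a short sequence of token vectors. It assumes identity query/key/value projections (no learned weights), and the names `softmax`, `self_attention`, and `tokens` are hypothetical rather than taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Generic scaled dot-product self-attention (illustrative assumption).

    x: list of token vectors, each a list of floats of equal length.
    Queries, keys, and values are the inputs themselves (identity
    projections); a trained model would use learned projection matrices.
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted average of all token vectors.
        outputs.append(
            [sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)]
        )
    return outputs, weights


# Hypothetical toy input: three 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Because every token attends to every other token in a single step, distant positions interact directly, which is the usual motivation for preferring self-attention over a recurrent layer when long-distance context matters.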
Keywords/Search Tags: machine reading comprehension, deep learning, dilated compositional units, self-attention mechanism, pre-trained word vector