Research And Implementation Of Grammatical Error Correction Based On Recurrent Neural Network

Posted on:2019-05-03

Degree:Master

Type:Thesis

Country:China

Candidate:L Yang

Full Text:PDF

GTID:2348330545961549

Subject:Intelligent Science and Technology

Abstract/Summary:

English is the most widely used international language in the world, and a growing number of learners value it as a second language (English as a Second Language, ESL). However, ESL learners face challenges in listening, speaking, reading and writing because of differences in culture, geography and living habits. Of these, writing is the most important and the most difficult, because learner writing may contain many grammatical errors. English Grammatical Error Correction (GEC) is therefore extremely important for both English learners and English teachers.

In this thesis, we propose a sequence labeling model based on a recurrent neural network to solve the sequence labeling problem for ESL corpora, which are full of grammatical errors. We then propose a GEC method based on the sequence labeling model and a GEC method based on a sequence-to-sequence model.

First, our sequence labeling model raises part-of-speech tagging accuracy on the ESL corpus to 96.73%, reaches 97.60% part-of-speech tagging accuracy on the WSJ (Wall Street Journal) corpus, and reaches an F1 of 91.38% on the CoNLL2003 named entity corpus. Then, we apply the sequence labeling model to the GEC task: F1 reaches 38% on determiner error correction, better than UIUC (33.40%), the best result in the CoNLL2013 GEC task, and 28.89% on preposition errors, better than UIUC (7.22%). Finally, we build a sequence-to-sequence model for the GEC task using our sequence labeling model; on the latest CoNLL2014 GEC data it reaches an F0.5 of 31.77% and a recall of 38.92%, better than CAMB (30.10%), the best result in the CoNLL2014 GEC task.

The contributions of this thesis can be summarized as follows:

1. We propose a neural network model that effectively solves the sequence labeling problem. The network maintains high labeling accuracy both on standard corpora such as news text and on non-standard corpora such as English essays written by ESL learners. Unlike previous labeling models, our model uses character-level, word-level and sequence-level information, introduces rough (coarse-grained) supervision, and divides the labeling process into two stages, which makes labeling more robust.

2. We propose a method for detecting and correcting English grammatical errors using our sequence labeling model. This method surpasses the best result in the CoNLL2013 GEC evaluation.

3. We propose a method for detecting and correcting English grammatical errors using our sequence-to-sequence neural network. The encoder of this model comes from our sequence labeling model, and an attention mechanism is introduced into the decoder.

4. We design and implement an English grammatical error detection and correction system that uses both our sequence labeling model and our sequence-to-sequence model.
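The results above are reported as F1 and F0.5 scores. As a reading aid, the following minimal Python sketch shows how an F-beta score is computed from precision and recall over a set of proposed corrections. It is an illustration under stated assumptions, not the thesis's evaluation code: the function names `f_beta` and `score_edits`, the `(start, end, replacement)` edit representation, and the toy edits are all hypothetical, and the official shared-task scoring handles edit matching in more detail than this exact-match comparison.

```python
from typing import Set, Tuple

# Hypothetical edit representation: (start token, end token, replacement text).
Edit = Tuple[int, int, str]


def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily (F0.5); beta = 1 gives F1.
    """
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0.0:
        return 0.0
    return (1 + b2) * precision * recall / denom


def score_edits(proposed: Set[Edit], gold: Set[Edit], beta: float = 0.5):
    """Return (precision, recall, F-beta) using exact edit matching."""
    true_positives = len(proposed & gold)
    precision = true_positives / len(proposed) if proposed else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f_beta(precision, recall, beta)


# Toy data, invented for illustration only.
gold = {(0, 1, "the"), (3, 4, "on"), (6, 7, "has"), (9, 10, "an")}
proposed = {(0, 1, "the"), (3, 4, "in"), (6, 7, "has")}

p, r, f05 = score_edits(proposed, gold, beta=0.5)
_, _, f1 = score_edits(proposed, gold, beta=1.0)
print(f"P={p:.4f} R={r:.4f} F0.5={f05:.4f} F1={f1:.4f}")
```

Because F0.5 weights precision more heavily than recall, a system that proposes fewer but more reliable corrections scores higher under F0.5 than under F1, as the toy example shows.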

Keywords/Search Tags:

grammatical error detection and correction, recurrent neural network, sequence labeling model, seq2seq model, ESL corpus

Related items

1	Research On Chinese Grammatical Error Correction Based On Sequence-to-Sequence Model
2	Research On The Proofreading Method Of Chinese Typos Based On Sequence Labeling Mode
3	Research And Implementation Of Grammar Error Correction Model Based On Deep Learning
4	The Research Of Grammar Error Correction
5	Research On Key Techniques Of Chinese Grammar Error Correction Based On Neural Network
6	Automatic Grammatical Error Detection Technology And Application For Chinese Text
7	Research On Recurrent Neural Network Based Dependency Parsing Model
8	Research On Automatic Detection And Correction Of English Grammar Errors Based On Machine Translation
9	Grammatical Error Correction Based On N-Gram Model And Parsing
10	Research On Sequence Labeling Model Of Natural Language Processing Based On Deep Learning