Grammatical Error Correction Based On N-Gram Model And Parsing

Posted on:2018-09-21

Degree:Master

Type:Thesis

Country:China

Candidate:T Shen

GTID:2348330542952872

Subject:Computer technology

Abstract/Summary:

Electronic text contains many kinds of errors, which is a serious problem for research. Manual correction cannot keep pace with the rapid growth in the volume of electronic text, so automatic error checking and correction of English text is becoming increasingly important. Current Grammatical Error Correction (GEC) methods include rule-based methods, N-gram-based methods, and parsing-based methods, each of which has problems. First, rule-based methods require a large rule base, and as hard rules are added, the rules can conflict with one another, which greatly reduces correction efficiency and accuracy. Second, N-gram methods do not address long-distance dependencies or data sparsity. An N-gram model can only describe local relations within a sentence: when the relevant context is longer than the N-gram length, the algorithm loses its ability to correct the error. Conversely, when N is made large enough to cover long-distance dependencies, the resulting sparse count matrix also renders the algorithm ineffective. Finally, parsing-based methods cannot effectively correct local errors: when the usage of a word is determined by certain local associations, these methods ignore them.

To address these problems, this thesis proposes a grammatical error correction algorithm based on parsing and an N-gram model. The main contributions are as follows. First, long sentences are divided into multiple clauses using dependency parsing; the probability of each clause is then computed with the N-gram model, and the clause probabilities are combined into the probability of the long sentence. Second, LeftBigram, RightBigram, and Trigram are combined to build the N-gram model of the clauses. Finally, an error candidate set and an N-gram scoring method are adopted: for each candidate instance in the set, its N-gram frequencies in the corpus are computed, its score is obtained as the weighted sum of those frequencies, and the highest-scoring candidate in the set is selected. Experimental results show that the GEC method based on parsing and the N-gram model is feasible and effective.
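The candidate-scoring step can be illustrated with a minimal sketch. It is not the thesis implementation: the toy corpus, the weights, and the function names (`ngram_counts`, `score`, `best_candidate`) are all assumptions made for illustration, and the abstract does not state how the weights are chosen. Each candidate word is scored by a weighted sum of its LeftBigram, RightBigram, and Trigram frequencies, and the highest-scoring candidate is kept.

```python
from collections import Counter


def ngram_counts(sentences):
    """Count bigrams and trigrams in a tokenized corpus."""
    bigrams, trigrams = Counter(), Counter()
    for tokens in sentences:
        bigrams.update(zip(tokens, tokens[1:]))
        trigrams.update(zip(tokens, tokens[1:], tokens[2:]))
    return bigrams, trigrams


def score(left, cand, right, bigrams, trigrams, weights=(1.0, 1.0, 2.0)):
    """Weighted sum of LeftBigram, RightBigram and Trigram frequencies.

    The weights are hypothetical; the abstract gives no values.
    """
    w_left, w_right, w_tri = weights
    return (w_left * bigrams[(left, cand)]             # LeftBigram: (left, cand)
            + w_right * bigrams[(cand, right)]         # RightBigram: (cand, right)
            + w_tri * trigrams[(left, cand, right)])   # Trigram: (left, cand, right)


def best_candidate(left, candidates, right, bigrams, trigrams):
    """Return the candidate with the highest weighted n-gram score."""
    return max(candidates, key=lambda c: score(left, c, right, bigrams, trigrams))


# Toy corpus (an assumption; a real system would use a large corpus).
corpus = [
    "she depends on her friends".split(),
    "he depends on his team".split(),
    "it depends on the weather".split(),
    "the book is in the bag".split(),
]
bigrams, trigrams = ngram_counts(corpus)

# Choose the preposition for the slot in "depends ___ the".
choice = best_candidate("depends", ["in", "on", "at"], "the", bigrams, trigrams)
print(choice)
```

Raw frequencies are used here because the abstract describes a weighted sum of frequencies; a smoothed probability estimate would be a different design choice, one that trades this simplicity for better handling of unseen n-grams.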

Keywords/Search Tags:

Grammatical Error Correction, N-gram model, parsing

Related items

1	Research On Grammatical Error Correction Based On Deep Learning
2	Research On Chinese Text Error Correction Based On N-gram And Dependency Parsing
3	Research On Word Error Correction Methods Of Chinese Text
4	Research And Implementation Of Grammar Error Correction Model Based On Deep Learning
5	Research On Error Correction Method Of Chinese Short Text Based On BERT
6	Research On Chinese Grammatical Error Correction Based On Sequence Generation Models
7	Research And Implementation Of Grammatical Error Correction Based On Recurrent Neural Network
8	Research On Automatic Error Correction Of Tibetan Grammar
9	Research On Chinese Grammatical Error Correction Based On Sequence-to-Sequence Model
10	Chinese Grammatical Error Correction Based On Knowledge Graph