Research On Chinese Grammar Error Correction Based On Deep Learning

Posted on:2022-12-26

Degree:Master

Type:Thesis

Country:China

Candidate:Y Feng

Full Text:PDF

GTID:2518306746468754

Subject:Computer Science and Technology

Abstract/Summary:

In the Internet age, massive amounts of text are generated around the world at every moment, and much of it contains errors. Left unproofread, this erroneous data can seriously affect downstream work. Conventional manual proofreading can no longer keep up with the speed at which text is produced today. With the development of deep learning and natural language processing, both academia and industry have taken up research on text error correction. Text errors can be divided into shallow and deep errors: spelling and punctuation errors belong to the former, and grammatical errors belong to the latter. Shallow errors can be corrected with rules and language models, whereas traditional machine-learning-based correction methods perform poorly on deep errors. Deep grammatical error correction is therefore both the core and the main difficulty of current text error correction technology. For this reason, current research focuses on deep learning, training neural network models on large-scale grammatical error correction tasks. This paper first surveys the state of research on text error correction, and then, building on existing deep-learning-based methods for automatic correction of Chinese text, proposes a feasible method for Chinese grammatical error correction. The main work of this paper is as follows:

(1) The research background and significance of automatic text error correction are explained, research progress on Chinese and English text error correction is analyzed and summarized, and related work is introduced.
(2) For Chinese grammatical error correction, an error correction model based on the UniLM framework is proposed. It operates at word granularity, is initialized with pre-trained model parameters, and is fine-tuned on a task-specific corpus.
(3) A framework based on the UniLM+CRF model is built for the grammatical error labeling task, marking possible grammatical errors in Chinese text.
(4) A seq2seq framework based on UniLM is built for the grammatical error correction task, correcting possible grammatical errors in Chinese text.
(5) The corpus is preprocessed. The public dataset from NLPCC 2018 Shared Task 2, Grammatical Error Correction (GEC), is cleaned, denoised, segmented, and stripped of stop words, among other operations, to improve dataset quality and help improve model training accuracy.
(6) A method based on edit operation sets is proposed for generating samples for the grammatical error labeling task, providing label data for its training samples. Grammatical errors are divided into four categories, S (substituted word), R (redundant word), M (missing word), and W (word-order error), and errors in the text are marked by category.
(7) The experimental results are analyzed and summarized, and model performance is evaluated with Precision, Recall, and F-value. The corresponding metrics are computed with the public scorer.
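The label generation described in item (6) can be illustrated with a minimal sketch. The abstract does not say how the edit operation set is computed, so everything below is an assumption: it uses Python's standard `difflib.SequenceMatcher` to align an erroneous token sequence with its correction, the function name `label_errors` is hypothetical, the choice to attach an M label to the token that follows the gap is an illustrative convention, and the word-order check only recognizes a single reordered span. It is not the thesis's actual procedure.

```python
from collections import Counter
from difflib import SequenceMatcher


def label_errors(source, target):
    """Give each source token one label: O (correct), S, R, M, or W.

    Hypothetical helper: `source` is the erroneous token list and
    `target` is its corrected version.
    """
    labels = ["O"] * len(source)
    if not source:
        return labels

    # Strip the common prefix and suffix to isolate the changed middle span.
    shortest = min(len(source), len(target))
    prefix = 0
    while prefix < shortest and source[prefix] == target[prefix]:
        prefix += 1
    suffix = 0
    while suffix < shortest - prefix and source[-1 - suffix] == target[-1 - suffix]:
        suffix += 1
    mid_src = source[prefix:len(source) - suffix]
    mid_tgt = target[prefix:len(target) - suffix]

    # Assumed heuristic for W: same tokens, different order, in one span.
    if mid_src and mid_src != mid_tgt and Counter(mid_src) == Counter(mid_tgt):
        for i in range(prefix, len(source) - suffix):
            labels[i] = "W"
        return labels

    # Otherwise map alignment edit operations to S, R, and M.
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "replace":      # source tokens were swapped for others
            for i in range(i1, i2):
                labels[i] = "S"
        elif tag == "delete":     # source tokens are absent from the target
            for i in range(i1, i2):
                labels[i] = "R"
        elif tag == "insert":     # the target has tokens the source lacks
            # Assumption: mark the token after the gap (or the last token).
            labels[min(i1, len(source) - 1)] = "M"
    return labels


if __name__ == "__main__":
    print(label_errors(["he", "go", "to", "school"],
                       ["he", "goes", "to", "school"]))
```

Run as a script, this prints `['O', 'S', 'O', 'O']`. Per-token labels of this shape are the kind of target a sequence labeling model with a CRF layer is typically trained on; a real pipeline would also need a policy for sentences that mix reordering with other error types, which this sketch does not handle.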

Keywords/Search Tags:

Grammar Error Correction, Transformer, UniLM, BERT, Attention Mechanism, Pre-training

Related items

1	Research On Chinese Text Sentiment Analysis Based On Transformer And BERT Model
2	Research On Text Classification Based On Self-Attention Mechanism
3	Research And Implementation Of Grammar Error Correction Model Based On Deep Learning
4	Research Of Chinese Text Correction Based On Neural Machine Translation
5	Research On Language Model Rescoring And Error Correction Of Transcription Results In Chinese Speech Transcription
6	Research And Design Of A Kind Of Hierarchical Language Model For English Grammar Error Correction
7	Research On Chinese Text Error Correction Method Based On Deep Learning
8	Chinese Grammar Error Analysis Based On Deep Learning And Its System Implementation
9	Research On Improved Text Representation Model Based On BERT
10	The Research Of Grammar Error Correction