Automatic Proofreading Of English Text With Rich Information

Posted on:2022-04-17

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Z H Liu

GTID:1488306746956639

Subject:Computer Science and Technology

Abstract/Summary:

Text proofreading is an important step in publishing. It provides text review for individuals, enterprises, and government departments, ensuring that published articles are grammatically and semantically correct and preventing the spread of misinformation. However, proofreading is meticulous work, and manual proofreading often suffers from omissions and low efficiency. How to automatically proofread text at the grammatical and semantic levels is therefore an important research problem in the NLP community.

This work addresses two core tasks, grammatical error correction and fact verification, to automatically proofread English text with pre-trained language models. It integrates rich information, such as linguistic knowledge, world knowledge, and domain-specific knowledge, to help the proofreading model check grammatical and factual errors in text. To solve the problems in automatic proofreading of English text with rich information, this work systematically carries out the following four studies.

First, this work leverages grammatical error correction models to generate correction evidence for grammatical error detection models. It compares general language model pre-training methods with different pre-training strategies for grammatical error correction and determines the optimal pre-training strategy for grammatical error correction models. It then filters the noisy training corpus and further trains the grammatical error correction models to improve their performance. Finally, the well-trained grammatical error correction model uses beam search decoding to provide several correction results to the grammatical error detection model, annotating potential grammatical errors and assisting detection.

To integrate proofreading evidence from the grammatical error correction model, the world knowledge base, and a domain-specific knowledge base, this work proposes two models that fuse multiple pieces of proofreading evidence for text error detection: a grammatical error detection model with multiple grammatical error correction results, and a fine-grained fact verification model with multi-evidence reasoning. These models assist the two proofreading tasks of grammatical error detection and fact verification. Both take into account the characteristics of proofreading at the grammatical and semantic levels, and each uses a different method to extract proofreading clues from rich information that help pre-trained language models detect text errors. In addition, the grammatical error detection model can further improve grammatical error correction models through quality estimation.

To address fact verification in a specific domain, this work proposes an enhanced pre-trained language model with improved language modeling and text reasoning ability in that domain. The method uses two continuous training strategies that train language models on domain-specific data, helping them learn domain-specific word semantics and improving fact verification performance in the domain.
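The idea of using several correction results as evidence for error detection can be illustrated with a minimal sketch. It is not the model described in the dissertation: it assumes the correction hypotheses are already available as plain strings (in practice they would come from beam search over a correction model), and it uses a simple token alignment from Python's `difflib` as a stand-in for a learned detector. The function name `flag_tokens` and the example sentences are hypothetical.

```python
from difflib import SequenceMatcher


def flag_tokens(source, hypotheses):
    """Score each source token by the fraction of hypotheses that edit it.

    `hypotheses` is assumed to be a list of corrected sentences, e.g. the
    n-best output of beam search from a correction model (not shown here).
    """
    src = source.split()
    votes = [0] * len(src)
    for hyp in hypotheses:
        matcher = SequenceMatcher(a=src, b=hyp.split(), autojunk=False)
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
            if tag in ("replace", "delete"):
                # Source tokens i1..i2-1 were changed or removed.
                for i in range(i1, i2):
                    votes[i] += 1
            elif tag == "insert" and src:
                # Attribute an insertion to the nearest source token.
                votes[min(i1, len(src) - 1)] += 1
    total = max(len(hypotheses), 1)
    return [(tok, count / total) for tok, count in zip(src, votes)]


if __name__ == "__main__":
    scores = flag_tokens(
        "He go to school yesterday",
        [
            "He went to school yesterday",
            "He goes to school yesterday",
            "He go to school yesterday",
        ],
    )
    for token, score in scores:
        print(f"{token}\t{score:.2f}")
```

Tokens that most hypotheses change receive a high score and can be treated as potential errors. A learned detection model would consume the hypotheses as additional input instead of counting edits.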

Keywords/Search Tags:

Text Proofreading, Rich Information, Grammatical Error Correction, Grammatical Error Detection, Fact Verification

Related items

1	Automatic Grammatical Error Detection Technology And Application For Chinese Text
2	Research On Error Correction Method Of Chinese Short Text Based On BERT
3	Research On Grammatical Error Correction Based On Deep Learning
4	Research On Automatic Detection And Correction Of English Grammar Errors Based On Machine Translation
5	Research On Statistical Analysis And Automatic Recognition Of Grammatical Errors In Modern Chinese
6	Research On Word Error Correction Methods Of Chinese Text
7	Research And Implementation Of Grammatical Error Correction Based On Recurrent Neural Network
8	Research On Automatic Error Correction Of Tibetan Grammar
9	Research On Chinese Grammatical Error Correction Based Sample Enhancement
10	Research On Chinese Text Error Correction