
Research On The Semantic Coherence Of Texts Oriented To Composition

Posted on: 2022-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: M H Xia
Full Text: PDF
GTID: 2518306572450934
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of intelligent computing, teaching aids continue to evolve, and intelligent teaching software for student compositions has received widespread attention. Discourse semantic coherence measures whether the sentences of a text are formally cohesive and whether its meaning flows smoothly. It is an important dimension of automatic composition scoring and an important research direction in composition scoring technology. This thesis addresses the task of discourse semantic coherence in the composition domain. Based on the structural and semantic characteristics of compositions, it integrates paragraph information into text-level semantic information extraction, combines coherence modeling with a sentence ordering task to study the semantic coherence of compositions in depth, and explores sentence ordering as a pre-training task to improve the scoring of composition semantic coherence.

First, this thesis proposes a text-level semantic vector extraction technique that incorporates paragraph information. At the sentence level, a pre-trained model generates sentence embedding vectors. The sequence of sentence embeddings is then fed into a structured Transformer model, so that context information is integrated into the vector representations through the self-attention mechanism. The paragraph structure of the text is expressed as a matrix that serves as the component prior matrix, supplying paragraph structure information to the structured Transformer model, which then produces the text embedding vector. Experimental results show that the text embedding vector fused with paragraph information effectively improves performance on text coherence classification.

Second, this thesis proposes a discourse coherence model that integrates a sentence ordering task. The model is trained on the sentence ordering task and the discourse coherence task at the same time. It samples sentence pairs by randomly shuffling the sentence order of the text, and it is trained to distinguish whether the order of each sampled pair is consistent with the pair's relative order in the original text. During training, a multi-task optimization method based on gradient normalization is used to improve the training effect. Experiments show that sentence ordering converges as an auxiliary task and can compensate for the shortcomings of the Transformer model's sinusoidal position encoding in the coherence task, and that the model achieves the best current results on the English TOEFL data set.

Finally, this thesis proposes a sentence pre-training model based on the sentence ordering task to achieve domain adaptation for discourse coherence tasks. The method uses a large-scale middle school composition corpus collected in this research and pre-trains sentence vectors with the sentence ordering task, so that they capture the context information and the inter-sentence relationship information of the text. Experiments show that the method achieves good results on the Chinese composition data set and can, to a certain extent, alleviate the scarcity of data and the difficulty of labeling in the discourse coherence task.
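
As an illustration of two ideas described in the abstract, the following minimal Python sketch shows one way a paragraph structure could be written as a matrix and one way ordered sentence pairs could be sampled from a shuffled text. The abstract does not give the exact form of either, so everything here is an assumption: the function names `paragraph_prior_matrix` and `sample_order_pairs`, the block-diagonal 0/1 encoding of paragraphs, and the labeling convention (1 when the shuffled order of a pair matches its original order) are hypothetical choices, not the thesis's implementation.

```python
import random


def paragraph_prior_matrix(paragraph_lengths):
    """Build an n x n 0/1 matrix over sentences.

    Entry [i][j] is 1 when sentences i and j belong to the same paragraph.
    This block-diagonal encoding is an assumption made for illustration.
    """
    # Map each sentence index to the index of the paragraph that contains it.
    paragraph_of = [
        p for p, length in enumerate(paragraph_lengths) for _ in range(length)
    ]
    n = len(paragraph_of)
    return [
        [1 if paragraph_of[i] == paragraph_of[j] else 0 for j in range(n)]
        for i in range(n)
    ]


def sample_order_pairs(num_sentences, num_pairs, rng):
    """Shuffle sentence indices and sample labeled sentence pairs.

    Each sample is (first, second, label), where `first` precedes `second`
    in the shuffled text and both are original sentence indices. The label
    is 1 when `first` also precedes `second` in the original text, else 0.
    Requires num_sentences >= 2.
    """
    shuffled = list(range(num_sentences))
    rng.shuffle(shuffled)
    samples = []
    for _ in range(num_pairs):
        # Pick two distinct positions in the shuffled text, left to right.
        a, b = sorted(rng.sample(range(num_sentences), 2))
        first, second = shuffled[a], shuffled[b]
        samples.append((first, second, 1 if first < second else 0))
    return samples


if __name__ == "__main__":
    # A text with two paragraphs: two sentences, then one sentence.
    for row in paragraph_prior_matrix([2, 1]):
        print(row)
    # Ten labeled pairs drawn from a five-sentence text.
    for sample in sample_order_pairs(5, 10, random.Random(0)):
        print(sample)
```

In a real model the matrix would presumably bias or mask self-attention scores, and the labeled pairs would feed an auxiliary classification loss alongside the coherence loss. How the thesis actually does either is not stated in the abstract.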
Keywords/Search Tags: Semantic coherence, Text vectorization, Multi-task learning, Domain adaptation, Transfer learning