Research And Design Of Automatic English Essay Scoring Algorithm Based On Machine Learning

Posted on: 2022-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: K Y Guo
Full Text: PDF
GTID: 2518306326483464
Subject: Master of Engineering
Abstract/Summary:
Automatic scoring of English essays has received increasing attention as a way to address the subjectivity, time cost, and slow feedback of manual scoring by English teachers, as well as the tight schedules and heavy workloads of scoring in large-scale English examinations. Since the first English essay scoring system was introduced in 1966, research in this area has become more fine-grained and task-oriented with the development of natural language processing and machine learning. Current work focuses on two problems: extracting features for a single scoring dimension of an essay, and improving the performance of the overall scoring model. Progress on both has been relatively slow because the size of the available annotated data and the research methods constrain each other. This thesis proposes a new approach to feature extraction for essay prompt adherence (topic relevance), and an improved method that addresses the limitations of existing automatic English essay scoring methods. The main work is as follows:

1. Existing methods for the prompt adherence dimension have two weaknesses: those that extract full-text semantic information ignore the fine-grained semantics of the essay content, and those that extract key information ignore the global semantics of the essay. In addition, there is a technical gap between research on commercial English essay scoring systems and publicly available research on English essay scoring. To address these problems, this thesis proposes a method for calculating an essay's prompt adherence score based on the richness of topics within the essay. First, a topic segmentation model is built using model transfer, and the essay is segmented at topic granularity. Then, a semantic vector representation of the essay is produced at topic granularity. Finally, the a-ave score formula proposed in this thesis is used to obtain the prompt adherence score of the essay. Experiments on the ASAP dataset show that the scores calculated by the proposed method are consistent with the results obtained under the combined influence of multiple variables, which indicates that the method is valid.

2. Scoring methods based on manual features have difficulty extracting deep semantic features, while scoring methods based on neural networks have difficulty extracting shallow features such as word count, so each kind of method is limited. To address this, this thesis proposes an English essay scoring method that combines manual feature extraction with deep learning, drawing on existing methods and on the prompt adherence score calculation proposed in this thesis. The method uses manually designed features to extract shallow features at the word and sentence levels, uses the proposed prompt adherence score to extract deep semantic features, draws on existing methods to extract further semantic features of the essay, and then regresses on the deep and shallow features to obtain the total score. The correlation between the predicted and true total scores was measured with the Pearson correlation coefficient. With the combined method, the mean Pearson value reached 0.815 across the ASAP dataset, the Level 4 modal machine assessment dataset, and the batching.com machine assessment dataset. This is 0.068 and 0.17 higher, respectively, than the mean values of 0.747 and 0.645 for baseline models such as BiLSTM and RNN, which demonstrates the effectiveness of the proposed method.
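Two of the pieces named in the abstract can be illustrated with a small sketch: shallow features at the word and sentence levels, and the Pearson correlation used to compare predicted and true total scores. The feature set, the function names `shallow_features` and `pearson`, and the sample scores below are assumptions made for illustration. The thesis's actual features, its a-ave formula, and its neural components are not described here and are not reproduced.

```python
import math
import re


def shallow_features(essay: str) -> dict:
    """Hypothetical word- and sentence-level features (not the thesis's feature set)."""
    words = re.findall(r"[A-Za-z']+", essay)
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative scores only; these are not data from the thesis.
predicted = [2.0, 3.5, 3.0, 4.5, 5.0]
human = [2.0, 3.0, 3.5, 4.0, 5.0]
r = pearson(predicted, human)

# Gains implied by the mean Pearson values reported in the abstract.
gain_vs_0_747 = round(0.815 - 0.747, 3)
gain_vs_0_645 = round(0.815 - 0.645, 3)
```

The two gain values are simple differences of the reported means and match the 0.068 and 0.17 stated in the abstract. In a full pipeline, the dictionary returned by `shallow_features` would be concatenated with the deep semantic features before the regression step.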
Keywords/Search Tags:Automated Essay Scoring, Feature Engineering, Adherence Prompt, Deep Learning, Model Transfer, Natural Language Processing