Research And Design Of The Automated English Essay Scoring Algorithm

Posted on:2019-07-14

Degree:Master

Type:Thesis

Country:China

Candidate:H K Liu

Full Text:PDF

GTID:2428330542997757

Subject:Software engineering

Abstract/Summary:

In recent years, as artificial intelligence has spread across many fields, automated scoring of English essays has attracted considerable attention and made steady progress. However, there has been no major breakthrough in the representation of text content. Traditional representations mostly rely on latent semantic analysis, which can extract only topic information and ignores word-level information. This paper therefore proposes two text content representation methods, one based on word vector clustering and one based on the vector space model. Together they characterize the meaning of the words in a text and also take into account how well the essay conforms to its topic. On this basis, the paper develops an automated essay scoring algorithm based on word vectors and multi-model fusion.

To better characterize text content, the paper proposes a representation method based on word vector clustering. First, a word2vec model is trained on the English Wikipedia corpus. The trained model then generates word vectors for the words of the text to be scored, and these vectors are clustered. Statistics of the words that fall under each cluster serve as the content features of the text. In addition, the representation method based on the vector space model is used to judge how well a student's essay conforms to the topic: the keywords of the text are extracted with the vector space model, and topic-related features are generated from them. The paper also uses lexical and syntactic features as non-text features to evaluate essay quality at the word and sentence level.

Using the extracted text and non-text features, the predictions of three machine learning models (Random Forest, GBDT, XGBoost) are then linearly fused to give the final prediction. Finally, the paper validates the model on the Automated Essay Scoring data set from Kaggle, an international data mining competition platform. On the test set, the quadratic weighted Kappa of the proposed algorithm is higher than that of the first-place entry in the Automated Essay Scoring competition on Kaggle, which verifies the effectiveness of the algorithm.
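
The cluster-based content feature described in the abstract can be illustrated with a minimal sketch. It assumes the word vectors and cluster centroids have already been obtained (for example from a trained word2vec model followed by a clustering step); the names `nearest_cluster`, `cluster_histogram`, the toy two-dimensional vectors, and the choice of relative frequency as the per-cluster statistic are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def nearest_cluster(vector, centroids):
    """Return the index of the centroid closest to `vector` (Euclidean distance)."""
    best_index, best_distance = 0, float("inf")
    for index, centroid in enumerate(centroids):
        distance = math.dist(vector, centroid)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def cluster_histogram(tokens, word_vectors, centroids):
    """Share of an essay's known words that falls into each cluster.

    tokens       -- list of word strings from one essay
    word_vectors -- dict mapping lowercase word -> vector (assumed pre-trained)
    centroids    -- list of cluster centres (assumed already fitted)
    """
    counts = Counter()
    for token in tokens:
        vector = word_vectors.get(token.lower())
        if vector is None:
            continue  # skip out-of-vocabulary words
        counts[nearest_cluster(vector, centroids)] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in range(len(centroids))]


# Toy data: two clusters in a 2-D space (illustrative values only).
toy_vectors = {"cat": (0.0, 0.1), "dog": (0.1, 0.0), "car": (1.0, 0.9)}
toy_centroids = [(0.0, 0.0), (1.0, 1.0)]
features = cluster_histogram(["Cat", "dog", "car", "zzz"], toy_vectors, toy_centroids)
```

The resulting list has one entry per cluster and could be passed to a downstream regressor alongside other features; the thesis may use different per-cluster statistics.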
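
The linear fusion step and the quadratic weighted Kappa metric mentioned in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the function names, the fusion weights, and the rounding of fused scores to integer ratings are hypothetical, and the three prediction lists stand in for the outputs of Random Forest, GBDT, and XGBoost models.

```python
def fuse_linear(predictions, weights):
    """Weighted average of per-model predictions (one list of scores per model)."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total_weight = sum(weights)
    n_items = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total_weight
        for i in range(n_items)
    ]


def to_rating(score, min_rating, max_rating):
    """Round a continuous score and clamp it to the valid rating range."""
    return min(max(int(round(score)), min_rating), max_rating)


def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Agreement between two integer rating lists with quadratic penalties."""
    n_ratings = max_rating - min_rating + 1
    if n_ratings < 2:
        return 1.0  # only one possible rating, agreement is trivial
    n_items = len(rater_a)
    # Confusion matrix of observed rating pairs.
    observed = [[0] * n_ratings for _ in range(n_ratings)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_ratings)) for j in range(n_ratings)]
    numerator = 0.0
    denominator = 0.0
    for i in range(n_ratings):
        for j in range(n_ratings):
            weight = (i - j) ** 2 / (n_ratings - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n_items  # chance agreement
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator if denominator else 1.0


# Hypothetical outputs of three models for four essays, and human scores.
model_predictions = [[1.2, 2.1, 2.9, 3.8], [0.9, 2.2, 3.2, 4.1], [1.1, 1.8, 3.1, 3.9]]
human_scores = [1, 2, 3, 4]
fused = fuse_linear(model_predictions, weights=[0.3, 0.3, 0.4])
ratings = [to_rating(s, 1, 4) for s in fused]
kappa = quadratic_weighted_kappa(human_scores, ratings, 1, 4)
```

In practice the fusion weights would be tuned on held-out data; equal or hand-picked weights are used here only to keep the sketch short.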

Keywords/Search Tags:

Essay Scoring, Word Vector, Clustering, Vector Space, Machine Learning

Related items

1	Research On Automated English Essay Scoring Using Text Categorization
2	Research And Design Of Automatic English Essay Scoring Algorithm Based On Machine Learning
3	Research On Some Problems Of Support Vector Machine Learning Algorithm
4	Text Classification Based On Word Vector And Topic Vector
5	Research And Implementation On Part-Of-Speech Tagging In Automatic English Essay Scoring
6	Design And Implementation Of Automated Scoring System For English Non-essay Writing Questions
7	Research On Automated Essay Scoring Method For Junior High School English
8	The Key Technology Research On Automated Essay Scoring
9	The Design And Implementation Of Scoring-assistant System For Chinese Essay-type Questions
10	Improved Vector Space Model And Its Application To Document Classification System