
Research On Automated English Essay Scoring Using Text Categorization

Posted on: 2010-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: B Li
GTID: 2178360275958667
Subject: Computer application technology
Abstract/Summary:
The College English Test (CET) is a large-scale standardized test held twice a year in China. Because the marking workload is heavy, the accuracy and objectivity of scoring are difficult to guarantee, and the problem grows more serious as the number of candidates rises sharply. At present, automated marking is applied to objective questions such as multiple-choice and gap-filling items, which has greatly reduced the marking workload. Automated essay scoring, however, still requires research. Building on prior work on the factors that affect essay scoring, feature extraction methods, text categorization, and related topics, we propose an approach to automated essay scoring based on text categorization. First, we extract essays on the same topic from the "Chinese Learner English Corpus" and divide them into categories according to their scores. Second, we extract content and linguistic features from the essays and build a vector space model from those features. Content-based features consist of words and phrases, selected by the document frequency, information gain, and χ² statistic methods under different thresholds. Linguistic features include superficial features (such as the number of words in an essay, the number of sentences, and word length) and complex features (such as syntactic structure and part of speech). Third, guided by the experimental results, we choose appropriate essay features and classification methods, and then classify each test essay into the appropriate category using different classifiers, such as Naive Bayes, K Nearest Neighbors, and Support Vector Machine. Finally, we combine the outputs of the component classifiers through voting and stacking to obtain the final result. The experimental results show that combination is an effective way to improve on the performance of the component classifiers, and that automated essay scoring using text categorization is feasible.
Keywords/Search Tags: automated essay scoring, feature selection, text categorization, classifier combination, vector space model
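
Two of the steps named in the abstract, χ² feature selection and combination of classifiers by voting, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`chi_square`, `select_terms`, `majority_vote`), the toy essays, the score categories "high" and "low", and the threshold value are all assumptions made for illustration, and the tie-breaking rule in the vote is an arbitrary choice.

```python
from collections import Counter

def chi_square(docs, labels, term, category):
    """Chi-square statistic for one term/category pair (2x2 contingency table)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_cat = label == category
        if has_term and in_cat:
            a += 1          # term present, essay in category
        elif has_term:
            b += 1          # term present, essay in another category
        elif in_cat:
            c += 1          # term absent, essay in category
        else:
            d += 1          # term absent, essay in another category
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom

def select_terms(docs, labels, threshold):
    """Keep terms whose best chi-square score over all categories meets the threshold."""
    vocab = set().union(*docs)
    cats = set(labels)
    return sorted(
        t for t in vocab
        if max(chi_square(docs, labels, t, c) for c in cats) >= threshold
    )

def majority_vote(predictions):
    """Combine component classifier outputs; ties go to the earliest listed label."""
    counts = Counter(predictions)
    best = max(counts.values())
    return next(p for p in predictions if counts[p] == best)

# Hypothetical toy data: each essay is a set of tokens with a score category.
docs = [
    {"education", "benefit"},
    {"education", "society"},
    {"thing", "good"},
    {"thing", "bad"},
]
labels = ["high", "high", "low", "low"]

selected = select_terms(docs, labels, threshold=2.0)
# Stand-ins for the outputs of three component classifiers on one test essay.
final_label = majority_vote(["high", "low", "high"])
```

In this sketch the selected terms would become the dimensions of the vector space model, and `majority_vote` would receive one predicted category per component classifier. Stacking, which the abstract also mentions, would instead train a further classifier on those predictions and is not shown here.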