A Research Of Essay Automatic Scoring System Based On Natural Language Processing

Posted on: 2016-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: C Wang
Full Text: PDF
GTID: 2348330476455325
Subject: Information and Communication Engineering
Abstract/Summary:
An automatic essay scoring system scores English essays using computer technology. It is a complex system that draws on statistics, natural language processing, linguistics, and information retrieval. Automatic essay scoring systems such as E-rater are already widely used, but in China research on automatic essay scoring is still at an early stage. With the growth of online education, measuring knowledge mastery requires automated tools. Traditional manual grading by teachers becomes increasingly difficult as the number of online learners rises. Compared with manual scoring, automatic scoring is faster, fairer, and more economical.

First, the paper develops an automatic essay scoring system based on an open-source project from EDX. In this system, essay scoring is treated as a text classification task, and the system uses a gradient boosting decision tree classifier. However, the system is imperfect: its features cannot fully reflect the quality of a composition. Worse, it is hard to extend, because adding a new prompt requires new training and test data in order to train a new scoring model. The leading foreign systems currently take writing quality, semantic content, and discourse structure into account automatically, and the number of grammatical errors is an important factor in measuring writing quality. The emphasis of the research therefore moved to grammatical error correction.

The paper then develops a grammatical error correction system based on a language model. A language model server built with the SRILM toolkit is used to look up N-gram probabilities. A candidate word set is obtained for every word in a sentence according to its word stem, and the most likely word combination is then found using the Viterbi algorithm.
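The candidate-set decoding described above can be illustrated with a minimal sketch. It is not the thesis implementation: the bigram table, the probability floor, the stem dictionary, and the names `candidates`, `bigram_logprob`, and `viterbi_correct` are all hypothetical stand-ins, and a real system would query the SRILM-built model instead of a hard-coded dictionary.

```python
import math

# Hypothetical toy bigram table; a real system would query an SRILM-built model.
BIGRAM_LOGPROB = {
    ("<s>", "he"): math.log(0.5),
    ("he", "goes"): math.log(0.4),
    ("he", "go"): math.log(0.01),
    ("goes", "to"): math.log(0.5),
    ("go", "to"): math.log(0.5),
    ("to", "school"): math.log(0.3),
    ("school", "</s>"): math.log(0.5),
}
UNSEEN_LOGPROB = math.log(1e-6)  # crude floor standing in for real smoothing

# Hypothetical candidate sets: surface forms that share a stem.
FORMS_BY_STEM = {"go": ["go", "goes", "went", "going", "gone"]}
STEM_OF = {form: stem for stem, forms in FORMS_BY_STEM.items() for form in forms}


def candidates(word):
    """Return all forms sharing the word's stem, or just the word itself."""
    stem = STEM_OF.get(word)
    return FORMS_BY_STEM[stem] if stem else [word]


def bigram_logprob(prev, word):
    """Look up a bigram log-probability, falling back to the floor value."""
    return BIGRAM_LOGPROB.get((prev, word), UNSEEN_LOGPROB)


def viterbi_correct(words):
    """Return the candidate sequence with the highest bigram log-probability."""
    # best maps each candidate at the current position to (score, best path).
    best = {"<s>": (0.0, [])}
    for word in words:
        new_best = {}
        for cand in candidates(word):
            score, path = max(
                ((prev_score + bigram_logprob(prev, cand), prev_path)
                 for prev, (prev_score, prev_path) in best.items()),
                key=lambda item: item[0],
            )
            new_best[cand] = (score, path + [cand])
        best = new_best
    # Close the sentence with the end-of-sentence marker.
    _, path = max(
        ((score + bigram_logprob(prev, "</s>"), path)
         for prev, (score, path) in best.items()),
        key=lambda item: item[0],
    )
    return path


original = ["he", "go", "to", "school"]
decoded = viterbi_correct(original)
print(decoded, "error detected" if decoded != original else "no change")
```

Because each position only offers forms of the same stem, a decoder of this shape can substitute one word form for another but cannot propose inserting or deleting a word.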
If the combination differs from the original sentence, a grammatical error is detected and corrected. However, the system can only detect replacement errors; it cannot detect insertion or deletion errors, and its precision and recall are not very high.

Finally, the paper studies article and preposition error detection and correction, since these are the errors English learners make most often. The training sets are extracted from the BNC corpus, which can be regarded as a clean corpus without grammatical errors, so the initial training process contains no error samples. To add error information, the paper introduces artificial errors, which improves the classification model's sensitivity to errors. The system treats grammatical error correction as a classification process and adopts a maximum entropy classifier, which classifies sparse features well. In the experiments, the article and preposition error detection system achieved results comparable to those reported by foreign universities. The paper also points out directions for future research: semantic analysis and detection of a wider variety of grammatical errors.
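As a rough illustration of artificial error injection for article correction, the sketch below builds training examples from a clean sentence. The abstract does not describe the actual injection scheme or feature set, so the function `make_article_examples`, the `error_rate` parameter, and the three features are assumptions made for this example only. The label is always the article from the clean text, while the "source" feature is sometimes replaced by a wrong article to imitate a learner's mistake.

```python
import random

ARTICLES = ("a", "an", "the")


def make_article_examples(tokens, error_rate, rng):
    """Build (features, label) pairs for each article in a clean sentence.

    The label is the article found in the clean text. With probability
    error_rate, the "source" feature is swapped for a different article,
    which plays the role of an artificial learner error.
    """
    examples = []
    for i, token in enumerate(tokens):
        word = token.lower()
        if word not in ARTICLES:
            continue
        source = word
        if rng.random() < error_rate:
            # Inject an artificial substitution error.
            source = rng.choice([a for a in ARTICLES if a != word])
        features = {
            "source": source,
            "prev": tokens[i - 1].lower() if i > 0 else "<s>",
            "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
        }
        examples.append((features, word))
    return examples


rng = random.Random(0)  # fixed seed so the sketch is reproducible
clean = "She bought a book at the station".split()
for features, label in make_article_examples(clean, error_rate=0.3, rng=rng):
    print(features, "->", label)
```

Sparse categorical features of this kind could then be fed to a maximum entropy classifier, which is equivalent to multinomial logistic regression. This sketch covers only substitutions between articles; missing or superfluous articles would need an extra "no article" class.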
Keywords/Search Tags: automated essay scoring, grammar error detection, language model, maximum entropy classifier