
Research On Automatic Evaluation Method For Short Answer Questions In College Entrance Examination

Posted on: 2017-04-15  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Yang  Full Text: PDF
GTID: 2297330509457105  Subject: Computer technology
Abstract/Summary:
The college entrance examination is one of the most important nationwide examinations, and the quality of paper marking is key to determining candidates' scores. The subjective questions in the examination include short answer questions, discussion questions, essay questions, and other types, and at present all of them are marked manually. Manual marking, however, is subject to many factors, such as how thoroughly the marker understands the subject, how clearly the answer lists its points, and how neat the answer sheet is. The marking workload is also huge and costs considerable manpower and time. We therefore explore the possibility of using Natural Language Processing methods to mark such questions by computer. The topic is essentially a study of the relationship between the student answer, the standard answer, and the score of the student answer, and the matching between the texts can be achieved by building a variety of models. This thesis explores the similarity between the student answer and the standard answer from the following aspects. First, the basic idea is to compute N-gram precision, recall, and related measures between the two texts. We analyzed and summarized the N-gram co-occurrence calculation methods used by BLEU and ROUGE and applied them to the automatic evaluation of short answer questions. We used the Spearman rank correlation coefficient to examine the correlation between the N-gram co-occurrence features and the scores on the answer data set. Finally, using a traditional machine learning method, the ranking support vector machine (Ranking SVM), we selected the feature set that gives the model the best ranking performance. Second, we believe that N-gram features alone are not enough. Shallow linguistic knowledge covers three aspects, lexical, syntactic, and semantic, of which the lexical and semantic features need further exploration.
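As a rough illustration of the N-gram co-occurrence features in the first point, the sketch below computes a clipped N-gram precision (in the spirit of BLEU) and an N-gram recall (in the spirit of ROUGE-N) between a student answer and a standard answer. It is not the thesis's implementation: the function names, the whitespace tokenization, and the toy sentences are assumptions made for the example, and Chinese answers would first need a word segmenter.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_overlap(student, reference, n=1):
    """Clipped n-gram precision (BLEU-style) and recall (ROUGE-N-style)."""
    s = ngrams(student, n)
    r = ngrams(reference, n)
    # Counter intersection keeps the minimum count per n-gram,
    # so a repeated n-gram is only credited as often as it occurs in both texts.
    matched = sum((s & r).values())
    precision = matched / sum(s.values()) if s else 0.0
    recall = matched / sum(r.values()) if r else 0.0
    return precision, recall

# Toy data, invented for illustration only.
reference = "the treaty ended the war in 1842".split()
student = "the war ended in 1842".split()

print(ngram_overlap(student, reference, n=1))
print(ngram_overlap(student, reference, n=2))
```

Each (precision, recall) pair for a given N would be one candidate feature. Their correlation with human scores could then be checked with a rank correlation such as Spearman's before any feature selection.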
A text is made up of many different words, and words of different parts of speech play different roles in a sentence. Clearly, when comparing the student answer with the standard answer, verbs and nouns may be more important than other parts of speech, so we compute co-occurrence features based on part of speech. In addition, we extend the features to domain-specific terms, such as history-related terminology, which also carry a certain importance. For semantic similarity, we apply the methods used in information retrieval to calculate the similarity between a query and a document. Third, deep learning has gradually developed from computing distributed representations of words to computing representations of phrases, sentences, and texts. One of the most basic applications of word vectors is to compute the semantic similarity between two words. Correspondingly, once sentence vectors are trained on a complete corpus, they should also yield the semantic similarity of two sentences. We use a deep learning neural network method to represent the student answer and the standard answer as sentence vectors that contain rich semantic information, and take the similarity between the vectors as the semantic similarity between the student answer and the standard answer.
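The idea in the third point, scoring an answer by the similarity between sentence vectors, can be sketched with cosine similarity. The abstract does not say which neural model produces the sentence vectors, so this example swaps in a much simpler stand-in: averaging word vectors. The helper names and the three-dimensional toy vectors are hypothetical and exist only to make the example runnable.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is empty or zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def sentence_vector(tokens, word_vectors):
    """Average the vectors of known tokens.

    Assumption: a simple stand-in for a learned sentence vector,
    not the neural method used in the thesis.
    """
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return []
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

# Hypothetical toy embeddings; real ones would be trained on a corpus.
word_vectors = {
    "war": [1.0, 0.0, 0.0],
    "conflict": [0.9, 0.1, 0.0],
    "treaty": [0.0, 1.0, 0.0],
    "harvest": [0.0, 0.0, 1.0],
}

standard = sentence_vector(["war", "treaty"], word_vectors)
close_answer = sentence_vector(["conflict", "treaty"], word_vectors)
off_topic = sentence_vector(["harvest"], word_vectors)

sim_close = cosine(standard, close_answer)
sim_off = cosine(standard, off_topic)
print(sim_close, sim_off)
```

With these toy vectors, the answer that uses a near-synonym scores higher than the off-topic one, even though it shares fewer exact words with the standard answer. Exact N-gram matching alone cannot capture this.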
Keywords/Search Tags: Short Answer Questions Automatic Evaluation, N-gram co-occurrence, part of speech, semantic similarity, deep learning