A Study On Similarity Of Student's Homework Text Under Certain Condition

Posted on: 2007-01-10  Degree: Master  Type: Thesis
Country: China  Candidate: X Qu  Full Text: PDF
GTID: 2178360185462168  Subject: Education Technology
Abstract/Summary:
The goal of Natural Language Processing (NLP) research is to find techniques that automatically understand and interpret text content. These techniques can also be used to analyze students' homework. Traditional NLP systems use rule-based reasoning. Recently, with the increased computational capability of modern computers and the emergence of huge amounts of text, researchers have found statistics-based text analysis to be more effective, and most text analysis methods today are based on statistical theory.

To use statistics-based text analysis, the first challenge is transforming the text into a machine-readable format. The basic steps are: extract words, remove stop-words, determine sentence and paragraph boundaries, and convert the text to a vector suitable for statistical analysis.

Once the texts are in vector form, statistical methods can be used to compare their similarity, and the texts can also be clustered or categorized. However, text analysis is very challenging because of the high dimensionality (i.e. the number of words) of the transformed vectors, so reducing that dimensionality is very important.

This study uses well-known NLP and data mining techniques to study the text similarity of students' homework. The main purpose is to understand and evaluate students' homework.

This paper first gives background and history on NLP and data mining. It then describes techniques for tokenization, feature...
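The pipeline outlined in the abstract (extract words, remove stop-words, convert to a vector, reduce dimensionality, compare) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the tiny stop-word list, the whitespace/regex tokenizer (which only suits English-like text and stands in for automatic word segmentation), the document-frequency filter used as a simple form of feature selection, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "of", "to", "and", "in", "on"}


def tokenize(text):
    """Lowercase the text and extract alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def to_vector(text):
    """Build a sparse term-frequency vector, skipping stop-words."""
    return Counter(t for t in tokenize(text) if t not in STOP_WORDS)


def select_features(vectors, min_df=2):
    """Reduce dimensionality by keeping only terms that occur in at
    least `min_df` documents (a simple document-frequency filter)."""
    doc_freq = Counter(term for vec in vectors for term in vec)
    return [
        Counter({t: c for t, c in vec.items() if doc_freq[t] >= min_df})
        for vec in vectors
    ]


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term-frequency vectors.
    Returns 0.0 when either vector is empty."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    homework = [
        "The stack is a last-in first-out data structure.",
        "A stack is a data structure with last-in first-out access.",
        "Photosynthesis converts light energy to chemical energy.",
    ]
    vectors = select_features([to_vector(h) for h in homework], min_df=1)
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            print(i, j, round(cosine_similarity(vectors[i], vectors[j]), 3))
```

With `min_df=1` no terms are discarded; raising it drops words unique to a single submission, which shrinks the vectors but also removes terms that may distinguish one student's answer from another.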
Keywords/Search Tags:Similarity, Homework, Automatic word segmentation, Feature selection, Text similarity computing