The Research On Distinguish Measure Of Repetitive C Language Test Questions In Database Based-on The Tree Structure Of Domain Ontology

Posted on:2016-08-03

Degree:Master

Type:Thesis

Country:China

Candidate:X D Yan

Full Text:PDF

GTID:2308330461977075

Subject:Computer Science and Technology

Abstract/Summary:

The online test system for the C programming language is built on an examination database. Because the original system lacks a duplicate-checking module, similar questions are hard to keep out of the database, which lowers the quality of test papers and the effectiveness of examinations. This thesis therefore addresses how to find such similar test questions quickly and accurately.

Duplicate checking of C test questions is a similarity calculation problem in natural language processing (NLP). After surveying a large body of research on similarity calculation, this thesis solves the problem in three steps: word segmentation, word similarity calculation, and sentence similarity calculation.

For segmentation, this thesis chooses the ICTCLAS tool, which is practical and reliable and makes it easy to extend the original dictionary and part-of-speech set. For word similarity calculation, the thesis first studies several knowledge systems, such as the "Chinese Thesaurus", "HowNet", and domain ontology. A domain ontology of the C programming language is then constructed. Finally, the domain ontology and HowNet are used to calculate the similarity of concepts. For sentence similarity calculation, the domestic (Chinese) literature offers many methods based on word sense, word order, and syntactic features. Because similar C test questions differ in only a few words and keep a fixed word order, this thesis selects the Levenshtein distance algorithm to calculate sentence similarity.

In summary, ICTCLAS is first used to segment words and tag their parts of speech. Second, the C domain ontology is used to calculate the similarity of domain words. Last, the Levenshtein distance algorithm is used to calculate sentence similarity, with edit operation costs that differ according to part of speech. Experiments show that these methods are effective and accurate in identifying similar C test questions, so the problem is largely solved.
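The last step of the abstract, a Levenshtein distance whose operation costs depend on part of speech, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the cost table `POS_COST`, the tag letters, the substitution rule, and the normalization in `sentence_similarity` are all assumptions made for illustration, and the `word_sim` parameter is a hypothetical hook where an ontology-based word similarity could be plugged in. Input sentences are assumed to be already segmented and tagged (for example by ICTCLAS) into `(word, tag)` pairs.

```python
from typing import Callable, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Hypothetical costs: content words are expensive to edit,
# particles ("u") and punctuation ("w") are cheap.
POS_COST = {"n": 1.0, "v": 1.0, "u": 0.2, "w": 0.1}
DEFAULT_COST = 0.5


def token_cost(tok: Token) -> float:
    """Cost of inserting or deleting one token, chosen by its POS tag."""
    return POS_COST.get(tok[1], DEFAULT_COST)


def exact_match(a: str, b: str) -> float:
    """Placeholder word similarity: 1.0 for identical words, else 0.0."""
    return 1.0 if a == b else 0.0


def weighted_edit_distance(
    s: List[Token], t: List[Token],
    word_sim: Callable[[str, str], float] = exact_match,
) -> float:
    """Token-level Levenshtein distance with POS-dependent costs."""
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + token_cost(s[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + token_cost(t[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a, b = s[i - 1], t[j - 1]
            # Substitution is discounted by how similar the two words are.
            sub = max(token_cost(a), token_cost(b)) * (1.0 - word_sim(a[0], b[0]))
            d[i][j] = min(
                d[i - 1][j] + token_cost(a),   # delete a
                d[i][j - 1] + token_cost(b),   # insert b
                d[i - 1][j - 1] + sub,         # substitute a -> b
            )
    return d[m][n]


def sentence_similarity(
    s: List[Token], t: List[Token],
    word_sim: Callable[[str, str], float] = exact_match,
) -> float:
    """Map the distance to [0, 1]; 1.0 means identical sentences."""
    total = sum(map(token_cost, s)) + sum(map(token_cost, t))
    if total == 0:
        return 1.0
    return 1.0 - weighted_edit_distance(s, t, word_sim) / total


q1 = [("define", "v"), ("array", "n"), (".", "w")]
q2 = [("define", "v"), ("array", "n"), ("?", "w")]    # punctuation differs
q3 = [("define", "v"), ("pointer", "n"), (".", "w")]  # a noun differs
print(sentence_similarity(q1, q2), sentence_similarity(q1, q3))
```

With these assumed costs, two questions that differ only in punctuation score as more similar than two that differ in a noun, which is the effect the part-of-speech weighting is meant to achieve. Dividing by the summed token costs of both sentences keeps the score in [0, 1], because deleting every token of one sentence and inserting every token of the other is always a valid edit path.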

Keywords/Search Tags:

Domain Ontology, Levenshtein Distance, Examination Database, Duplication Checking

Related items

1	Database Watermarking Based On Text Format
2	Research On Automatic Examination Generating System Based On Ontology
3	Research Of Domain Ontology Automatic Construction Method Based On Relational Database
4	Research Of Domain Ontology Storage On Database
5	The Research Of Building Domain Ontology Semi-Automatically Based On Relational Database
6	Study On The Theory And Practice Of Ontology And Ontology-based Agricultural Document Retrieval System--Floricultural Ontology Modeling
7	Construction And Application Research Of Ontology
8	Construction And Application Of Knowledge Base Of Technology Transfer Field Based On Ontology
9	Construct Research On Logistics Domain Ontology Relational Database
10	Research On Domain-based Question Answering System Based On Ontology