Semantic Web-based Text Classification Techniques

Posted on: 2007-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: J Chen
Full Text: PDF
GTID: 2208360215498644
Subject: Computer application technology
Abstract/Summary:
With the explosion of information on the Internet, it is impossible to classify the entire Web manually without some form of automated aid. For this reason, automatic document classification has become an important research area.

Information on the Web lacks a uniform semantic description, so it is increasingly difficult to find, organize, access and maintain the information users require. The ontology-based Semantic Web technology proposed by the W3C indicates a way to solve this problem. Web applications can represent and understand information by obtaining the semantics of its words, and can even classify documents based on a set of rules.

This thesis begins with an introduction to the Semantic Web and related technologies, followed by approaches to Web page classification and the relation between ontologies and the Semantic Web. Finally, an automatic classifier based on ontology-based Semantic Web technology is presented. This classifier can classify Web pages with respect to the Dewey Decimal Classification (DDC) and Library of Congress Classification (LCC) schemes.

In describing the classifier, we first explain how the ontologies can be built in a modular fashion and mapped into DDC and LCC. Second, we propose formal definitions of a DDC-LCC mapping and an ontology-to-classification-scheme mapping. Third, we explain how the classifier uses these ontologies to assist classification. Finally, we present an experiment that evaluates the accuracy of the classifier. The experiment shows that our approach improves classification accuracy. This improvement, however, comes at the cost of a low coverage ratio, due to the incompleteness of the ontologies used.
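The accuracy-versus-coverage trade-off described above can be illustrated with a minimal sketch. It is not the thesis's classifier: the toy ontologies, the term-matching rule, the `min_hits` threshold, and all function names are assumptions made for illustration. Only the DDC and LCC class codes are taken from the real schemes. A document that no ontology covers is left unclassified, which lowers coverage but keeps wrong guesses out of the accuracy figure.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional


class SchemeCodes(NamedTuple):
    ddc: str  # Dewey Decimal Classification class
    lcc: str  # Library of Congress Classification class


# Hypothetical toy ontologies: each is just a set of concept terms.
ONTOLOGY_TERMS = {
    "computing": {"algorithm", "compiler", "software", "database"},
    "biology": {"cell", "genome", "protein", "species"},
}

# Assumed ontology-to-classification-scheme mapping (one entry per ontology).
ONTOLOGY_TO_SCHEME = {
    "computing": SchemeCodes(ddc="004", lcc="QA76"),
    "biology": SchemeCodes(ddc="570", lcc="QH"),
}


def classify(text: str, min_hits: int = 2) -> Optional[SchemeCodes]:
    """Return the codes of the best-matching ontology, or None if none covers the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = Counter()
    for name, terms in ONTOLOGY_TERMS.items():
        hits[name] = sum(1 for token in tokens if token in terms)
    best, count = hits.most_common(1)[0]
    if count < min_hits:
        return None  # incomplete ontologies: refuse rather than guess
    return ONTOLOGY_TO_SCHEME[best]


def evaluate(labelled_docs):
    """Return (accuracy over classified docs, coverage over all docs)."""
    classified = 0
    correct = 0
    for text, expected_ddc in labelled_docs:
        result = classify(text)
        if result is None:
            continue
        classified += 1
        if result.ddc == expected_ddc:
            correct += 1
    accuracy = correct / classified if classified else 0.0
    coverage = classified / len(labelled_docs) if labelled_docs else 0.0
    return accuracy, coverage


if __name__ == "__main__":
    docs = [
        ("The compiler turns software into an efficient algorithm", "004"),
        ("The genome encodes every protein in the cell", "570"),
        ("A simple recipe for sourdough bread", "641"),
    ]
    print(evaluate(docs))
```

In this sketch the third document matches no ontology term, so it is skipped. Accuracy is then computed only over the two classified documents, while coverage falls below one.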
Keywords/Search Tags: Ontology, Semantic Web, document classification