Ontology-based Web Text Classification

Posted on: 2009-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
Full Text: PDF
GTID: 2178360242989805
Subject: Computer applications
Abstract/Summary:
Traditional text classification methods mostly represent a text by its term frequencies and classify it by computing term weights in the Vector Space Model. Because the text is represented only as a set of words, the useful semantic information it carries cannot be applied in the classification process. To overcome this limitation of classic text classification methods and to make full use of the semantic information in the text, this thesis introduces WordNet, represents the text with linguistic knowledge, and proposes an ontology-based web document classification algorithm together with its system framework. The algorithm takes semantic information into consideration and uses WordNet, along with other ontology-related methods, to construct the classifier. It calculates the similarity of property values at different levels of the abstraction hierarchy, improving on the classic similarity calculation, which uses only the static information from the data. By combining this static information with the semantic relations between concepts, the method models the real world more closely and tries to uncover implicit rules or patterns, so that the result is closer to the human process of understanding while also achieving better accuracy. Finally, experiments demonstrate its effectiveness.
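The contrast the abstract draws between pure term-frequency matching and concept-level similarity can be illustrated with a minimal sketch. It is not the thesis's algorithm: the tiny HYPERNYM table is a hypothetical stand-in for WordNet's hypernym links, the concept similarity is a Wu-Palmer-style measure chosen for illustration, and all function names and the nearest-example classifier are assumptions.

```python
from collections import Counter
import math

# Hypothetical toy taxonomy (child -> parent), standing in for WordNet hypernyms.
HYPERNYM = {
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
    "animal": "entity",
    "car": "vehicle", "truck": "vehicle", "vehicle": "artifact",
    "artifact": "entity",
}

def path_to_root(concept):
    """Return [concept, parent, ..., root] by following hypernym links."""
    path = [concept]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

def concept_similarity(a, b):
    """Wu-Palmer-style score: 2*depth(lcs) / (depth(a) + depth(b)), root depth = 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # The first ancestor of a that is also an ancestor of b is the deepest shared one.
    lcs = next((c for c in path_a if c in ancestors_b), None)
    if lcs is None:
        return 0.0  # no shared ancestor, e.g. two different unknown words
    return 2.0 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))

def tf_cosine(doc_a, doc_b):
    """Classic term-frequency cosine: only exact word overlap counts."""
    tf_a, tf_b = Counter(doc_a), Counter(doc_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a)
    norm = math.sqrt(sum(v * v for v in tf_a.values())) * \
           math.sqrt(sum(v * v for v in tf_b.values()))
    return dot / norm if norm else 0.0

def directed_similarity(doc_a, doc_b):
    """Frequency-weighted average of each term's best concept match in doc_b."""
    if not doc_a or not doc_b:
        return 0.0
    tf_a, vocab_b = Counter(doc_a), set(doc_b)
    best = sum(tf * max(concept_similarity(t, u) for u in vocab_b)
               for t, tf in tf_a.items())
    return best / sum(tf_a.values())

def semantic_similarity(doc_a, doc_b):
    """Symmetric document similarity built from the two directed scores."""
    return 0.5 * (directed_similarity(doc_a, doc_b) +
                  directed_similarity(doc_b, doc_a))

def classify(doc, labeled_docs):
    """Assign the label of the most similar labeled example (nearest neighbour)."""
    return max(labeled_docs, key=lambda item: semantic_similarity(doc, item[0]))[1]
```

In this sketch, the token lists ["dog"] and ["wolf"] share no words, so tf_cosine gives 0, while semantic_similarity still gives a high score because both concepts sit under the same parent in the toy taxonomy. A real system would replace HYPERNYM with lookups against WordNet and would need word-sense disambiguation, which is omitted here.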
Keywords/Search Tags: Ontology, WordNet, Text Classification, Semantic Web