
Research on Ontology-Based Automatic Text Categorization

Posted on: 2010-06-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
GTID: 2178360272499440
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, the number of documents on the network is increasing exponentially, and how to deal with this volume of documents has become an important research focus. Text classification can handle large amounts of text, largely resolve the current disorder of information, and help users locate the information they need accurately and conveniently. As a technological basis for information retrieval, information filtering, search engines, text databases, digital libraries and so on, text classification has broad application prospects.

Traditional text classification uses the vector space model, which treats characteristic words as independent and captures no semantic connection between words that are related in real text, such as synonyms, active verbs, and forms of the verb "to be". In recent years, the concept of ontology has been widely used in computer science and technology, especially in information retrieval. An ontology is not only a collection of concepts; it also reflects the semantic connections between the concepts in the ontology model. Applying ontologies to text classification is therefore significant and valuable.

This paper puts forward a text classification model based on ontology. The model combines the concepts in the ontology with the characteristic words of the traditional text representation model, and defines the notions of "concept characteristic word" and "key special concept word". By replacing the characteristic words in the text model with concept characteristic words, a text can be expressed with an ontology representation model that contains its important concepts and the exact semantic connections between them, so that texts are expressed clearly and exactly. In addition, for the different text representation models, we propose corresponding ontology matching algorithms. The matching algorithms not only consider the different weights of the concepts in the ontology model, but also take full account of the hierarchical relations between concepts. The model partly overcomes the flaws and weaknesses of the traditional method of text classification. Through theoretical comparison with the traditional model and analysis of experimental data, we verify that the ontology-based text classification model is feasible and more efficient.
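
To make the idea concrete, the following is a minimal sketch in Python and not the algorithm of the thesis itself. It assumes a toy ontology (`PARENT`), a toy word-to-concept lexicon (`WORD_TO_CONCEPT`), and category profiles given as weighted concept sets; all of these names and values are hypothetical. Hierarchical closeness is measured with a Wu-Palmer-style formula, which is a stand-in chosen for illustration, since the abstract does not state which measure the thesis uses.

```python
from collections import Counter

# Hypothetical toy ontology: each concept maps to its parent (None for the root).
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "food": "entity",
    "fruit": "food",
}

# Hypothetical lexicon: surface words, including synonyms, map to one concept.
WORD_TO_CONCEPT = {
    "car": "car", "automobile": "car", "auto": "car",
    "bicycle": "bicycle", "bike": "bicycle",
    "fruit": "fruit", "apple": "fruit",
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def concept_similarity(a, b):
    """Wu-Palmer-style similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    path_a, path_b = ancestors(a), ancestors(b)
    on_path_b = set(path_b)
    # The first shared node on the way up from a is the lowest common ancestor.
    lcs = next(c for c in path_a if c in on_path_b)
    return 2 * len(ancestors(lcs)) / (len(path_a) + len(path_b))


def concept_vector(text):
    """Replace words by concepts and return normalized concept weights."""
    counts = Counter(
        WORD_TO_CONCEPT[w] for w in text.lower().split() if w in WORD_TO_CONCEPT
    )
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def match_score(doc_vec, category_vec):
    """Weight each document concept by its best hierarchical match in the category."""
    return sum(
        w_doc * max(w_cat * concept_similarity(c_doc, c_cat)
                    for c_cat, w_cat in category_vec.items())
        for c_doc, w_doc in doc_vec.items()
    )


def classify(text, categories):
    """Return the name of the category whose concept profile matches best."""
    doc_vec = concept_vector(text)
    return max(categories, key=lambda name: match_score(doc_vec, categories[name]))


# Hypothetical category profiles expressed as weighted ontology concepts.
CATEGORIES = {"transport": {"vehicle": 1.0}, "cooking": {"food": 1.0}}
print(classify("My automobile and my bike need repair", CATEGORIES))
```

In this sketch "automobile" and "bike" share no surface word with the category profile, yet both resolve to concepts under "vehicle", so the document still matches the transport category. A plain bag-of-words vector space model would treat those words as unrelated dimensions.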
Keywords/Search Tags: Ontology, Text Categorization, Ontology Matching