
The Research Of Chinese Ontology Learning Based On Web Mining

Posted on: 2008-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
Full Text: PDF
GTID: 2178360242458943
Subject: Computer application technology
Abstract/Summary:
Ontologies play an increasingly important role in knowledge management and the Semantic Web, but their construction and maintenance are becoming the bottleneck for these applications. At present, a few ontologies, such as WordNet and Cyc, have been built manually. However, building an ontology by hand costs a great deal of time and manpower, and such common ontologies contain few concepts. Because the knowledge captured in an ontology evolves and is continually renewed, an ontology cannot be compiled by hand the way a dictionary is; otherwise it is already outdated when it is released. Maintaining it is also time-consuming for the ontology engineer. How to build and update ontologies automatically or semi-automatically is therefore becoming an important research topic in artificial intelligence, text mining and information retrieval. Since ontologies are the foundation of the Semantic Web, building them quickly matters for the Semantic Web's application and development. To solve the knowledge bottleneck in ontology engineering, we need tools that build ontologies automatically or semi-automatically.
Ontology learning is currently an active research area. It aims to assist the ontology engineer by developing techniques for building ontologies automatically or semi-automatically, and its main task is the automatic or semi-automatic acquisition of every element an ontology contains. Some tools for automatic ontology building, such as OntoLearn and Text-to-Onto, have been developed abroad. Although these tools support automatic or semi-automatic ontology building from structured, semi-structured or unstructured documents, they all depend to varying degrees on a common lexicon or a core ontology. By comparison, there has been little research on Chinese ontology learning in China, and so far no tool supports Chinese ontology learning.
The main aim of our research is to extract domain terms automatically from Chinese web documents using knowledge acquisition techniques, in order to reduce the cost of building an ontology. The ontology obtained with this model is not confined to a logical specification; its semantic description should also be convenient for computers to use, so that commonly defined and shared knowledge can be expressed easily by machine.
To address the shortcomings of existing methods, the thesis builds on frequency analysis and semantic analysis, introduces shallow semantic analysis, and takes full advantage of the semi-structured nature of web pages. The model is also independent of any domain lexicon. After using the ICTCLAS tool for initial word segmentation and part-of-speech tagging, the model uses mutual information to measure the internal association strength of Chinese strings and extract candidate terms. It then selects the concepts of the domain ontology from the extracted term set, based on an ample domain corpus and consistent filter rules. The thesis makes full use of natural language processing and statistical techniques to keep the algorithm simple and fast and to improve the speed and precision of concept extraction. Finally, the thesis uses rule-based techniques and syntactic analysis to extract relations between concepts, which improves the precision of the extracted relations and makes the relations measurable.
In conclusion, the thesis discusses the model at its core, Chinese ontology learning based on web mining, and outlines prospects for future work.
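The mutual-information step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the text has already been segmented into tokens (for example by a segmenter such as ICTCLAS, which is not called here), and it scores only adjacent token pairs using pointwise mutual information, PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The function names `adjacent_pmi` and `candidate_terms`, and the `threshold` and `min_count` parameters, are hypothetical.

```python
import math
from collections import Counter


def adjacent_pmi(sentences, min_count=2):
    """Score adjacent token pairs by pointwise mutual information.

    `sentences` is a list of token lists (already segmented text).
    Pairs seen fewer than `min_count` times are skipped, because PMI
    is unreliable for rare events. Names and defaults are illustrative.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (x, y), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi          # probability of the pair
        p_x = unigrams[x] / n_uni    # probability of the left token
        p_y = unigrams[y] / n_uni    # probability of the right token
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


def candidate_terms(sentences, threshold=1.0, min_count=2):
    """Return joined pairs whose PMI is at least `threshold`, best first."""
    scores = adjacent_pmi(sentences, min_count)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ["".join(pair) for pair, score in ranked if score >= threshold]
```

A high score means the two tokens co-occur more often than chance would predict, which is the sense of "internal association strength" here. A statistical score alone also admits pairs that are not terms, such as a noun followed by a function word, which is why the abstract pairs this step with part-of-speech information and filter rules.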
Keywords/Search Tags:Ontology, Ontology Learning, Mutual Information, Syntactic Analysis