The Research Of Term And Relation Acquisition Methods For Domain Ontology Learning

Posted on: 2014-01-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L S Li
GTID: 1228330395998676
Subject: Knowledge management
Abstract/Summary:
Domain ontology has been applied to many fields, such as knowledge engineering and artificial intelligence. It plays a key role in enterprise knowledge management, especially in discrete manufacturing enterprises, which are centered on products and whose knowledge assets lie in the product development process. Knowledge management in manufacturing enterprises requires product knowledge models to be reconstructed. Because an ontology can describe problems formally, provide normalized and uniform representations, and offer models for knowledge sharing and reuse, domain ontology is introduced into discrete manufacturing enterprises for knowledge management. However, having domain experts construct a domain ontology by hand costs a great deal of time and labor. Researchers have therefore paid increasing attention to constructing domain ontologies automatically or semi-automatically.

Term and relation acquisition are important for domain ontology learning. This dissertation focuses on these two tasks in order to make automatic domain ontology construction more effective and to provide practical methods for enterprise knowledge management.

The dissertation investigates term and relation acquisition from unstructured texts and concentrates on the following points:

(1) An unsupervised term extraction method based on information entropy and the variation of word frequency distribution is proposed. The method combines these two measures and applies simple linguistic rules to filter character strings. The results show that the method is more effective at extracting low-frequency terms and can recover the complete structure of a term.

(2) A domain term extraction method based on Conditional Random Fields (CRF) combined with an active learning strategy is proposed. Because the unsupervised method performs poorly and high-quality labeled corpora for supervised methods are expensive to obtain, the dissertation introduces active learning into the CRF-based term extraction system. The active learning method uses an uncertainty-based sampling strategy and selects samples according to the conditional probability given by the CRF model. With this strategy, better performance is obtained from a smaller labeled corpus.

(3) A multi-strategy method is proposed for term relation extraction, motivated by the diversity of term relation types. The dissertation focuses on acquiring synonymy and hierarchical relations. Rule-based, statistics-based, and clustering-based unsupervised methods are integrated to handle the different relation types. The method achieves better performance on hierarchical relations.

(4) The dissertation also puts forward a strategy that integrates a composite kernel method with distributed meta-learning for Chinese entity relation extraction. Experiments are carried out on a news-domain corpus. The method adopts a distributed meta-learning strategy that uses a composite kernel combining a feature-based kernel and a sentence-structure-based kernel. The results show that the F-score for entity relation extraction improves by nearly 3 percent.

The proposed approaches are validated by constructing an ontology for the automotive domain. Experiments show that the approaches for extracting terms and relations from texts are effective and support the semi-automatic construction of Chinese domain ontologies. In addition, these methods can be applied to other fields, such as dictionary compilation and text summarization.
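
The uncertainty-based sampling step in point (2) can be illustrated with a minimal sketch. It assumes the common "least confidence" variant, in which the sentences whose most likely label sequence has the lowest conditional probability are sent for manual annotation. The abstract does not say which variant the dissertation uses, and the function name, batch size, and probability values below are hypothetical.

```python
def least_confidence_batch(sequence_probs, batch_size):
    """Pick the sentences the model is least sure about.

    sequence_probs[i] is assumed to be P(y* | x_i): the conditional
    probability that a trained CRF assigns to its best label sequence
    for unlabeled sentence i. These values come from outside this sketch.
    """
    # Rank sentence indices from lowest to highest confidence.
    ranked = sorted(range(len(sequence_probs)), key=lambda i: sequence_probs[i])
    # The lowest-confidence sentences are the ones to annotate next.
    return ranked[:batch_size]


# Hypothetical confidences for five unlabeled sentences.
probs = [0.95, 0.40, 0.72, 0.15, 0.88]
selected = least_confidence_batch(probs, batch_size=2)
print(selected)  # [3, 1]
```

In an active learning loop, the selected sentences would be labeled by an annotator, added to the training corpus, and the CRF retrained before the next round of selection.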
Keywords/Search Tags: Domain Ontology, Knowledge Acquisition, Term Extraction, Relation Extraction