
Research On Domain Ontology Learning Based On Chinese Texts

Posted on: 2020-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: B Wang
Full Text: PDF
GTID: 2428330599953529
Subject: Computer Science and Technology
Abstract/Summary:
Ontology, as an important branch of semantic networks, plays an important role in information retrieval, question answering systems and other fields. Ontology construction is a prerequisite for ontology application. Currently, there are two main ways to construct an ontology: manual construction by ontology experts, and automatic or semi-automatic construction by statistical and linguistic means, that is, ontology learning. Because manual construction lacks flexibility and objectivity, ontology learning has gradually become the mainstream approach in current ontology construction research. However, traditional ontology learning methods have problems such as poor portability across domains, and there is relatively little research on ontology learning from web text. This thesis therefore studies ontology learning based on Chinese web text. It covers corpus construction and the extraction of ontological concepts, taxonomical relations and non-taxonomical relations, with the aim of improving the portability and performance of ontology learning methods. The main contents and achievements of this thesis are as follows:

(1) A method for automatically constructing an ontology learning corpus is presented. Traditional ontology learning methods usually rely on existing tagged corpora and have difficulty using complex web text. First, a domain dictionary is acquired from a Knowledge Graph, and a vector space model of the corresponding domain is constructed. Then, the web text is mapped to a domain space vector using the TF*IWF*IWF algorithm, and the correlation between the text and the corresponding domain is calculated. Finally, the web text is filtered and pre-processed according to this correlation, which completes the construction of the ontology learning corpus.

(2) An improved D-TF-IDF algorithm is proposed to optimize the extraction of ontology concepts. Because the traditional TF-IDF algorithm cannot distinguish how important a text is to the corresponding domain, the improved D-TF-IDF algorithm uses the domain relevance of each text, calculated from the domain vector space model, as a weight that makes the score more sensitive to domain-related text. At the same time, a TF threshold is set to filter out impurity words that are unique to a text but unrelated to the corresponding domain, which improves the extraction of ontology terms. Finally, the K-Means clustering algorithm is used to cluster and disambiguate the ontology terms, yielding the ontology concepts.

(3) An ontology taxonomical relation extraction method based on Knowledge Graph is proposed. Traditional extraction methods rely on semantic dictionaries, which usually port poorly across domains and are slow to reflect domain updates. Therefore, a taxonomical relation template is first obtained from a Knowledge Graph. Then, to address the low efficiency of taxonomical relation extraction, a pruning algorithm based on the Floyd algorithm is proposed to prune the taxonomical relation template. Finally, the ontology taxonomical relations are extracted by combining the template with the ontology concepts.

(4) The extraction of relational labels is improved to optimize the extraction of ontology non-taxonomical relations. To address the low effectiveness of traditional extraction methods, a generic word-building rule template is used to decompose complex relational labels. Then, according to their relevance to the corresponding domain, relational labels are classified into domain verbs and general verbs, and each class is extracted by a corresponding statistical method. Finally, the labels are combined with concept pairs extracted by association rules to complete the extraction of ontology non-taxonomical relations.

Based on the above research, this thesis designs an ontology learning experimental framework and compares the proposed method with similar methods from two aspects: the ontology itself and its application. The experimental results show that the proposed ontology learning method can not only construct an effective domain ontology from Chinese web text, but also improve, to a certain extent, the precision of ontology concept extraction and the efficiency and effectiveness of ontology relation extraction.
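
The domain-weighted term scoring described in item (2) can be illustrated with a minimal sketch. The abstract does not give the D-TF-IDF formula, so everything below is an assumption made for illustration: the function name `d_tf_idf`, the smoothed IDF, the way the per-text domain relevance multiplies the TF-IDF score, and the default threshold value. It is not the thesis's actual implementation.

```python
import math
from collections import Counter


def d_tf_idf(docs, relevance, tf_threshold=2):
    """Hypothetical domain-weighted TF-IDF (illustrative only).

    docs: list of token lists, one per text.
    relevance: domain relevance of each text, assumed to lie in [0, 1].
    tf_threshold: terms occurring fewer times than this in a text are
        skipped for that text (an assumed form of the TF threshold).
    """
    n_docs = len(docs)

    # Document frequency: in how many texts does each term occur?
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    scores = Counter()
    for tokens, rel in zip(docs, relevance):
        counts = Counter(tokens)
        for term, count in counts.items():
            if count < tf_threshold:
                continue  # drop rare "impurity" words in this text
            tf = count / len(tokens)
            # Smoothed IDF (an assumed variant, always positive).
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            # The text's domain relevance acts as a weight on its TF-IDF.
            scores[term] += rel * tf * idf
    return dict(scores)


if __name__ == "__main__":
    docs = [
        ["ontology", "concept", "ontology", "relation", "concept"],
        ["football", "match", "football", "goal"],
        ["ontology", "ontology", "typo"],
    ]
    relevance = [0.9, 0.1, 0.8]  # made-up relevance values
    ranked = sorted(d_tf_idf(docs, relevance).items(), key=lambda kv: -kv[1])
    for term, score in ranked:
        print(f"{term}\t{score:.3f}")
```

In this sketch a term that is frequent in a low-relevance text (here "football") receives a small score, while single-occurrence words are dropped by the threshold before scoring.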
Keywords/Search Tags: Ontology Learning, Knowledge Graph, Web Text, Concept Extraction, Relation Extraction