
Research on the Construction Method of an English-Chinese-Mongolian Trilingual Domain Ontology Based on WordNet

Posted on: 2017-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Yang
Full Text: PDF
GTID: 2348330485971363
Subject: Software engineering
Abstract/Summary:
A lexical semantic web is a product of both ontology research and applied research. It treats the semantic information of words as independent nodes and links them through semantic relations to form a multi-relational network. One of the most widely used lexical semantic webs is WordNet, developed at Princeton University. After many years of refinement, it covers concepts from almost every field. Several semantic dictionaries have been built on WordNet's structure, such as CCD, EuroWordNet, and the Mongolian noun Semantic Web. Because WordNet is comprehensive and widely used, this paper studies a method for automatically constructing a domain ontology based on WordNet, and completes an English-Chinese-Mongolian trilingual domain ontology mapping for the computer domain.

We first discuss how to compute the information content of a concept and, building on previous research, propose an improved information content algorithm. Traditional information content algorithms fall into two kinds: one computes a concept's information content from corpus statistics, and the other computes it from WordNet's own structure. To obtain a stable information content value, the former requires an almost unlimited corpus, while the latter ignores the number of words in each concept. We propose an algorithm that computes information content from WordNet's structure while also taking into account the number of words in each concept. Experiments show that this method has certain advantages over the traditional methods.

To construct the domain ontology with higher accuracy, this paper designs a new semantic similarity algorithm based on WordNet.
The algorithm improves on traditional algorithms by considering not only the information content of the concepts but also the path between the two concepts and the depth of their common parent. Experimental comparison shows that the proposed algorithm also improves the performance of semantic similarity computation.

The domain ontology is built as follows. First, guided by domain experience, the field is divided into several sub-domains, and each sub-domain is assigned a core concept. Next, the top-level concept of each sub-domain is obtained from its core concept using the semantic similarity algorithm, and an initial concept set for each sub-domain is derived from that top-level concept. Then a construction algorithm based on the divisive (split) clustering technique from data mining removes inappropriate concepts from the initial concept set. Finally, the semantic relations are obtained from WordNet, which completes the domain ontology.

Constructing the computer domain ontology serves both as a test of the proposed method and as a research objective in its own right. After the English computer domain ontology was obtained, an English-Chinese-Mongolian trilingual mapping platform was designed. Using this platform together with the daerhan dictionary and the WordNet database, the Chinese and Mongolian computer domain ontologies were mapped, yielding an English-Chinese-Mongolian trilingual ontology of the computer domain. The resulting domain ontology is matched with WordNet.
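
The abstract does not state the formulas, so the following is only a minimal sketch of the two ideas it describes: a structure-based information content that counts the words in each concept, and a similarity score that combines information content, path length, and common-parent depth. The toy taxonomy, the word counts, the function names, and the equal-weight average of a Lin-style ratio, a Wu-Palmer-style depth ratio, and an inverse path term are all assumptions made for illustration, not the thesis's actual algorithm.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not taken from WordNet or the thesis.
PARENT = {
    "entity": None,
    "device": "entity",
    "computer": "device",
    "laptop": "computer",
    "server": "computer",
    "peripheral": "device",
    "printer": "peripheral",
}

# Hypothetical number of words (synonyms) attached to each concept.
WORDS = {
    "entity": 1, "device": 2, "computer": 3, "laptop": 2,
    "server": 2, "peripheral": 1, "printer": 2,
}


def ancestors(concept):
    """Return the chain from the concept up to the root, concept first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def depth(concept):
    """Depth in the taxonomy; the root has depth 1."""
    return len(ancestors(concept))


def information_content(concept):
    """Assumed structure-based IC: -log of the share of all words that
    belong to the concept or to any concept below it."""
    total = sum(WORDS.values())
    mass = sum(WORDS[c] for c in PARENT if concept in ancestors(c))
    return -math.log(mass / total)


def common_parent(a, b):
    """Deepest concept that is an ancestor of both a and b."""
    above_b = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in above_b:
            return candidate
    return None


def path_length(a, b):
    """Number of edges on the path from a to b through their common parent."""
    parent_depth = depth(common_parent(a, b))
    return (depth(a) - parent_depth) + (depth(b) - parent_depth)


def similarity(a, b):
    """Assumed combination: equal-weight mean of an IC term, a depth term,
    and a path term, each in the range [0, 1]."""
    if a == b:
        return 1.0
    parent = common_parent(a, b)
    ic_sum = information_content(a) + information_content(b)
    ic_term = 2 * information_content(parent) / ic_sum if ic_sum > 0 else 0.0
    depth_term = 2 * depth(parent) / (depth(a) + depth(b))
    path_term = 1 / (1 + path_length(a, b))
    return (ic_term + depth_term + path_term) / 3
```

With these toy numbers, `similarity("laptop", "server")` comes out higher than `similarity("laptop", "printer")`, because the first pair shares a deeper, more specific common parent and a shorter path. A score of this kind could then be used to rank candidate concepts against a sub-domain's core concept.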
Keywords/Search Tags:WordNet, Ontology, Information content, Semantic similarity, Automatic Construction, Computer domain