Study And Implementation Of The Semantic Dependency Network Based Knowledge Extraction System

Posted on: 2019-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: R Lv
Full Text: PDF
GTID: 2428330572958996
Subject: Computer software and theory
Abstract/Summary:
With the advancement of science and technology, the Internet has gradually evolved into a huge distributed resource base, and obtaining target information accurately and quickly has become very difficult. In recent years, researchers have constructed structured knowledge bases such as Wikipedia, YAGO, and Freebase to improve the efficiency of web resource queries. However, because of the explosive growth of Internet resources, the knowledge extracted from semi-structured encyclopedia pages can no longer meet people's needs for deep knowledge queries. Open-domain knowledge extraction has therefore become one of the most important research topics in many fields related to knowledge engineering. The technology still suffers from low result accuracy and low query hit rates, so an efficient and complete knowledge extraction method is needed.

This thesis proposes a semantic dependency network with a multilayer schemata structure. Unstructured texts on the Internet are modeled in a unified way, and a distributed parallel computing framework is used to extract knowledge and build a knowledge base quickly and accurately. The semantic dependency network captures the complete semantic information of the original text. First, multi-order semantic parsing is performed on the original text using a dedicated data structure, the "multi-order semantic tree", and each component of the text is annotated with lexical and syntactic information. Second, the semantic units of each component are extracted by noun phrase partitioning. Finally, entities are associated according to the original word order and syntactic structure, and are divided into layers by establishing similarity relationships and generic relationships among them, which yields the semantic dependency network. The network therefore expresses not only the word order and syntactic structure of the texts, but also their level of conceptual abstraction.

In addition, the semantic dependency network can be expanded horizontally and vertically using external prior knowledge. This thesis uses WordNet and Wikipedia as standard external knowledge sources to support semantic expansion. Expansion is verified by semantic fluency detection, so that potential information contained in the original text can be added to the network, which gives the semantic dependency network strong knowledge reasoning ability.

This thesis designs a framework for distributed knowledge extraction and knowledge fusion based on the semantic dependency network. By partitioning the network into semantic subgraphs and traversing them, both the explicit and the implicit knowledge contained in the network structure can be extracted. A coreference resolution algorithm based on lexical similarity detection and an entity disambiguation algorithm based on context similarity detection provide entity linking and entity equivalence judgment, which eliminates redundant and inconsistent knowledge and completes knowledge fusion. The Markov clustering algorithm is then used to cluster the knowledge triples by relationship type, and the central knowledge of each cluster is used to calculate the confidence of the triples within that cluster. After screening, a large-scale, high-quality knowledge base is generated.

Finally, this thesis designs and implements a distributed knowledge extraction system based on the above technical scheme, which supports fast knowledge organization, reasoning, extraction, and fusion based on the semantic dependency network. Performance tests on the NYT, Wiki, and ReVerb data sets, compared against other advanced knowledge extraction systems, show that the method improves accuracy by about 15% and increases the average number of extraction results by about 1.0 times.
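The clustering step mentioned in the abstract can be illustrated with a minimal sketch of the standard Markov clustering (MCL) procedure: alternate expansion (matrix squaring) and inflation (element-wise power followed by column normalization) on a similarity graph until the matrix stops changing. This is not the thesis's implementation. The abstract does not say how similarity between triples is computed, so the similarity matrix, the example triples, and every function and parameter name below are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def normalize_columns(m: Matrix) -> Matrix:
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m: Matrix) -> Matrix:
    """Expansion step: square the matrix (a two-step random walk)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m: Matrix, power: float) -> Matrix:
    """Inflation step: element-wise power, then column normalization."""
    return normalize_columns([[v ** power for v in row] for row in m])


def markov_cluster(similarity: Matrix, inflation: float = 2.0,
                   max_iter: int = 100, tol: float = 1e-9) -> List[List[int]]:
    """Cluster node indices with MCL; returns a sorted list of clusters."""
    n = len(similarity)
    # Put a self-loop on every node, as is customary for MCL.
    m = [[1.0 if i == j else float(similarity[i][j]) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max((abs(nxt[i][j] - m[i][j])
                     for i in range(n) for j in range(n)), default=0.0)
        m = nxt
        if delta < tol:
            break
    # Rows that keep non-zero mass belong to attractors; the columns with
    # mass in such a row are the members of that attractor's cluster.
    clusters = set()
    for row in m:
        members = tuple(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(list(c) for c in clusters)


# Hypothetical triples: three share one relation type, three share another.
triples = [
    ("Alice", "born_in", "Paris"),
    ("Bob", "born_in", "Rome"),
    ("Carol", "born_in", "Oslo"),
    ("Alice", "works_for", "Acme"),
    ("Bob", "works_for", "Initech"),
    ("Carol", "works_for", "Globex"),
]
# Assumed similarity: 1 when two triples share a relation type, else 0.
similarity = [[1.0 if a[1] == b[1] else 0.0 for b in triples] for a in triples]
clusters = markov_cluster(similarity)
for cluster in clusters:
    print([triples[i] for i in cluster])
```

The inflation parameter controls granularity: larger values tend to produce more, smaller clusters. A real system would replace the 0/1 similarity above with a graded measure and would need a sparse matrix representation to scale.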
Keywords/Search Tags: Semantic web, Semantic dependency network, Knowledge extraction, Knowledge fusion, Intelligent system