Research And Application Of Learning Methods For Knowledge Graphs

Posted on: 2021-01-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Sheng
GTID: 1368330611977308
Subject: Computer system architecture
Abstract/Summary:
With the in-depth development of cognitive intelligence technology, the knowledge graph has become a significant form of knowledge representation in the era of big data. In many vertical fields, knowledge graphs have been put forward to support practical application scenarios including data analysis, smart search, intelligent recommendation, and natural human-computer interaction. At the same time, as the backbone of achieving cognitive intelligence for machines, the knowledge graph is a hot research topic in artificial intelligence. This dissertation focuses on several vital theoretical issues that underpin knowledge graph construction and intelligent applications, and carries out key technical research and empirical analysis on them. Automatic relation extraction from the open-domain environment is a foundation for building a knowledge graph, and accurately identifying hypernym-hyponym relations is an essential foundation for extending the knowledge hierarchy in the vertical direction. Representation learning is a significant way to obtain a numerical representation of the knowledge graph, so that a machine can process and apply the graph through that representation in knowledge-related computation. Constructing a knowledge graph from a text corpus remains a great challenge. The research contents and main contributions of this dissertation can be summarized as follows.

Firstly, to address open-domain relation extraction, we propose an open relation extraction model based on syntactic analysis. The model adopts a rule-enhanced syntax analysis method to improve its capacity to analyze sentence structure, so that it obtains more triples with high-quality relational phrases. Furthermore, we propose a measure of relation strength and use it to select the representative triples with greater relation strength as the final extraction results. Experiments on four real-world open-domain datasets show that our method is unsupervised and automated, and that it can adapt to heterogeneous text corpora at a certain scale. Compared with several representative baseline methods, our model improves the performance of open relation extraction.

Secondly, to address the accurate identification of hypernym-hyponym relations in the knowledge graph, we propose a definition-driven hypernym-hyponym relation prediction model that makes full use of two high-quality external knowledge bases, WordNet and English Wikipedia. These provide textual definitions as the primary evidence for the two candidate terms in a hypernym-hyponym relational triple. On the one hand, the model introduces high-value textual knowledge to extend the semantic context of terms, making up for a limitation of existing methods, which learn term embeddings from the context of a training corpus with insufficient characteristics and domain independence. This also provides a richer interpretation of domain-specific or ambiguous candidates. On the other hand, the model jointly models (term, definition) pairs to mine the implicit features of the hypernym-hyponym relation from their semantic contexts. Finally, the model is trained end to end, which sidesteps the limitation of traditional predictive models that first learn term embeddings and then learn a binary classifier, and so makes more effective use of the training data. The experimental results show that our model is consistent in performance and generalization, and that it outperforms several competitive baseline methods on both open-domain and domain-specific datasets.

Thirdly, to address the completion and correction of missing links in the temporal-aware knowledge graph, we propose a two-phase temporal-aware knowledge graph completion model called TKGFrame, which builds on prior advanced work for the task. TKGFrame makes three extensions. First, it refines a new temporal evolving matrix to better model the evolving strength of pairwise relations that belong to the same relational chain along the timeline. Second, based on the embeddings of the temporal knowledge graph, it formulates the plausibility measure of missing temporal facts as a constrained optimization problem and solves it with an integer linear programming approach, which also avoids implausible predictions from the embedding results. Third, it integrates the above two models seamlessly into the proposed TKGFrame framework. The experimental results demonstrate that TKGFrame significantly outperforms several state-of-the-art baseline methods on both entity prediction and relation prediction tasks on three real-world temporal knowledge graph datasets.

Finally, news articles appearing online, which usually come from specific events or topics, have become a significant source from which individuals harvest information on the web. Unfortunately, in the real world, users tend to be overwhelmed by news reports that accumulate rapidly, are redundant, and are diverse in content, so that they cannot effectively perceive and grasp the representative facts in the news stream. To better address this problem, we propose a conceptual knowledge graph construction system called MuReX. The system relies on a series of complete and practical techniques, including an extraction strategy that combines multiple extractors, a two-stage candidate triple filtering approach based on an improved self-learning framework, a compatibility measure of facts, an importance measure of facts, and a heuristic strategy for knowledge graph construction. These techniques are integrated seamlessly into the unified MuReX framework, which produces a high-quality conceptual knowledge graph containing significant facts through five major modeling procedures: data preprocessing, candidate fact extraction, topic coherence estimation of candidates, compatibility measure of facts, and conceptual graph construction. With MuReX, users can quickly discern salient and meaningful facts and the progress of events under specific topics, as well as explore potential and new connections.
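
The integer linear programming step described for TKGFrame can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the dissertation's formulation: the candidate facts, their scores, and the two constraint types (a functional relation holds at most once per subject; `bornIn` must not come after `diedIn`) are all hypothetical, and exhaustive enumeration over 0/1 assignments stands in for a real ILP solver.

```python
from itertools import product

# Hypothetical candidate temporal facts: (subject, relation, object, year, score).
# The scores stand in for embedding-based plausibility; every value is made up.
CANDIDATES = [
    ("alice", "bornIn", "paris", 1950, 0.9),
    ("alice", "bornIn", "rome", 1952, 0.6),
    ("alice", "diedIn", "berlin", 1940, 0.7),
    ("alice", "diedIn", "madrid", 2010, 0.5),
]

# Assumed constraints (illustrative only).
FUNCTIONAL = {"bornIn", "diedIn"}   # at most one such fact per subject
ORDERED = [("bornIn", "diedIn")]    # the first relation must not come later


def feasible(selected):
    """Check every pair of selected facts against the assumed constraints."""
    for i, a in enumerate(selected):
        for b in selected[i + 1:]:
            if a[0] != b[0]:
                continue  # constraints only link facts about the same subject
            if a[1] == b[1] and a[1] in FUNCTIONAL:
                return False
            for first, later in ORDERED:
                if a[1] == first and b[1] == later and a[3] > b[3]:
                    return False
                if b[1] == first and a[1] == later and b[3] > a[3]:
                    return False
    return True


def solve(candidates):
    """Maximize the total score over 0/1 selections by brute-force enumeration."""
    best_score, best = 0.0, []
    for mask in product((0, 1), repeat=len(candidates)):
        chosen = [c for x, c in zip(mask, candidates) if x]
        if not feasible(chosen):
            continue
        score = sum(c[4] for c in chosen)
        if score > best_score:
            best_score, best = score, chosen
    return best, best_score


if __name__ == "__main__":
    facts, total = solve(CANDIDATES)
    for fact in facts:
        print(fact)
    print("total score:", total)
```

In this toy instance the `diedIn berlin` candidate has a high score on its own but is rejected because it would precede either birth year, which is the kind of implausible prediction the constraints are meant to exclude. Enumeration is exponential in the number of candidates, so a realistic setting would hand the same objective and constraints to a dedicated solver.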
Keywords/Search Tags:Learning method for knowledge graphs, open-domain relation extraction, hypernym-hyponym relation identification, link prediction, conceptual knowledge graph construction