
Distributed Representation Of Knowledge Graphs

Posted on: 2020-11-19    Degree: Doctor    Type: Dissertation
Country: China    Candidate: X Han    Full Text: PDF
GTID: 1368330575956433    Subject: Information and Communication Engineering
Abstract/Summary:
Knowledge graphs are giant semantic networks in which nodes are entities and edges are relations, representing the semantic associations between different nodes in the form of large-scale graphs. As a structured storage form for real-world data from multiple areas, knowledge graphs make it possible for artificial intelligence (AI) to make better use of data. At present, AI is gradually evolving toward cognitive intelligence. Cognitive intelligence is no longer content with results learned from big data by statistical machine learning; it is more concerned with the interpretability of the learned results and the knowledge contained in big data. How to transform data into knowledge has thus become an urgent problem for cognitive intelligence. To this end, building on the development of traditional knowledge engineering and benefiting from the accumulation of massive data in the Web 3.0 era, knowledge graphs empower cognitive intelligence with their large scale, interpretability, and support for inference, and have been widely used in intelligent search, automatic question answering, interpretable recommendation, and so on. That is to say, the knowledge graph is an important engine driving the development of AI toward cognitive intelligence.

Different from the knowledge bases established by traditional knowledge engineering, knowledge graphs are often large and sparse, which poses challenges for their representation. Knowledge bases typically use symbolic representation, in which each node and edge is represented as a unique symbol. Symbolic representation can clearly identify nodes and edges, but when applied to knowledge graphs it raises the following problems: first, it cannot adapt to the growing scale of knowledge graphs; second, it cannot measure the semantic correlation between nodes; third, it restricts the application of knowledge graphs in other AI fields. In recent years, inspired by distributed representation in the natural language processing (NLP) area, the distributed representation of knowledge graphs has provided solutions to the above problems. This thesis therefore focuses on the distributed representation of knowledge graphs and makes contributions in modeling their local and global structural characteristics.

Knowledge graphs use triples as the basic storage unit. This thesis obtains the local structural characteristics of knowledge graphs by learning the internal constraint characteristics of each triple. Considering the good learning ability of neural networks in word-vector learning, and their flexibility and self-learning ability in modeling interactions, we carry out the following work:

(1) Distributed representation learning of knowledge graphs based on a three-branch neural network: Based on the structure of triples, we propose a neural network topology consisting of three parallel branches, corresponding to the three elements of each triple. Through the design of the connections between the branches and the learning of the connection weights, we capture the interaction between the entities and the relation of a triple, overcoming the limitations of existing methods in modeling this interaction. Finally, we model the confidence score of a triple based on the similarity of the outputs of the three branches.

(2) Distributed representation learning of knowledge graphs based on pseudo-siamese networks: There is a correspondence between a triple and a question-answer pair in factual simple question answering. For example, the triple (China, capital, Beijing) can be regarded as the abstract of the factual question "Where is the capital of China?"
and its answer "Beijing". That is to say, the combination of the head entity and the relation corresponds to the question, and the tail entity corresponds to the answer. Based on this characteristic of triples, we split each triple into two parts, (head entity, relation) and tail entity, transform the two parts into the same feature space through a pseudo-siamese network, and calculate the similarity between them in this feature space. In addition, by constructing inverse relations, we build new training samples of the form "(tail entity, inverse relation, head entity)", expanding the number of training samples and improving the learning results of the model.

Although the constraint characteristics within triples reflect the local structural characteristics of knowledge graphs, they do not fully capture the global structural characteristics. In a knowledge graph, two nodes can be connected not only by a direct relation but also by multi-hop sequences, and these multi-hop sequences usually contain semantic information similar to that of the corresponding triple. To further learn the global structural characteristics of knowledge graphs, we apply the correlation between multi-hop sequences and triples to distributed representation learning. The specific work is as follows:

(1) Graph embedding based on the generalization of recurrent neural networks: From the perspective of general graphs, we propose the concept of subgraph similarity to describe the similarity between multi-hop sequences and the corresponding triple within the same subgraph. We generalize recurrent neural networks to graphs to obtain the distributed representations of the multi-hop sequences and the triple, and then model the subgraph similarity in the embedding vector space. Different from existing methods that focus only on relation sequences, this thesis considers a complete multi-hop sequence containing both entities and relations to avoid the ambiguity caused by missing information.

(2) Distributed representation learning of knowledge graphs based on subgraph-aware proximity: We further consider the structure of the sequences of knowledge graphs based on subgraph-aware proximity. Each multi-hop sequence of a knowledge graph is an alternating arrangement of entities and relations, which can be decomposed into an entity sub-sequence and a relation sub-sequence. To reflect this structural characteristic of multi-hop sequences, we propose a dilated recurrent neural network that matches their special structure. In addition, considering that a multi-hop sequence correlates with different triples to different degrees, we propose a sequence-level attention mechanism to learn the correlation weights between the multi-hop sequence and the triples.

The distributed representation of knowledge graphs can be used in intelligent search, recommendation, and automatic question answering. To verify the performance of the proposed algorithms, we use the distributed representations obtained by each algorithm to perform link prediction and node classification tasks. The experimental results show that the proposed distributed representation algorithms improve on the performance of existing algorithms in different respects.
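The pseudo-siamese idea above can be sketched in a few lines: split a triple into a "question" part (head entity, relation) and an "answer" part (tail entity), map both into a shared space, and score by similarity; inverse relations then double the training set. This is a minimal illustrative sketch with random embeddings and hypothetical names (the thesis's actual branch architectures and training objective are not specified here).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8

# Toy vocabulary; embeddings are randomly initialised here, whereas the
# thesis would learn them from data.
entities = {"China": 0, "Beijing": 1}
relations = {"capital": 0}
E = rng.normal(size=(len(entities), DIM))
R = rng.normal(size=(len(relations), DIM))

# Two branch projections into a shared feature space.
W_q = rng.normal(size=(2 * DIM, DIM))   # "question" branch: head + relation
W_a = rng.normal(size=(DIM, DIM))       # "answer" branch: tail

def score(head, rel, tail):
    """Cosine similarity of the two branch outputs; higher = more plausible."""
    q = np.concatenate([E[entities[head]], R[relations[rel]]]) @ W_q
    a = E[entities[tail]] @ W_a
    return float(q @ a / (np.linalg.norm(q) * np.linalg.norm(a)))

def augment_with_inverse(triples):
    """Add (tail, rel^-1, head) samples, mirroring the thesis's augmentation."""
    return triples + [(t, r + "^-1", h) for (h, r, t) in triples]

triples = [("China", "capital", "Beijing")]
print(augment_with_inverse(triples))
# [('China', 'capital', 'Beijing'), ('Beijing', 'capital^-1', 'China')]
s = score("China", "capital", "Beijing")
```

With untrained random projections the score is meaningless; training would push scores of true triples above those of corrupted ones.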
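The subgraph-similarity idea can likewise be sketched: encode the alternating entity/relation path with a recurrent cell and compare the result to the encoding of the direct triple. The vanilla RNN cell below is a stand-in assumption; the thesis's generalised and dilated recurrent networks are more elaborate.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8

# Shared recurrent cell parameters (small init to keep tanh well-behaved).
W_x = rng.normal(size=(DIM, DIM)) * 0.1
W_h = rng.normal(size=(DIM, DIM)) * 0.1

def encode_path(vectors):
    """Fold a multi-hop sequence (e1, r1, e2, r2, ..., en) into one vector."""
    h = np.zeros(DIM)
    for x in vectors:
        h = np.tanh(x @ W_x + h @ W_h)
    return h

def subgraph_similarity(path_vecs, triple_vecs):
    """Cosine similarity between a multi-hop path and the direct triple."""
    p, t = encode_path(path_vecs), encode_path(triple_vecs)
    return float(p @ t / (np.linalg.norm(p) * np.linalg.norm(t)))

# Toy embeddings for a 2-hop path and the triple it should approximate;
# training would make this similarity high for semantically equivalent pairs.
path = [rng.normal(size=DIM) for _ in range(5)]     # e1, r1, e2, r2, e3
triple = [path[0], rng.normal(size=DIM), path[-1]]  # e1, r, e3
sim = subgraph_similarity(path, triple)
```

Note that the path here contains both entities and relations, matching the thesis's argument against relation-only sequences.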
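Link prediction, the evaluation task mentioned above, is conventionally measured by ranking every candidate tail entity for a query (head, relation, ?) and reporting mean reciprocal rank (MRR) and Hits@k. A minimal sketch with a hypothetical toy scorer:

```python
def rank_of_true_tail(score_fn, head, rel, true_tail, all_entities):
    """Rank of the correct tail among all candidates (1 = best)."""
    scores = {e: score_fn(head, rel, e) for e in all_entities}
    ordered = sorted(all_entities, key=lambda e: scores[e], reverse=True)
    return ordered.index(true_tail) + 1

def mrr_and_hits(ranks, k=10):
    """Mean reciprocal rank and Hits@k over a list of test-query ranks."""
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = sum(r <= k for r in ranks) / len(ranks)
    return mrr, hits

# Toy scorer: pretend the model scores the correct tail highest.
toy_scores = {"Beijing": 0.9, "Shanghai": 0.4, "Paris": 0.1}
rank = rank_of_true_tail(lambda h, r, t: toy_scores[t],
                         "China", "capital", "Beijing", list(toy_scores))
mrr, hits = mrr_and_hits([rank], k=1)
print(rank, mrr, hits)  # 1 1.0 1.0
```

In practice, candidates already connected to the query by other true triples are filtered out before ranking, so a model is not penalised for preferring another correct answer.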
Keywords/Search Tags:knowledge graph, distributed representation, embedding vector, pseudo-siamese network, recurrent neural networks