
Research On Knowledge Graph Representation Learning Based On Low-dimensional Vector Space

Posted on: 2023-11-05  Degree: Doctor  Type: Dissertation
Country: China  Candidate: K Wang  Full Text: PDF
GTID: 1528307031978189  Subject: Software engineering
Abstract/Summary:
Knowledge graph representation learning has drawn great attention in the Artificial Intelligence (AI) and Knowledge Graph (KG) communities. It aims to represent the entities and relations of a KG as low-dimensional real-valued embedding vectors via a Knowledge Graph Embedding (KGE) model. By operating on these entity and relation vectors, link prediction based on KGE models can predict the missing element of a KG triple, and thus has significant potential for automatic KG completion and KG reasoning. Effective knowledge graph representation learning serves as a data channel between discrete knowledge graphs and deep neural networks, greatly enhances the application value of KGs in various AI tasks, and lays the foundation for further progress in cognitive intelligence and even artificial general intelligence.

To improve prediction accuracy, recent research tends to propose large high-dimensional models. These models typically represent each entity with a vector of hundreds or even thousands of dimensions, yet achieve only small accuracy gains on benchmark datasets. Moreover, they incur substantial training and storage costs on large-scale KGs with millions or billions of entities. This prevents downstream AI applications from promptly updating KG embeddings or being deployed on resource-limited edge devices, and limits the research progress of knowledge graph representation learning.

To this end, this thesis studies knowledge graph representation learning under the constraint of a low-dimensional vector space. While avoiding parameter explosion and meeting the needs of practical applications, it analyzes the factors that influence model cost and accuracy, optimizes the key technical components of KGE models, breaks through the performance bottleneck in low-dimensional vector space, and realizes a lightweight, accurate, and low-cost knowledge graph representation learning solution. The research content of this thesis comprises the following three parts:

1. A KGE enhancement framework based on multi-source information integration. First, to address the unbalanced distribution of external entity information, this thesis proposes CoNE, a KGE enhancement framework based on composite neighbors, which integrates entity features from textual descriptions and topological neighbors by constructing composite neighbor information. An encoder based on deep memory networks is designed to encode the composite neighbor information and enhance the entity embedding vectors of the KGE model. Second, considering the limited capacity of low-dimensional vector spaces and the high training cost of high-dimensional teacher models, this thesis uses knowledge distillation to enhance the training label sequence of the KGE model: a multi-teacher active distillation framework integrates the predictions of multiple pre-trained low-dimensional models to provide effective supervision for the student model. Experimental results show that this method effectively improves both the prediction accuracy and the training speed of low-dimensional KGE models.

2. Efficient KGE models based on low-dimensional Euclidean vector space. First, given the high computational complexity of existing hyperbolic geometric models, this thesis proposes two lightweight Euclidean knowledge graph embedding models. RotL simplifies the hyperbolic operations while retaining the flexible normalization effect of hyperbolic models; Rot2L adopts a double-stacked rotation-translation transformation module, which improves model performance while keeping computational complexity low. Second, considering the limited prediction accuracy of existing models and the difficulty of evaluating triple confidence, this thesis proposes a new confidence measure based on causal intervention theory, called Neighborhood Intervention Consistency: the dimension values of the input entity vector are actively intervened on to construct multiple neighborhood intervention vectors of the input entity, and prediction confidence is then inferred by evaluating the robustness of the KGE model's predictions before and after the intervention.

3. KGE training strategies based on insights from contrastive learning. First, this thesis analyzes in depth the relationship between knowledge representation learning and self-supervised contrastive learning and, building on recent analytical results in the contrastive learning field, proposes a new KGE training strategy, HaLE. To address the long training times and unstable training gradients caused by existing negative-sampling loss functions, it designs a new loss function based on query sampling that efficiently achieves two important training goals: feature alignment of positive samples and uniformity of the entity distribution. Second, this thesis analyzes the hardness-aware ability of the nonlinear functions in low-dimensional hyperbolic models, and accordingly proposes a lightweight hardness-aware activation mechanism that helps KGE models focus on difficult instances and speeds up convergence. Experimental results show that models trained with the HaLE strategy obtain high prediction accuracy after a short training time, achieving performance close to existing state-of-the-art models in both low-dimensional and high-dimensional settings.
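The abstract does not specify the exact scoring function or intervention scheme used in the thesis. As an illustrative sketch only, the following uses a classic TransE-style score and Gaussian perturbations of the entity vector (both assumptions, not the thesis's actual models) to show how an intervention-consistency confidence for a link-prediction query (h, r, ?) could be computed:

```python
import numpy as np

def transe_scores(h, r, E):
    """Score every candidate tail for the query (h, r, ?) with a
    TransE-style distance: higher score = more plausible triple.
    E is the (n_entities, dim) entity embedding matrix."""
    return -np.linalg.norm(h + r - E, axis=1)

def nic_confidence(h, r, E, n_interventions=20, sigma=0.05, seed=None):
    """Neighborhood-intervention-style confidence sketch: perturb the
    head embedding several times, re-rank all candidate tails, and
    report how often the original top prediction survives."""
    rng = np.random.default_rng(seed)
    base_top = np.argmax(transe_scores(h, r, E))  # prediction before intervention
    hits = 0
    for _ in range(n_interventions):
        # hypothetical intervention: small Gaussian noise on each dimension
        h_pert = h + rng.normal(0.0, sigma, size=h.shape)
        if np.argmax(transe_scores(h_pert, r, E)) == base_top:
            hits += 1
    return hits / n_interventions  # in [0, 1]; 1.0 = fully robust prediction

# toy usage in a low-dimensional (32-dim) space
rng = np.random.default_rng(0)
E = rng.normal(size=(100, 32))     # 100 entities
h, r = E[0], rng.normal(size=32)
conf = nic_confidence(h, r, E, seed=1)
print(conf)
```

A confidence near 1.0 means the prediction is stable under small interventions on the input entity, which is the intuition behind using prediction robustness as a triple-confidence signal.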
Keywords/Search Tags:Knowledge Graph Representation Learning, Knowledge Graph Embedding, Knowledge Graph, Link Prediction