
Research On Embedding Methods For Knowledge Graph Completion

Posted on: 2022-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: X H Zhao
GTID: 2518306326971559
Subject: Software engineering
Abstract/Summary:
A knowledge graph is a semantic network composed of entities and the relationships between them; it describes real-world facts in the form of triples and has become an important resource for artificial intelligence applications. However, existing knowledge graphs are often incomplete. Much research therefore aims at knowledge graph completion, that is, inferring new fact triples from the existing ones and adding them to the graph. Knowledge graph embedding is one of the most effective approaches to completion, and the Trans-series translation models, represented by TransE, generalize well and can complete knowledge graphs through the link prediction and triple classification tasks. These models map the entities and relations in fact triples into a semantic space and represent them as low-dimensional dense vectors. However, problems remain in entity modeling and negative triple generation, such as inaccurate entity and relation embedding vectors, inadequate semantic representation, and poor-quality negative triples. Based on this analysis, the main research contents of this thesis are as follows:

(1) To address the problem that some training results are inconsistent with the optimization objective when the TransC model embeds concept entities, a knowledge graph embedding method based on information content (IC), TransIC, is proposed. Building on TransC, each concept entity is modeled as a sphere whose radius is obtained from an IC calculation model. This effectively improves the accuracy of the training results and more fully exploits the semantic information content of concepts.

(2) Trans-series models need to generate negative triples during training, and the quality of these triples greatly affects the embedding quality of the model and its ability to capture entity features. To generate high-quality negative triples, the ICNS (Information Content Negative Sampling) method is proposed. The semantic similarity between the replaced entity and the replacement entity is calculated from their information content, and the quality of negative triples is improved by setting consecutive segmentation thresholds on the semantic similarity value, which effectively solves the problem of ineffective training caused by low-quality negative triples.

(3) ICNS is combined with the TransE and TransH models to obtain TransE-ICNS and TransH-ICNS, in which all negative triples are generated by ICNS. TransC-ICNS is obtained by combining ICNS with TransC; there, ICNS handles negative sampling for subClassOf triples only, while the negative sampling for instanceOf and relational triples is unchanged. Comparison experiments on public datasets show that the proposed methods TransIC, TransE-ICNS, TransH-ICNS, and TransC-ICNS all achieve better performance than the other models on the two knowledge graph completion tasks, link prediction and triple classification. In particular, TransH-ICNS and TransC-ICNS improve significantly on all link prediction metrics.
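The translation idea behind TransE and the role of similarity-aware negative sampling can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the IC calculation model or the threshold values, so the similarity table `TOY_SIM`, the band `[low, high)`, the toy embeddings, and all function names below are hypothetical placeholders. Only the TransE distance `||h + r - t||` and the margin-based ranking loss follow the standard formulation.

```python
import math
import random

def transe_score(h, r, t):
    """L2 distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin-based ranking loss for one positive/negative pair."""
    return max(0.0, margin + pos_score - neg_score)

def sample_negative_tail(triple, entities, known, similarity, low, high, rng):
    """Corrupt the tail with an entity whose similarity to the original tail
    lies in [low, high). Known true triples are never returned. If no
    candidate falls in the band, fall back to uniform sampling."""
    h, r, t = triple
    candidates = [e for e in entities if e != t and (h, r, e) not in known]
    in_band = [e for e in candidates if low <= similarity(t, e) < high]
    pool = in_band or candidates
    return (h, r, rng.choice(pool))

# Hypothetical similarity values standing in for an IC-based measure.
TOY_SIM = {
    frozenset(("animal", "cat")): 0.8,
    frozenset(("animal", "dog")): 0.8,
    frozenset(("animal", "plant")): 0.6,
    frozenset(("animal", "vehicle")): 0.2,
}

def toy_similarity(a, b):
    """Symmetric lookup; unknown pairs default to 0.0 (assumption)."""
    return TOY_SIM.get(frozenset((a, b)), 0.0)

# Toy 2-dimensional embeddings (illustrative values only).
EMB = {
    "cat": [0.0, 0.0],
    "dog": [0.1, 0.0],
    "animal": [1.0, 0.0],
    "plant": [1.0, 2.0],
    "vehicle": [5.0, 5.0],
}
REL = {"is_a": [1.0, 0.0]}

positive = ("cat", "is_a", "animal")
known = {positive}
rng = random.Random(0)

# Only "plant" has a similarity to "animal" inside the band [0.4, 0.7).
negative = sample_negative_tail(
    positive, list(EMB), known, toy_similarity, low=0.4, high=0.7, rng=rng
)

pos_score = transe_score(EMB[positive[0]], REL[positive[1]], EMB[positive[2]])
neg_score = transe_score(EMB[negative[0]], REL[negative[1]], EMB[negative[2]])
loss = margin_loss(pos_score, neg_score)
print(negative, pos_score, neg_score, loss)
```

Restricting replacements to a similarity band is one plausible reading of "segmentation thresholds": candidates that are too dissimilar give trivially easy negatives, while candidates that are too similar risk being true but unrecorded facts. Which bands the thesis actually uses is not stated in this abstract.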
Keywords/Search Tags: knowledge graph completion, knowledge graph embedding, information content, link prediction, triple classification