
Research on Lightweight Knowledge Representation Models and Training Sample Optimization Methods

Posted on: 2024-05-30    Degree: Master    Type: Thesis
Country: China    Candidate: H T Xu    Full Text: PDF
GTID: 2568306932462974    Subject: Control Science and Engineering
Abstract/Summary:
The knowledge graph (KG) is an important supporting technology in the field of artificial intelligence, and knowledge representation learning (KRL) is one of its core tasks. Faced with the large storage and computing demands of current artificial intelligence models, building lightweight models has become an urgent problem. At the same time, model development depends on large amounts of high-quality data, so using training samples efficiently has become crucial. This thesis studies lightweight modelling and training sample optimization in KRL and identifies two shortcomings. First, to improve performance, current knowledge graph embedding (KGE) models increase the embedding dimension of entities and relations, which consumes additional computing resources and training time. Second, current KGE models are trained on positive and negative samples, and because every triple that does not appear in the KG is treated as a negative sample, false negatives are inevitably introduced: some of these "negative" triples are actually true facts that are simply missing from the graph.

The main contributions of this thesis are as follows.

First, for model lightweighting, a KGE model based on self-knowledge distillation (SKD) is proposed. SKD improves performance without introducing a complex teacher model, because the model uses its own predictions to guide its training. Without changing the network structure, SKD requires only a small amount of additional computation; the number of model parameters does not increase, and training time is shortened. Performance can be maintained even when the number of parameters is reduced. The training process is further optimized by introducing knowledge adjustment and dynamic temperature distillation.

Second, for training sample optimization, a KGE model based on positive-unlabeled (PU) risk estimation is proposed. By introducing PU risk estimation into KGE training, the model avoids the false negative problem. A paired ranking mechanism is incorporated into the risk estimation, which exploits the useful information in unlabeled samples and further improves performance. A synthetic sample strategy is also introduced.
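To make the self-distillation idea concrete, the following is a minimal sketch of SKD with a dynamic temperature applied to a toy TransE-style scorer. The class names, the linear temperature schedule, and the loss weights are illustrative assumptions, not the thesis implementation.

```python
# Minimal sketch of self-knowledge distillation (SKD) for a KGE model.
# All class/parameter names are illustrative assumptions, not the thesis code.
import torch
import torch.nn.functional as F

class TransEScorer(torch.nn.Module):
    """Toy TransE-style scorer: score(h, r, t) = -||h + r - t||."""
    def __init__(self, n_entities, n_relations, dim=64):
        super().__init__()
        self.ent = torch.nn.Embedding(n_entities, dim)
        self.rel = torch.nn.Embedding(n_relations, dim)

    def forward(self, h, r, candidates):
        # Score every candidate tail entity for each (h, r) query.
        q = self.ent(h) + self.rel(r)                  # (B, dim)
        return -torch.cdist(q, self.ent(candidates))   # (B, n_candidates)

def skd_loss(student_logits, past_logits, targets, epoch, max_epoch,
             t_min=1.0, t_max=4.0, alpha=0.5):
    """Cross-entropy on the true tails plus KL divergence against the model's
    own earlier (detached) predictions; the temperature decays over training,
    one simple realisation of a 'dynamic temperature'."""
    temperature = t_max - (t_max - t_min) * epoch / max_epoch
    ce = F.cross_entropy(student_logits, targets)
    soft_teacher = F.softmax(past_logits.detach() / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    return (1 - alpha) * ce + alpha * kd
```

Because the "teacher" signal is the model's own earlier output, no extra network is stored; only the softened logits from a previous pass are kept, which matches the abstract's claim that parameters do not increase.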
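Similarly, the sketch below shows one common form of non-negative PU risk estimation adapted to KGE-style scores, where observed triples are positives and corrupted triples are treated as unlabeled rather than as guaranteed negatives. The surrogate loss, the class prior `pi_p`, and the function names are assumptions for illustration; the thesis additionally combines the risk estimate with a paired ranking mechanism and a synthetic sample strategy, which are not shown here.

```python
# Minimal sketch of a non-negative PU (positive-unlabeled) risk estimate for
# KGE training. Observed triples supply `pos_scores`; corrupted triples supply
# `unl_scores` and are treated as unlabeled, not as true negatives.
import torch

def sigmoid_loss(scores, sign):
    # Surrogate loss: small when sign * score is large and positive.
    return torch.sigmoid(-sign * scores).mean()

def nn_pu_risk(pos_scores, unl_scores, pi_p=0.05):
    """Non-negative PU risk in the style of Kiryo et al. (2017):
    R = pi * E_p[l(f(x), +1)] + max(0, E_u[l(f(x), -1)] - pi * E_p[l(f(x), -1)]),
    where pi_p is the assumed prior probability that an unlabeled triple is true."""
    risk_pos = pi_p * sigmoid_loss(pos_scores, +1.0)
    risk_neg = sigmoid_loss(unl_scores, -1.0) - pi_p * sigmoid_loss(pos_scores, -1.0)
    return risk_pos + torch.clamp(risk_neg, min=0.0)
```

The correction term subtracts the expected contribution of true-but-unobserved triples from the "negative" risk, which is how a PU formulation sidesteps the false negative problem described above.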
Keywords: knowledge representation learning, link prediction, self-knowledge distillation, positive-unlabeled risk estimation