
Research On The Enhancement Method Of BERT Model Based On Knowledge Graph

Posted on: 2022-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: J R Lu
Full Text: PDF
GTID: 2518306782952679
Subject: Journalism and Media
Abstract/Summary:
In recent years, thanks to powerful computing hardware and the rapid development of deep learning, natural language models represented by BERT have taken center stage, achieving state-of-the-art (SOTA) results on natural language processing benchmarks such as GLUE, SQuAD, and RACE. However, although BERT performs well in the general domain, it is limited by the size of its pre-training corpus and lacks domain-specific factual knowledge, which restricts its performance on specialized tasks. Some scholars have therefore proposed knowledge-enhanced BERT models, which introduce external expert knowledge to compensate for the model's lack of knowledge in downstream tasks across different fields. Because these models rely on large-scale external knowledge, their computing resource requirements and training time rise sharply. In addition, the external knowledge injected into BERT is often learned insufficiently or incompletely, and the noise introduced in the process affects the stability and generalization of the model. To address these problems, this thesis studies two kinds of knowledge enhancement methods: methods for quickly introducing exogenous knowledge, and methods for enhancing the model's cognition and generalization ability.

First, this thesis surveys how academia has used knowledge graph technology to enhance the BERT model, and proposes a knowledge enhancement method named Corpus Associated Generation. The method is described formally, and an analysis of the algorithm demonstrates its advantage in time complexity. Experiments then test the accuracy and precision of the Corpus Associated Generation method and compare it with related methods. The results show that the method effectively reduces the computational cost of introducing external knowledge into the BERT model, cutting computation time by 53.5% in the knowledge introduction stage and by 37.4% in the training stage.

Second, to further improve the BERT model's ability to use exogenous knowledge and to reduce overfitting during training, this thesis proposes modifying the model with the incremental learning task dEA and the gradient optimization algorithm ChildTuning-F. The dEA task reuses the original BERT model's task and structure, combining them with knowledge entities from an external knowledge base to perform incremental learning on the model. Meanwhile, the optimizer is modified to adjust the gradient computation during training; theoretical analysis shows that this optimization algorithm can effectively improve the generalization ability of the model. Experiments verify that combining incremental learning with gradient optimization further improves accuracy: accuracy increases by 1.87% on average compared with BERT, and overfitting is alleviated on small-scale data tasks.

Finally, in a project instance, the BERT model is used to build the core module of an intelligent question answering system, and is enhanced with extra knowledge from the knowledge graph and the two aforementioned methods to strengthen its knowledge of the professional field. Experiments on a professional-field data set show that the BERT model enhanced with both methods improves accuracy by 10.07% compared with the original version. The effectiveness of the intelligent question answering system was then verified in a real-world scenario.
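
The gradient-masking idea behind the Child-Tuning optimizer mentioned in the abstract can be illustrated with a minimal sketch. It assumes the thesis refers to the task-free variant of Child-Tuning, in which each gradient entry is kept with probability p_f and rescaled by 1/p_f before the update. The function name `child_tuning_f_step`, its parameters, and the use of plain SGD on a flat list of weights are illustrative assumptions, not the thesis implementation.

```python
import random


def child_tuning_f_step(params, grads, lr=0.1, p_f=0.3, rng=None):
    """One SGD-style update with a random (task-free) gradient mask.

    Hypothetical helper for illustration only. Each gradient entry is
    kept with probability p_f and rescaled by 1 / p_f, so the masked
    gradient equals the full gradient in expectation; the remaining
    entries are zeroed, leaving those weights unchanged for this step.
    """
    if not 0.0 < p_f <= 1.0:
        raise ValueError("p_f must be in (0, 1]")
    rng = rng or random.Random()
    updated = []
    for w, g in zip(params, grads):
        keep = rng.random() < p_f          # Bernoulli(p_f) mask entry
        masked = g / p_f if keep else 0.0  # rescale kept gradients
        updated.append(w - lr * masked)
    return updated


# Usage: with p_f = 1.0 every entry is kept, so this is ordinary SGD.
weights = child_tuning_f_step([1.0, 2.0], [0.5, -0.5], lr=0.1, p_f=1.0)
```

Updating only a random subset of weights at each step acts as a regularizer, which is consistent with the abstract's claim that the modified optimizer improves generalization on small data sets. A realistic setup would apply the same mask inside an adaptive optimizer rather than plain SGD.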
Keywords/Search Tags: natural language processing, knowledge graph, knowledge enhancement, incremental learning, gradient optimization