
Research On Entity Relation Extraction Based On Knowledge Distillation And Adversarial Training

Posted on: 2022-09-15  Degree: Master  Type: Thesis
Country: China  Candidate: M Wang  Full Text: PDF
GTID: 2518306563464164  Subject: Computer technology
Abstract/Summary:
Entity relation extraction is a fundamental task in natural language processing and plays an important role in several downstream tasks, such as knowledge graph construction, question answering systems, and information retrieval. When pre-trained language models are introduced to encode features for entity relation extraction, their large scale incurs an enormous training cost in time and GPUs. To address this problem, a knowledge distillation method is introduced into our baseline model, in which the output of a deep neural network (teacher model) guides a shallow neural network (student model); a sketch of this distillation objective is given after the abstract. This method improves the performance of the student model without increasing its training cost, which is particularly effective when computing resources are limited. On the other hand, to enhance the robustness of the model and further improve extraction performance, adversarial training is applied to the extraction model by adding adversarial perturbations to the embedding output of the teacher model. This thesis therefore focuses on two problems of entity relation extraction models: large resource consumption during training and weak robustness. The main contributions of this work are as follows:

(1) To reduce the large training cost (time and GPUs) of pre-trained language models, we propose an LSTM-based CASREL entity relation extraction model with knowledge distillation (LSTMCasKd). The model is built from the BERT-based CASREL (teacher model) and the LSTM-based CASREL_LSTM (student model), and a knowledge distillation strategy is introduced into this basic model. LSTMCasKd learns rich semantic information from the teacher model's output probabilities, which improves the performance of the student model. Experiments show that the F1 score of LSTMCasKd is 0.8% higher than that of CASREL_LSTM on WebNLG, and its precision is 2.7% higher than that of CASREL_LSTM on NYT; its F1 score also surpasses that of existing models.

(2) To address the weak robustness of entity relation extraction models, we propose a BERT-based CASREL entity relation extraction model with adversarial training (BERTCasAdv). To study the effect of combining knowledge distillation with adversarial training, we further propose an LSTM-based CASREL model with both adversarial training and knowledge distillation (LSTMCasAdvKd). Building on CASREL, BERTCasAdv introduces the fast gradient method (FGM), an adversarial training method that adds an adversarial perturbation to the output of BERT's word embedding layer to enhance the robustness of the model (see the FGM sketch after the abstract). The LSTMCasAdvKd model distills knowledge from the teacher model into a student model trained with adversarial training. Experimental results show that BERTCasAdv outperforms CASREL, with the F1 score increasing by 0.3% on both NYT and WebNLG, and compared with existing models it achieves the best F1 score on NYT; the F1 score of LSTMCasAdvKd is 0.3% higher than that of LSTMCasKd, and 0.5% higher on NYT.
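As a hedged illustration of the knowledge distillation strategy described above, the sketch below trains a student against a temperature-softened copy of the teacher's output distribution together with the gold labels. This is a minimal PyTorch sketch, not the thesis's actual code; the function name, temperature T, and weight alpha are illustrative assumptions.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: the student mimics the teacher's temperature-softened
    # output distribution (the "rich semantic information" mentioned above).
    # T and alpha are hypothetical hyperparameters, not values from the thesis.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard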
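The fast gradient method (FGM) used in BERTCasAdv perturbs the embedding weights along the gradient direction. Below is a minimal sketch under stated assumptions: a PyTorch model whose embedding parameters contain "embedding" in their names; the epsilon value and the name filter are illustrative, not values from the thesis.

import torch

class FGM:
    def __init__(self, model, epsilon=1.0):
        self.model = model
        self.epsilon = epsilon
        self.backup = {}

    def attack(self, name_filter="embedding"):
        # Add an L2-normalized adversarial perturbation to the embedding weights.
        for name, param in self.model.named_parameters():
            if param.requires_grad and name_filter in name and param.grad is not None:
                self.backup[name] = param.data.clone()
                norm = torch.norm(param.grad)
                if norm != 0:
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        # Remove the perturbation before the optimizer step.
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}

In a training step, one would run the normal forward and backward pass, call attack(), run a second forward and backward pass so the adversarial gradients accumulate, call restore(), and only then step the optimizer.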
Keywords/Search Tags:Entity Relation Extraction, Knowledge Distillation, Adversarial Training