
Research On Knowledge-Enhanced Pre-trained Language Models

Posted on: 2022-12-13    Degree: Master    Type: Thesis
Country: China    Candidate: Z R Cai    Full Text: PDF
GTID: 2518306776992519    Subject: Library Science and Digital Library
Abstract/Summary:
General-purpose deep pre-trained language models are trained on large-scale unsupervised corpora with carefully designed self-supervised pre-training tasks. With only simple fine-tuning on downstream datasets, they surpass the best previous models, bringing leapfrog development to natural language processing. However, although such models perform well on many downstream tasks, their performance remains unsatisfactory on domain-specific and strongly knowledge-driven tasks, leaving considerable room for improvement. With the recent development of large-scale knowledge graphs, some existing work proposes to enhance pre-trained language models with external knowledge. Research on knowledge-enhanced pre-trained language models is expected to approach human-level artificial intelligence and has high academic value and practical significance. However, current work lacks the ability to leverage heterogeneous multi-source knowledge graphs, rarely attends to the structured information within them, and thus under-utilizes the knowledge graphs. Moreover, from the perspective of model input and the utilization of external knowledge, no existing work considers whether the model truly understands the injected knowledge; as a result, the injected knowledge may fail to take effect as expected and may even degrade performance. Aiming to build more robust and efficient artificial intelligence, this thesis addresses these problems. Its main contributions are as follows:

1. Multi-source Knowledge Fusion Based on Graph Neural Networks. The focus of multi-source knowledge fusion is to effectively fuse and represent heterogeneous knowledge from multiple knowledge graphs. This thesis proposes a multi-source knowledge fusion method based on a graph neural network. First, a homogeneous graph is constructed for each knowledge graph individually; these graphs are then fused into a unified heterogeneous graph. A mixed graph attention mechanism refines the representation of each node, and the resulting high-quality whole-graph representation is fused back into the pre-trained language model through a location-specific gating mechanism, reducing the introduction of knowledge noise. The module achieves efficient multi-source knowledge fusion and representation, laying a solid foundation for subsequent reasoning over multi-source heterogeneous knowledge (see the gating sketch at the end of this abstract).

2. Structured Information Utilization with Knowledge Context. The structured information in a knowledge graph can be roughly regarded as the structure around a target entity formed by its neighboring entities and their relations. This thesis proposes the concept of a knowledge context, which uses this structured information to enhance entity representations. A pre-training task based on entity-neighbor hybrid attention and knowledge context modeling transfers the entity representations produced by the pre-trained language model to the surrounding neighbor entities and, in turn, aggregates the neighbor representations back into the central target entity. This facilitates information exchange between entities through common neighbors and provides additional global knowledge context for poorly represented low-frequency entities (see the aggregation sketch at the end of this abstract).
3. Enhancing Knowledge Understanding Based on Dual Mapping Pre-training. Current work on knowledge-enhanced pre-trained language models ignores whether models truly understand the injected knowledge. This thesis proposes a dual mapping pre-training task to improve knowledge understanding. By training the model to convert text to entities and entities back to text, the model can map a relevant entity mention into the corresponding entity in the knowledge embedding space and, after comprehensive reasoning, convert it back into a natural-language text representation. This helps the model better understand and exploit the injected knowledge and substantially improves performance on related tasks (see the dual-mapping sketch at the end of this abstract).

The main contributions of this thesis are verified by extensive experiments and follow-up analyses, which demonstrate the soundness and reliability of the proposed model and mechanisms. These contributions further advance the development of related research.
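To make the location-specific gating of contribution 1 concrete, the following is a minimal PyTorch sketch of how a pooled knowledge-graph representation could be gated into the token states of a language model. The class name GatedKnowledgeFusion and the dimensions hidden_dim and graph_dim are assumptions made for this illustration; it is not the thesis implementation.

# Illustrative sketch (not the thesis code): fusing a pooled knowledge-graph
# vector into token hidden states with a position-specific gate.
import torch
import torch.nn as nn


class GatedKnowledgeFusion(nn.Module):
    def __init__(self, hidden_dim: int, graph_dim: int):
        super().__init__()
        self.proj = nn.Linear(graph_dim, hidden_dim)        # map graph vector into the LM space
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)   # gate computed per token position

    def forward(self, hidden: torch.Tensor, graph_repr: torch.Tensor) -> torch.Tensor:
        # hidden:     (batch, seq_len, hidden_dim) token states from the language model
        # graph_repr: (batch, graph_dim) pooled representation of the fused knowledge graph
        k = self.proj(graph_repr).unsqueeze(1).expand_as(hidden)
        g = torch.sigmoid(self.gate(torch.cat([hidden, k], dim=-1)))
        # Positions with low gate values keep their original state, limiting knowledge noise.
        return g * k + (1.0 - g) * hidden


fusion = GatedKnowledgeFusion(hidden_dim=768, graph_dim=200)
tokens = torch.randn(2, 16, 768)
graph = torch.randn(2, 200)
print(fusion(tokens, graph).shape)  # torch.Size([2, 16, 768])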
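For contribution 2, the sketch below shows one way a knowledge context could aggregate neighbor-entity embeddings into the central entity representation with attention. The class name KnowledgeContextAggregator and all shapes are assumptions for illustration and are not the exact entity-neighbor hybrid attention of the thesis.

# Illustrative sketch (assumed names and shapes): attention-based aggregation of
# neighbor-entity embeddings into a central target-entity representation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class KnowledgeContextAggregator(nn.Module):
    def __init__(self, entity_dim: int):
        super().__init__()
        self.query = nn.Linear(entity_dim, entity_dim)
        self.key = nn.Linear(entity_dim, entity_dim)
        self.value = nn.Linear(entity_dim, entity_dim)

    def forward(self, center: torch.Tensor, neighbors: torch.Tensor,
                mask: torch.Tensor) -> torch.Tensor:
        # center:    (batch, entity_dim) current representation of the target entity
        # neighbors: (batch, num_neighbors, entity_dim) embeddings of neighboring entities
        # mask:      (batch, num_neighbors) 1 for real neighbors, 0 for padding
        q = self.query(center).unsqueeze(1)                 # (batch, 1, entity_dim)
        scores = (q * self.key(neighbors)).sum(-1)          # (batch, num_neighbors)
        scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)
        context = (weights.unsqueeze(-1) * self.value(neighbors)).sum(1)
        # Residual update: low-frequency entities gain signal from their knowledge context.
        return center + context


agg = KnowledgeContextAggregator(entity_dim=200)
center = torch.randn(4, 200)
neighbors = torch.randn(4, 8, 200)
mask = torch.ones(4, 8)
print(agg(center, neighbors, mask).shape)  # torch.Size([4, 200])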
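For contribution 3, the following sketch illustrates a dual mapping objective that projects a mention representation into the knowledge-embedding space (text to entity) and projects the gold entity embedding back into the text space (entity to text). The class name DualMappingHead and the mean-squared-error losses are assumptions for this example, not the thesis's exact pre-training task.

# Illustrative sketch (assumed setup): a bidirectional text <-> entity mapping loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualMappingHead(nn.Module):
    def __init__(self, text_dim: int, entity_dim: int):
        super().__init__()
        self.text_to_entity = nn.Linear(text_dim, entity_dim)
        self.entity_to_text = nn.Linear(entity_dim, text_dim)

    def forward(self, mention_repr: torch.Tensor, entity_emb: torch.Tensor) -> torch.Tensor:
        # mention_repr: (batch, text_dim) pooled representation of the entity mention
        # entity_emb:   (batch, entity_dim) gold entity embedding from the knowledge graph
        pred_entity = self.text_to_entity(mention_repr)
        pred_text = self.entity_to_text(entity_emb)
        # Training both directions encourages the model to move between the two spaces.
        return F.mse_loss(pred_entity, entity_emb) + F.mse_loss(pred_text, mention_repr)


head = DualMappingHead(text_dim=768, entity_dim=200)
loss = head(torch.randn(8, 768), torch.randn(8, 200))
print(loss.item())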
Keywords/Search Tags: Natural Language Processing, Pre-training, Language Modeling, Knowledge Enhancement