Benefiting from self-supervised training methods that require no manual labeling and from the recent availability of large-scale corpora, pre-trained language models can achieve strong results with only a small amount of task-specific fine-tuning data. Since their introduction they have been highly successful across many fields; BERT alone set new state-of-the-art results on eleven NLP benchmarks. However, researchers have gradually found that purely pre-trained models still fall short of human expectations in complex application scenarios such as commonsense reasoning, domain adaptation, and knowledge-driven tasks. How to use external knowledge to help a neural model better understand the input text is therefore a question worth exploring. The knowledge graph, as a persistently stored knowledge base, contains the rich structured knowledge that such models urgently need and can be injected into a pre-trained model as effective external knowledge.

Nevertheless, traditional knowledge-enhanced models still leave several problems unresolved. When a knowledge graph is introduced, the knowledge is often processed insufficiently, with only part of the information about each entity taken into account; during knowledge fusion, the language model and the knowledge graph represent words in two completely different vector spaces, which raises the problem of fusing heterogeneous information. In addition, BERT and RoBERTa, which commonly serve as backbone models, limit the maximum input length to 512 tokens: longer inputs are truncated and shorter ones are padded. This is unreasonable in many NLP scenarios; in long texts such as newspapers and periodicals the input easily exceeds this limit, and truncation discards a great deal of semantic information. At the same time, pre-trained language models do not fully exploit the information contained in the input text itself: the usual practice is to obtain each word's semantics through the attention mechanism, ignoring the fact that the dependency relations produced by dependency parsing can also serve as external knowledge that helps the model understand the input.

Therefore, in this paper we conduct the following experimental exploration of the problems above. 1) Building on the ERNIE pre-trained language model, we further integrate the entity descriptions in the Wiki5m knowledge graph and the KELM-corpus text generated from the full set of Wikidata triples as external knowledge to strengthen the model's ability to learn semantic representations; this both increases the amount of injected knowledge and, because the knowledge arrives in textual form, alleviates the heterogeneous-fusion problem caused by the mismatch between the semantic spaces of the knowledge graph and natural-language text. 2) Using natural language processing toolkits such as Stanford CoreNLP and LTP together with a rule-based, dependency-parsing event-element extraction method, we extract the event elements contained in the input text and inject them into the pre-trained model as additional external knowledge, alleviating the information loss caused by the maximum-input-length limit. 3) A GCN is used to aggregate the dependency syntax graph generated from the input text and analyze the dependency relations between words, and its output is then integrated with the BERT model to fully mine and exploit the knowledge carried by the input text itself, yielding semantically enhanced word embeddings. Minimal sketches of these three components are given below.
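As an illustration of the first component, the sketch below pairs an input sentence with the textual description of a linked entity, so that the external knowledge enters the encoder in the same textual space as the input rather than as a separately trained graph embedding. The entity-to-description lookup and the bert-base-uncased checkpoint are stand-in assumptions for this sketch; the paper's pipeline builds on ERNIE and draws descriptions from Wiki5m and KELM.

```python
# Minimal sketch: fuse entity description text with the input sentence
# so external knowledge stays in the same textual space as the model input.
# The lookup table below is a hypothetical stand-in for Wiki5m / KELM entries.
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

# Hypothetical entity -> description lookup (in practice, Wiki5m descriptions
# or KELM sentences verbalized from Wikidata triples).
entity_descriptions = {
    "Mark Twain": "Mark Twain was an American writer, humorist and essayist.",
}

def encode_with_description(sentence: str, entity: str) -> torch.Tensor:
    """Encode the sentence paired with the linked entity's description text."""
    description = entity_descriptions.get(entity, "")
    inputs = tokenizer(sentence, description,
                       truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)
    # Sentence-level representation enriched with textual knowledge.
    return outputs.last_hidden_state[:, 0]   # [CLS] vector

vec = encode_with_description("Mark Twain wrote The Adventures of Tom Sawyer.",
                              "Mark Twain")
print(vec.shape)   # torch.Size([1, 768])
```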
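For the second component, the following sketch shows a simplified rule-based extraction of event elements (subject, predicate, object) from a dependency parse. Stanza is used here only as a convenient stand-in for the Stanford CoreNLP/LTP toolkits named above, and the dependency-relation rules are a minimal assumption rather than the paper's full rule set.

```python
# Minimal rule-based sketch: extract (subject, predicate, object) event
# elements from a dependency parse of the input text.
import stanza

# stanza.download("en")  # run once to fetch the English models
nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse",
                      verbose=False)

def extract_events(text: str):
    """Return (subject, predicate, object) triples found in each sentence."""
    events = []
    for sent in nlp(text).sentences:
        words = sent.words
        for w in words:
            if w.upos != "VERB":
                continue
            subj = next((c.text for c in words
                         if c.head == w.id and c.deprel.startswith("nsubj")), None)
            obj = next((c.text for c in words
                        if c.head == w.id and c.deprel in ("obj", "dobj")), None)
            if subj or obj:
                events.append((subj, w.text, obj))
    return events

print(extract_events("The company acquired the startup last year."))
# e.g. [('company', 'acquired', 'startup')]
```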
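For the third component, the sketch below applies a single graph-convolution layer over a word-level dependency graph and fuses the result with the backbone's word vectors through a residual sum. Treating each word as one node (i.e., pooling BERT sub-token states per word beforehand) and the residual fusion are simplifying assumptions, not the paper's exact architecture.

```python
# Minimal GCN-over-dependency-graph sketch in PyTorch. Node features are
# assumed to be per-word vectors (e.g., pooled BERT sub-token states).
import torch
import torch.nn as nn

class DepGCNLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, heads: list) -> torch.Tensor:
        """h: (n, dim) word vectors; heads: dependency head index per word
        (-1 for the root). Returns syntax-aware word vectors of shape (n, dim)."""
        n = h.size(0)
        adj = torch.eye(n)                      # self-loops
        for i, head in enumerate(heads):
            if head >= 0:                       # undirected edge word <-> head
                adj[i, head] = adj[head, i] = 1.0
        deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
        norm_adj = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
        return torch.relu(norm_adj @ self.linear(h))

# Toy usage: 4 words, 768-dim features, heads taken from a dependency parse.
h = torch.randn(4, 768)                         # e.g., pooled BERT states per word
syntax_h = DepGCNLayer(768)(h, heads=[1, -1, 3, 1])
fused = h + syntax_h                            # simple residual fusion with BERT features
print(fused.shape)                              # torch.Size([4, 768])
```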
The experimental results show that our method improves over the benchmark models on Chinese and English datasets including FewRel, TACRED, IFLYTEK, and CoLA.