Relation extraction is a core task of information extraction and a classical task in natural language processing. It aims to extract semantic relations between entity pairs from massive amounts of unstructured text. Traditional supervised relation extraction requires a large amount of manual annotation to obtain training data, which is time-consuming and resource-intensive. Distant supervision is based on the assumption that "all sentences containing the same entity pair reflect the relation of that entity pair in the knowledge base". This assumption enables automatic annotation of the corpus and saves resources, and distant supervision has become a hot research topic in relation extraction. However, the assumption forces every sentence mentioning an entity pair to take the same relation label, which inevitably introduces a large amount of noisy data. Research on distantly supervised relation extraction therefore focuses on how to reduce the impact of this noise. Although deep learning and multi-instance learning alleviate the effect of noisy data to some extent, current methods still have shortcomings: 1) The input vector does not adequately reflect the importance and semantic information of the entity words. 2) The free label attention mechanism ignores the case where an entity pair has multiple relations. 3) Effective methods for filtering noise inside the sentence are lacking. To address these problems, this paper proposes corresponding solutions; its main contributions and innovations are as follows. (1) This paper proposes an entity-enhanced distantly supervised relation extraction method that uses a gating mechanism to integrate entity information and position information into the word vector, so that the model highlights the importance of entity words and better expresses key information. Hypernym (superordinate word) information and a multi-head self-attention mechanism are also used to obtain
entity word vectors and word vectors with richer and more accurate representations. This method enriches the input representation of sentences, which in turn improves the performance of the relation extraction model. (2) This paper proposes a distantly supervised relation extraction method based on an entity-constrained attention mechanism, which uses entity constraints to reduce the influence of noise inside and outside the sentence and uses a transformation matrix to map entity information into the relation space. Entity constraints are generated according to the relative positions of the entity words, which addresses the problem of entity pairs that hold multiple relations. Further, because words in a sentence contribute differently to relation extraction, attention mechanisms are applied in the input and pooling layers when computing the sentence feature vectors, so that the model can filter noise inside the sentences. In addition, this paper optimizes the sentence-level attention mechanism in the feature selection layer to directly filter out heavily noisy data and reduce its influence on the model results. The proposed methods are validated through comprehensive experiments on public datasets. Compared with the baseline model, the proposed method achieves better performance: its AUC reaches 0.43, a 10.9% improvement over the baseline PCNN+ATT model. In addition, both the ablation experiments and the case analysis verify that the proposed method can effectively fuse entity information and improve the performance of relation extraction.
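
The distant supervision assumption described above, and the noise it introduces, can be illustrated with a minimal sketch. The knowledge base, the sentences, and the function name `distant_label` are hypothetical and chosen only for illustration; they are not the data or code used in this paper, and the substring matching stands in for a real entity linker.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KB = {("Barack Obama", "Hawaii"): "born_in"}

# Hypothetical corpus. Only the first sentence actually expresses "born_in".
SENTENCES = [
    "Barack Obama was born in Hawaii in 1961.",
    "Barack Obama visited Hawaii during the holidays.",
    "Hawaii is a group of islands in the Pacific.",
]

def distant_label(sentences, kb):
    """Group sentences into bags keyed by (head, tail, relation).

    Every sentence that mentions both entities of a knowledge-base pair
    receives that pair's relation, whether or not it expresses it.
    """
    bags = defaultdict(list)
    for sent in sentences:
        for (head, tail), rel in kb.items():
            # Naive substring match (an assumption of this sketch).
            if head in sent and tail in sent:
                bags[(head, tail, rel)].append(sent)
    return dict(bags)

bags = distant_label(SENTENCES, KB)
for key, bag in bags.items():
    print(key, len(bag))
```

In this toy bag, the second sentence is labelled `born_in` although it only describes a visit. This is the kind of wrongly labelled sentence that sentence-level attention over a bag is meant to down-weight.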