Entity relation extraction is an important task in natural language processing. Distant supervision obtains training data by heuristically matching the relational triples of an existing knowledge base against a natural language text corpus, which removes the dependence of supervised learning on manually labeled data. However, this matching produces a large number of noisy relation labels. At present, the Piecewise Convolutional Neural Network (PCNN) and the Transformer are often used to extract sentence semantic features in distant supervision relation extraction. Compared with the Transformer structure, which contains redundancy, the convolutional neural network structure has fewer parameters, which speeds up training and lowers training cost. Because later computation operates on the feature values produced by the convolution kernels rather than on the original values, the amount of computation is greatly reduced. However, the convolutional structure still has deficiencies in capturing global features and in compressing features during pooling, which limit the semantic learning ability of the model. This paper therefore studies these problems in distantly supervised relation extraction. The main work focuses on the following two aspects:

(1) An entity relation extraction method based on the self-attention mechanism is proposed. The convolutional feature extractor used in traditional research attends only to the local features inside the convolution kernel window and lacks the ability to capture global features; in addition, distant supervision data contain a large number of noisy relation labels. The self-attention mechanism extends attention to the whole sentence and makes full use of the global information in each sentence of the corpus. Because self-attention captures the global relations between words without using relation label information, this paper uses it to learn weights over the corpus sentences. Using multi-instance learning and an intra-bag attention mechanism, the sentences that contain the same entity pair are grouped into a bag, and attention is computed among the sentences in the bag so that sentences with noisy relation labels receive lower weights. By learning sentence embeddings and bag embeddings that better express semantic relations, relations are classified at the bag level. Experiments show that the precision of the proposed method is 6.41% higher than that of the current model, so the method effectively improves entity relation extraction.

(2) An entity relation extraction method based on feature fusion is proposed. During the pooling step of the convolutional feature extractor, feature compression easily loses corpus information. Moreover, the relation classifier is trained on the noisy relation labels produced by distant supervision, which makes it difficult to learn the semantic features of sentences properly. Therefore, in addition to using self-attention to learn the embeddings of corpus sentences, this paper uses an external knowledge base to supplement the head and tail entity features of those sentences for feature fusion, so that the latent semantics carried by the head and tail entities are used effectively. Using the relation features of the knowledge base, a distance measure is defined to help measure the training error and to reduce the impact of training the classifier directly on noisy relation labels. Experiments show that, compared with the original model, the precision of the proposed model is improved by 1.93%.

After the self-attention mechanism is used to improve feature extraction, introducing knowledge base entity and relation features addresses the problems in current research more effectively. Further refining the model with feature fusion improves both the accuracy of entity relation extraction and the overall performance of the model.
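
The two attention steps described in aspect (1) can be illustrated with a minimal NumPy sketch. It is not the model from this paper: the function names (`self_attention`, `bag_attention`), the 8-dimensional random embeddings, the mean pooling, and the `relation_query` vector are all assumptions made for illustration, and the learned projections and the PCNN encoder are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def self_attention(tokens):
    """Scaled dot-product self-attention over one sentence.

    tokens: array of shape (num_tokens, dim).
    Simplification (assumption): queries, keys, and values are the raw
    token embeddings, with no learned projection matrices.
    No relation label is needed for this computation.
    """
    dim = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(dim)   # (num_tokens, num_tokens)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ tokens                     # (num_tokens, dim)


def bag_attention(sentence_vecs, relation_query):
    """Intra-bag attention: weight each sentence in a bag by its match
    with a relation query vector, then sum into one bag embedding.

    sentence_vecs: (num_sentences, dim); relation_query: (dim,).
    """
    scores = sentence_vecs @ relation_query     # (num_sentences,)
    alpha = softmax(scores)                     # sentence weights, sum to 1
    return alpha, alpha @ sentence_vecs         # weights and (dim,) bag vector


# Hypothetical bag: three sentences mentioning the same entity pair,
# with 5, 7, and 4 tokens, each token an 8-dimensional random embedding.
rng = np.random.default_rng(0)
bag = [rng.normal(size=(n, 8)) for n in (5, 7, 4)]

# Sentence embedding = mean-pooled self-attention output (an assumption;
# the paper's encoder is not reproduced here).
sentence_vecs = np.stack([self_attention(s).mean(axis=0) for s in bag])

relation_query = rng.normal(size=8)             # stand-in for a learned vector
alpha, bag_vec = bag_attention(sentence_vecs, relation_query)
```

In a trained model, a sentence whose content does not express the bag's relation would be expected to receive a small weight in `alpha`, which is how noisy labels are down-weighted before bag-level classification; with the random vectors used here the weights carry no such meaning.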