With the development of deep learning, relation extraction based on deep learning has been widely studied; it can be divided into supervised and distantly supervised relation extraction. Supervised relation extraction achieves higher precision and more reliable results, but it depends heavily on manually labeled data, which consumes considerable resources. Distantly supervised relation extraction was proposed to address this problem: distant supervision can quickly obtain a large amount of labeled data, relieving the pressure of manually constructing relation extraction datasets. However, the distant supervision assumption is too strong, which introduces incorrectly labeled samples into the training data, and the resulting noise degrades the performance of relation extraction models. To address the noise problem in distantly supervised datasets, scholars have proposed effective methods such as multi-instance learning and attention mechanisms; however, the attention mechanism only considers the importance of different sentences in multi-instance learning, without considering the different contributions of the words within each sentence. In addition, in deep learning models, the quality of the word vectors directly affects the overall performance of the model; many distantly supervised relation extraction models therefore construct novel word vectors to enhance the learning ability, and thereby the performance, of the model.

In this paper, a distantly supervised relation extraction model based on entity knowledge and a structured self-attention mechanism is proposed, in which a new word vector is constructed to provide the model with richer information, and a self-attention mechanism is combined to reduce the influence of noise. Firstly, a distantly supervised relation extraction model is proposed in which embedding vectors of
words, entities, entity descriptions, and relative positions are concatenated as the input, a piecewise convolutional neural network (PCNN) is adopted as the sentence encoder, and an attention mechanism is applied. Integrating multiple elements into the word vector at the input end provides the model with richer information and enhances its learning ability. Secondly, on the basis of this model, the sentence encoding layer is improved by introducing an improved structured self-attention mechanism between the convolutional layer and the piecewise max pooling layer. The improved structured self-attention mechanism not only considers the different contributions of the words in a sentence, assigning more appropriate weights to keywords, but also retains the position and context information of the entities, further optimizing the above relation extraction model. Finally, experiments are conducted on the NYT and GDS datasets, with the Precision-Recall curve and the precision of the top N predictions (P@N) adopted as the criteria for evaluating the relation extraction models. Compared with the baseline models, the approach proposed in this paper further improves the performance of distantly supervised relation extraction.
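The pipeline described above (concatenated input embeddings, a convolutional encoder, structured self-attention over the tokens, and piecewise max pooling around the two entities) can be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation: all dimensions, the random stand-ins for pretrained entity and description embeddings, and the entity positions `e1`, `e2` are hypothetical, and the structured self-attention follows the generic form A = softmax(W2 tanh(W1 H^T)).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# --- hypothetical dimensions (not taken from the paper) ---
T, d_w, d_p, d_c, k = 12, 50, 5, 32, 3   # seq len, word dim, position dim, filters, window

# 1) Input layer: concatenate word, entity, entity-description, and the two
#    relative-position embeddings per token (random stand-ins for pretrained vectors).
word = rng.normal(size=(T, d_w))
ent  = rng.normal(size=(T, d_w))          # entity embedding, broadcast to each token
desc = rng.normal(size=(T, d_w))          # entity-description embedding
pos1 = rng.normal(size=(T, d_p))          # relative position to the head entity
pos2 = rng.normal(size=(T, d_p))          # relative position to the tail entity
x = np.concatenate([word, ent, desc, pos1, pos2], axis=1)   # (T, 3*d_w + 2*d_p)

# 2) Convolution over a window of k tokens ("same" padding), tanh activation.
W = rng.normal(size=(d_c, k * x.shape[1])) * 0.1
pad = np.pad(x, ((k // 2, k // 2), (0, 0)))
H = np.stack([np.tanh(W @ pad[t:t + k].ravel()) for t in range(T)])  # (T, d_c)

# 3) Structured self-attention between the convolution and the pooling layer:
#    A = softmax(W2 tanh(W1 H^T)) yields r rows of weights over the T tokens.
d_a, r = 16, 4
W1 = rng.normal(size=(d_a, d_c)) * 0.1
W2 = rng.normal(size=(r, d_a)) * 0.1
A = softmax(W2 @ np.tanh(W1 @ H.T), axis=1)          # (r, T), rows sum to 1
# Reweight tokens in place so their order (and the entity positions) is preserved.
H_att = A.mean(axis=0, keepdims=True).T * H          # (T, d_c)

# 4) Piecewise max pooling: split the sequence at the two entity positions and
#    max-pool each of the three segments separately.
e1, e2 = 3, 8                                        # hypothetical entity token indices
segments = [H_att[:e1 + 1], H_att[e1 + 1:e2 + 1], H_att[e2 + 1:]]
pooled = np.concatenate([seg.max(axis=0) for seg in segments])  # (3 * d_c,)
```

Reweighting the token representations before pooling, rather than collapsing them with a weighted sum, is what lets the model keep the entities' position and context information for the piecewise pooling step.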