| With the advent of the big data era, data in many forms is growing explosively. As the main carrier of recorded information, text data contains rich knowledge resources. Relation extraction is an effective way to organize unstructured text data and has attracted extensive attention in recent years. The distant supervision hypothesis relieves the dependence of traditional supervised methods on labeled datasets, but its overly optimistic assumption ignores the diversity of human language, which leaves a large amount of noisy data in the annotation results. This paper therefore aims to limit the adverse impact of noisy data on the model and also explores the mining of more fine-grained text features. It covers three research contributions. (1) From the perspective of the attention mechanism, this paper integrates entity vectors that carry relation features, obtained by TransH training, and denoises at both the sentence-feature level and the instance level. The first attention layer improves the quality of sentence encoding by jointly selecting among the three sentence feature segments extracted by a convolutional neural network and the entity vectors trained by TransH, assigning higher weights to the features that reflect relation information. The second attention layer selects effective instances through a further attention calculation at the sentence level and reduces the weight of noisy data. The model achieves better results than the baseline models on widely used classical datasets. (2) To fully mine the fine-grained features of different dimensions of text, this paper proposes a Chinese graph convolution model based on BERT. First, the token-level pretrained model BERT is used to obtain rich contextual semantic features of sentences. With the aid of a toolkit, the syntactic structure of each sentence is analyzed, and redundant nodes are filtered out with a path-centric pruning strategy. The pruned syntactic structure and the semantic features are then fed into a graph convolutional network for feature fusion. In addition, the model again uses an attention mechanism to assign weights to the sentence instances in a bag, mitigating the effect of wrong labels caused by the distant supervision hypothesis. Several groups of comparative experiments are conducted on the self-constructed CN-DBpedia datasets, and the model shows better experimental results. (3) Based on the proposed Chinese relation extraction algorithm, this paper builds a relation extraction visualization system for educational resources. To ensure that the results are highly reliable, the system integrates a quality evaluation algorithm that assesses the extracted results. In practical use, users can obtain the relations between entities in an input sentence quickly and intuitively, and can monitor the algorithm's results in real time during quality evaluation. |
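
The sentence-level attention used in both models to down-weight noisy instances in a bag can be illustrated with a minimal sketch. It is not the paper's implementation: the dot-product scoring function, the relation query vector, and the names `bag_attention`, `sentence_vecs`, and `relation_query` are assumptions made for illustration, and the toy vectors stand in for encoder outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bag_attention(sentence_vecs, relation_query):
    """Weight each sentence in a bag by its match with a relation query.

    Assumption: the score is a plain dot product; the paper may use a
    different (e.g. bilinear) scoring function.
    """
    scores = [dot(s, relation_query) for s in sentence_vecs]
    weights = softmax(scores)
    dim = len(relation_query)
    # The bag representation is the attention-weighted sum of sentence vectors.
    bag_vec = [
        sum(w * s[i] for w, s in zip(weights, sentence_vecs))
        for i in range(dim)
    ]
    return weights, bag_vec


# Toy bag: two sentences that match the relation and one noisy sentence.
bag = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]
query = [2.0, 0.0]
weights, bag_vec = bag_attention(bag, query)
```

In this toy bag the third sentence points away from the query, so it receives the smallest weight and contributes least to the bag representation, which is the intended effect on wrongly labeled instances.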