| In the field of Traditional Chinese Medicine (TCM), the task of relation extraction is to extract the semantic relationships between entity pairs from TCM texts and to express them in a form that is easy for people to understand. Deep learning is widely used for relation extraction because it can extract features automatically. However, such methods often ignore that the granularity of the language input has an important impact on the model, especially for relation extraction from TCM texts, whose wording is rigorous. A single-granularity model cannot make full use of the semantic information in the text because of segmentation errors and sentence ambiguity. To address this problem, a multi-granularity information method is proposed: information at different granularities in a sentence is fed into the deep learning model for training and relation prediction, so that the semantic features of the text are mined in depth. This gives the model more knowledge guidance and makes it more robust. The main research contents are as follows: 1) Multi-granularity data processing. The information characteristics of character granularity and word granularity in TCM text are analyzed. The original data are obtained with crawler technology and cleaned. Word segmentation accuracy is improved by constructing a TCM dictionary. The Jieba word segmentation tool is used to segment the text at multiple granularities, and Word2vec is used to vectorize the sentences. The sentence set is annotated by distant supervision, and the results are checked manually before being used as input data for the relation extraction model. 2) A TCM text relation extraction model based on multi-granularity information is constructed. Extracting the relation between two entities in the TCM field is treated as a classification problem. TCM sentences are segmented in two ways, at character granularity and at word granularity, and the resulting texts are vectorized. The model uses a neural network based on a grid structure to encode character-granularity and word-granularity information, and it learns more sentence features by fusing multi-granularity information. This lowers the probability of relation prediction errors caused by "sentence meaning error transfer" and "text semantic loss" due to entity segmentation errors. 3) The input structure of the model's coding layer is improved on the basis of the multi-granularity information extraction model, yielding an improved multi-granularity relation extraction model for TCM text. The model still follows the grid-structure idea, and its input is still character-granularity and word-granularity information. Changing the input structure of the coding layer improves the training speed of the model. An attention mechanism assigns different weights to the words in the same word set to obtain the final word vector, which changes how character-granularity and word-granularity information are fused. The method enables the model to learn more comprehensive sentence features. Its effectiveness is verified through experimental comparison. Figure 31; Table 12; Reference 56... |
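
The multi-granularity input and the attention-weighted word set described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. Everything here is an assumption made for illustration: the function names, the rule that a character's "word set" is the dictionary words ending at that character, the dot-product attention score, and the toy vectors that stand in for Word2vec embeddings. The dictionary lookup is a plain-Python stand-in for Jieba with a TCM dictionary.

```python
import math


def char_tokens(sentence):
    """Character-granularity view: one token per character."""
    return list(sentence)


def word_sets(sentence, lexicon, max_len=4):
    """Word-granularity view aligned to characters.

    Assumption: the word set of character i holds every lexicon word
    (length >= 2) that ends at position i.
    """
    sets = [[] for _ in sentence]
    for start in range(len(sentence)):
        longest_end = min(len(sentence), start + max_len)
        for end in range(start + 2, longest_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                sets[end - 1].append(word)
    return sets


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_word_set(char_vec, word_vecs):
    """Attention-weighted sum of the word vectors in one word set.

    Assumption: each word is scored by its dot product with the
    character vector; the weights are the softmax of those scores.
    """
    if not word_vecs:
        return [0.0] * len(char_vec)
    scores = [sum(c * w for c, w in zip(char_vec, vec)) for vec in word_vecs]
    weights = softmax(scores)
    return [
        sum(wt * vec[d] for wt, vec in zip(weights, word_vecs))
        for d in range(len(char_vec))
    ]


if __name__ == "__main__":
    lexicon = {"人参", "补气"}  # hypothetical TCM dictionary entries
    sentence = "人参补气"
    print(char_tokens(sentence))
    print(word_sets(sentence, lexicon))
```

With such a fused vector, each character position carries one word-level vector alongside its character vector, which is one plausible way to read the abstract's "final word vector"; the actual fusion in the thesis may differ.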