
Research On Traditional Chinese Medicine Texts Relation Extraction Based On Weakly Supervised Deep Learning

Posted on: 2021-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Liu
GTID: 2404330614455383
Subject: Computer technology
Abstract/Summary:
The field of traditional Chinese medicine (TCM) has accumulated a large body of ancient literature containing a wealth of TCM knowledge. To obtain this knowledge automatically, information must first be extracted from TCM texts, and relation extraction is one of the basic tasks of information extraction. Supervised relation extraction methods require large datasets with known labels. Weakly supervised methods instead generate a labeled corpus automatically from a given set of triples and unlabeled TCM text, which substantially reduces the cost of manual annotation. Under weak supervision, however, some sentences in the TCM datasets are labeled incorrectly; these noisy sentences degrade relation extraction performance. This thesis addresses these problems through the following work.

First, to address the mislabeling in weakly supervised TCM data that degrades bag-level relation extraction, a weakly supervised deep learning model based on a dual attention mechanism is proposed. The model follows the idea of multi-instance learning and classifies relations at the level of bags. A bidirectional long short-term memory (BiLSTM) network encodes the embedded vectors of TCM texts in both directions to capture the semantic features of each sentence. A word-level attention layer reduces the weights of irrelevant TCM vocabulary, and a weakly supervised attention layer reduces the weights of noisy sentences, so that the influence of noise on relation extraction is reduced and the relation of each bag is predicted more accurately. Compared with a model that uses an average attention layer, experiments show that the proposed model extracts bag-level relation information more effectively under weak supervision and achieves better relation extraction results.

Second, to address the noisy sentences in weakly supervised TCM relation extraction that prevent a model from accurately learning the entity relations within a sentence, a relation extraction model based on deep reinforcement learning is designed. The model classifies the relation of each TCM sentence and consists of a sentence selector and a relation classifier. The sentence selector chooses TCM sentences with high confidence and adds them to the selected set, and the relation classifier predicts the relation label of each sentence. After a number of pre-training rounds, the two components are trained jointly so that each improves the other. Experimental results show that the weakly supervised relation extraction model combined with deep reinforcement learning achieves better sentence-level relation extraction results, and that the sentence selector effectively selects high-quality sentences and handles noisy data.

Figure 29; Table 8; Reference 54...
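As a rough illustration of the weak-supervision labeling step mentioned in the abstract, the sketch below groups sentences into bags by entity pair and assigns each bag the relation of a matching triple. It is not the thesis's implementation: the triples, the English stand-in sentences, the function name `build_bags`, and the simple substring matching are all assumptions made for this example. It shows why noise arises: a sentence joins a bag whenever it mentions both entities, even if it does not express the relation.

```python
from collections import defaultdict

# Hypothetical knowledge-base triples: (head entity, tail entity, relation).
TRIPLES = [
    ("ginseng", "qi deficiency", "treats"),
]

# Hypothetical unlabeled sentences (English stand-ins for TCM text).
SENTENCES = [
    "ginseng is prescribed for qi deficiency",
    "qi deficiency and ginseng are both discussed in this chapter",
    "licorice harmonizes other herbs",
]


def build_bags(triples, sentences):
    """Group sentences into bags keyed by (head, tail) and label each bag.

    A sentence is added to a bag whenever it mentions both entities,
    whether or not it actually expresses the relation. Sentences that
    match but do not express the relation are the "noisy" sentences.
    """
    bags = defaultdict(lambda: {"relation": None, "sentences": []})
    for head, tail, relation in triples:
        for sentence in sentences:
            # Naive substring matching; a real system would use entity linking.
            if head in sentence and tail in sentence:
                bag = bags[(head, tail)]
                bag["relation"] = relation
                bag["sentences"].append(sentence)
    return dict(bags)


bags = build_bags(TRIPLES, SENTENCES)
for pair, bag in bags.items():
    print(pair, bag["relation"], len(bag["sentences"]))
```

In this toy input, the second sentence mentions both entities without stating that one treats the other, yet it receives the same bag label. Sentence-level attention or a sentence selector, as described in the abstract, is meant to down-weight or filter out sentences of this kind.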
Keywords/Search Tags: TCM text, weakly supervised learning, relation extraction, deep learning