
Text Mining-based Research On Adverse Drug Reaction

Posted on: 2021-09-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z H Li
Full Text: PDF
GTID: 1484306302961459
Subject: Computer application technology
Abstract/Summary:
Adverse drug reactions (ADRs) are harmful reactions or injuries caused by the intake of drugs, and detecting them is critical to patient safety. Because traditional post-market ADR monitoring methods suffer from under-reporting, incomplete data, and reporting delays, many potentially harmful drugs remain unflagged. With the emergence of a vast number of biomedical texts and social media posts, using text mining to obtain ADR information from natural language texts automatically and accurately will greatly promote research in the biomedical domain. In recent years, deep learning methods based on neural networks have been widely used in speech, image, and text processing and have achieved breakthroughs. This dissertation therefore focuses on deep learning-based biomedical text mining and studies three tasks: ADR detection, ADR mention recognition, and ADR relation extraction. An ADR information extraction system, ADRExtractor, is built from the methods proposed in this dissertation. Given input text, ADRExtractor first identifies the texts related to ADRs among massive unstructured texts, then recognizes ADR entities in those texts, and finally extracts the relations between drugs and their ADRs.

For the ADR detection task, an adversarial transfer learning method is proposed to address the limited size of the training set and to improve performance on the task. The method formulates ADR detection as a sentence classification task and uses transfer learning to train a shared module on a large source corpus and a small target corpus. In addition, adversarial learning is introduced during training to prevent corpus-specific features from entering the shared feature space. Experimental results show that the proposed method captures the features shared between the source and target corpora and improves performance on ADR detection.

For the ADR mention recognition task, an interaction graph network is proposed to identify mention boundaries accurately. Three word-phrase interaction graphs are designed to represent the boundary and contextual information of candidate phrases, and graph attention networks are adopted to encode the graphs. Besides the mentions found in a lexicon, the method also considers the noun phrases in a sentence, so that it can recognize more out-of-lexicon mentions. Experimental results show that the three graphs capture information independently and complement one another. Compared with other state-of-the-art methods, the proposed method achieves better performance and effectively identifies the boundaries of ADR mentions. Moreover, it recognizes out-of-lexicon mentions and achieves higher recall.

For the ADR relation extraction task, a shortest dependency path (SDP)-based neural network method is proposed to process long sentences that contain many entities. Besides the original word sequence of a sentence, the method takes the SDP and the dependency relation types of the sentence as input, so that it can accurately capture the syntactic information between the candidate entity pair. Experimental results show that introducing the SDP and dependency relation types improves the performance of sentence-level relation extraction, especially for relations with few training instances. For document-level relation extraction, a sequence labeling-based method is proposed to extract both intra- and inter-sentential relations. Unlike traditional classification-based methods, the proposed method treats document-level relation extraction as a sequence labeling task and takes a whole abstract as input, so that it can capture the interaction between different ADR mentions related to the same drug. Experimental results show that the method effectively extracts document-level relations, especially inter-sentential ones.

Finally, based on the three tasks and the proposed methods, the ADR information extraction system ADRExtractor is built. ADRExtractor recognizes the drug and disease entities in the input text, extracts the relations between drugs and ADRs, and visualizes the output.
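
To illustrate the shortest dependency path (SDP) between a candidate entity pair that the abstract describes as an input to sentence-level relation extraction, here is a minimal Python sketch. It is not the dissertation's implementation. The example sentence, the hand-written dependency triples (a real system would obtain them from a dependency parser), the relation labels, and the function name shortest_dependency_path are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical (head, dependent, relation) triples for the sentence
# "The patient developed severe nausea after taking aspirin."
# This parse is written by hand for illustration; it is an assumption,
# not the output of any particular parser.
EDGES = [
    ("developed", "patient", "nsubj"),
    ("patient", "The", "det"),
    ("developed", "nausea", "dobj"),
    ("nausea", "severe", "amod"),
    ("developed", "after", "prep"),
    ("after", "taking", "pcomp"),
    ("taking", "aspirin", "dobj"),
]


def shortest_dependency_path(edges, source, target):
    """Return the SDP as alternating tokens and relation labels, or None.

    Tokens are used directly as node keys, so this sketch assumes that
    every token in the sentence is unique.
    """
    # View the dependency tree as an undirected graph. The arrow in each
    # label records whether the step goes down to a dependent ("rel->")
    # or up to a head ("<-rel").
    graph = {}
    for head, dep, rel in edges:
        graph.setdefault(head, []).append((dep, rel + "->"))
        graph.setdefault(dep, []).append((head, "<-" + rel))

    # Breadth-first search finds a path with the fewest edges.
    queue = deque([(source, [source])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for neighbour, rel in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [rel, neighbour]))
    return None


# Path between the drug mention and the candidate ADR mention.
sdp = shortest_dependency_path(EDGES, "aspirin", "nausea")
print(sdp)
```

In this sketch the path keeps only the tokens and dependency relation types that connect the drug and the candidate ADR, so modifiers such as "severe" and "The" are left out. That is the kind of pruning that makes an SDP useful for long sentences with many entities.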
Keywords/Search Tags: Adverse Drug Reaction, Text Mining, Sentence Classification, Named Entity Recognition, Relation Extraction