Enhanced Drug-drug Interaction Extraction Model From Biomedical Text Using BioBERT

Posted on: 2022-06-17 | Degree: Master | Type: Thesis
Institution: University | Candidate: Aya Mohamed Abdelaty Elkased | Full Text: PDF
GTID: 2504306572965349 | Subject: Computer Science and Technology
Abstract/Summary:
Many people, especially the elderly, now take two or more medications simultaneously because of the growing prevalence of both common and uncommon diseases. Taken together, these medications can cause major health problems and even deadly side effects. Such an effect is known as a drug-drug interaction (DDI) and may lead to an adverse drug reaction (ADR). DDI detection and extraction is therefore an extremely important field of research for patients, medical staff, and health care management. For this reason, specialists began collecting drug information into datasets to facilitate the extraction of drug mentions, other chemical substances, effects, and relations. Natural language processing (NLP) techniques have been applied to improve DDI extraction from medical text using text mining tools. Because early rule-based, machine learning (ML), and standard deep learning (DL) methods performed poorly, researchers turned to the promising pre-trained models BERT and BioBERT to overcome these limitations. We propose an efficient DDI extraction model based on the BioBERT pre-trained model. In addition, we provide two different approaches to extracting drug names from medical text: one uses BERT, while the other uses the Python library "spaCy".

The first part of the thesis focuses on the first task of DDI extraction, Drug Named Entity Recognition (DNER). For this task, we first propose a DNER method using spaCy. This model achieved acceptable results of 78.69% recall and a 79.81% F-score. We then provide an approach based on the BERT pre-trained model. Evaluation showed that this model achieved relatively high recall, accuracy, and F-score: the BERT-based approach reached 87.3% recall and an 87.9% F-score, outperforming other approaches. A comparison of the two DNER models showed that the BERT-based model outperforms the spaCy-based model in terms of accuracy, recall, and F-score.

In the second part of the thesis, we propose an improved DDI extraction model based on BioBERT. The proposed model consists of two parts. The first part performs text-based relation classification using the BioBERT pre-trained model, while the second part addresses the chemical structure representation of the drugs using variational autoencoders. In the empirical analysis of the BioBERT-based DDI model, we investigate how much performance can be gained by adding the chemical structure information. The experiments demonstrated that the BioBERT DDI model based on "biobert-v.1 pub Med" provides the best performance in text-based relation classification mode. Finally, the proposed DDI model achieved an accuracy of 90.92% and an F-score of 82.23%, surpassing the CNN-based model by 12.48%, the STM-based model by 12.84%, and the RNN-based model by 8.73% in terms of F-score.
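The DNER results above are reported as recall and F-score only. As a minimal sketch, assuming the reported F-score is the standard F1 measure (the harmonic mean of precision and recall, which the abstract does not state explicitly), the implied precision can be recovered from the two reported numbers. The function names below are illustrative and not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def precision_from_recall_and_f1(recall: float, f1: float) -> float:
    """Invert F1 = 2PR / (P + R) to solve for P, giving P = F1 * R / (2R - F1)."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("No valid precision exists for this recall/F1 pair")
    return f1 * recall / denominator


# Recall and F-score values as reported in the abstract (as fractions).
# Treating the F-score as F1 is an assumption.
reported = {
    "spaCy DNER": (0.7869, 0.7981),
    "BERT DNER": (0.873, 0.879),
}

for model_name, (recall, f1) in reported.items():
    precision = precision_from_recall_and_f1(recall, f1)
    print(f"{model_name}: implied precision = {precision:.2%}")
```

Under that assumption, the implied precision is about 81.0% for the spaCy model and about 88.5% for the BERT-based model, so in both cases precision is slightly higher than recall.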
Keywords/Search Tags:DNER, DDI Extraction, BERT, BioBERT, Relation Extraction