
Research And Implementation Of Drug-Drug Interaction Extraction System Based On Ensemble Learning

Posted on: 2017-02-05, Degree: Master, Type: Thesis
Country: China, Candidate: K Bi
GTID: 2308330485980610, Subject: Agricultural informatization
Abstract/Summary:
In modern medical practice, several drugs are often administered together so that multiple symptoms can be treated at once. Some of these drugs interact, however, and in most cases the interactions produce side effects. Understanding drug-drug interactions (DDIs) therefore helps to avoid adverse drug reactions and treatment failure, which matters for both patient safety and the control of health care costs. A large number of known drug interactions are buried in the biomedical literature, and the explosive growth of that literature has far outpaced the ability of scientists to acquire the knowledge manually. An automated system that extracts drug-drug interactions from biomedical literature reduces the time researchers spend studying it. In this study, we apply ensemble learning to drug-drug interaction extraction. The research is divided into the following two parts.

(1) Extraction of drug-drug interactions. The datasets are those provided by the DDIExtraction 2013 challenge task. First, we preprocess the datasets, including tokenization, POS tagging and syntactic parsing. Then we divide the candidate drug pairs into five groups based on syntactic information and generate feature vectors from lexical, phrase, verb, syntactic and auxiliary features. Finally, we rank the features of the feature vectors with a ranking algorithm and train the classifiers on different numbers of the ranked features. We build classifiers with the SVM method and with a variety of ensemble learning methods, use them to classify the extracted candidate drug pairs, and compare the performance of these methods.
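The feature ranking and selection step described above can be illustrated with a small sketch. The abstract does not name the ranking criterion, so information gain over binary features is used here purely as an assumption. The function names (`rank_features`, `keep_top_k`) and the toy data are hypothetical, not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one binary feature."""
    present = [y for x, y in zip(column, labels) if x]
    absent = [y for x, y in zip(column, labels) if not x]
    n = len(labels)
    remainder = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - remainder


def rank_features(vectors, labels):
    """Return feature indices ordered from most to least informative."""
    n_features = len(vectors[0])
    gains = [information_gain([v[j] for v in vectors], labels)
             for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)


def keep_top_k(vectors, ranking, k):
    """Project every feature vector onto the k best-ranked features."""
    selected = ranking[:k]
    return [[v[j] for j in selected] for v in vectors]


# Toy data (assumed): 4 candidate pairs, 3 binary features; label 1 = interaction.
vectors = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
labels = [1, 1, 0, 0]
ranking = rank_features(vectors, labels)
reduced = keep_top_k(vectors, ranking, k=1)
```

Varying `k` yields the "different numbers of features" on which the classifiers are trained. Under this reading, choosing a separate `k` for each of the five groups of candidate pairs would correspond to the per-group setting reported in the results.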
The comparative experiments lead to the following conclusions. When the same number of features is selected for all five groups of candidate drug pairs, Random Forest achieves the best performance, with an F-score of 77.9%. When a different number of features is selected for each group of candidate drug pairs, the dynamic ensemble method achieves the best performance, with an F-score of 84%.

(2) Implementation of the drug-drug interaction extraction system. We implement the DDI extraction system with the JUNG toolkit and Java framework technology. The system has three basic functions. First, DDI retrieval: this function retrieves from the database all drugs that interact with a given drug. Second, DDI mining: this function uses the DDI extraction model to extract from a text all drug pairs whose two drugs interact. Third, visualization: this function displays the drug pairs returned by a database query as a network graph.
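The retrieval and visualization functions both rest on treating drugs as nodes and interactions as undirected edges. The following is a minimal sketch of that idea, not the thesis implementation: the system itself is built with Java and JUNG, whereas this sketch uses Python with an in-memory dictionary standing in for the database. The function names and the placeholder drug names are assumptions.

```python
from collections import defaultdict


def build_interaction_graph(pairs):
    """Build an undirected adjacency map from (drug, drug) interaction pairs."""
    graph = defaultdict(set)
    for first, second in pairs:
        graph[first].add(second)
        graph[second].add(first)
    return graph


def interacting_drugs(graph, drug):
    """Return, in sorted order, every drug recorded as interacting with `drug`."""
    return sorted(graph.get(drug, set()))


# Placeholder names only; this is not real interaction data.
pairs = [("drug_a", "drug_b"), ("drug_a", "drug_c"), ("drug_d", "drug_b")]
graph = build_interaction_graph(pairs)
partners = interacting_drugs(graph, "drug_a")

# Deduplicated edge list, the input a graph-drawing toolkit would need.
edges = sorted({tuple(sorted(p)) for p in pairs})
```

Storing each edge in both directions makes retrieval for a given drug a single lookup. The deduplicated `edges` list is the form that would be handed to a network-graph renderer.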
Keywords/Search Tags:drug-drug interaction, feature vector, ensemble learning, visualization