
Drug Name Recognition And Drug-Drug Interaction Extraction Based On Machine Learning

Posted on: 2014-02-11; Degree: Master; Type: Thesis
Country: China; Candidate: L N He; Full Text: PDF
GTID: 2248330398450536; Subject: Computer system architecture
Abstract/Summary:
As knowledge accumulates, the pharmaceutical industry is increasingly becoming a knowledge-based discipline. In the process of developing drugs, scientists need access to relevant information and knowledge. Research on drugs is currently growing explosively, and new drugs keep emerging at a pace that overwhelms most health care professionals. Although some of this information is stored in structured form, a great deal of it is unstructured and written in natural language. The direct purpose of biomedical named entity recognition is to find names of specified types in unstructured information; it is a basic building block for natural language processing technologies such as information extraction, machine translation, information retrieval, and question answering. Drug name recognition finds drug names in unstructured information, and as a next step we can investigate whether a relation exists between two drug names. Many drugs are taken together in Chinese prescriptions. Although such combinations can treat a variety of symptoms, they can also produce side effects. If two drugs produce side effects when taken together, doctors should not prescribe them together. At present, research on drug-drug interaction extraction is attracting more and more attention.

This thesis first introduces the techniques and current status of named entity recognition, some state-of-the-art methods, and the open problems. It then proposes a method that combines supervised and semi-supervised learning for drug name recognition. Recently, more research has focused on biomedical named entity recognition, but most of it concerns protein and gene names; drug name recognition is a new research topic. A further study, extracting interactions between two drug names, is carried out after the drug names have been identified in the biomedical texts.
Next, the state of research on interaction extraction and some of its methods are introduced. This thesis proposes a method that combines three machine learning methods, namely a feature-based kernel, a graph kernel, and a tree kernel, to extract drug-drug interactions (DDIs). The method introduces domain knowledge derived from the DrugBank database. In our research, we only judge whether an interaction exists between two drugs; we do not classify the interactions.

For drug name recognition, a dictionary of drug names is first constructed from the external resources DrugBank and PubMed. Then a semi-supervised learning method, Feature Coupling Generalization (FCG), is used to filter this dictionary. Finally, the dictionary and the Conditional Random Field (CRF) method are combined to recognize specific drug names in biomedical texts. Experimental results show that our method achieves an F-score of 92.54% on the test data of the DDIExtraction 2011 task.

For drug-drug interaction extraction, a weighted multiple kernel learning-based approach is presented for automatic DDI extraction. The approach combines feature-based, graph, and tree kernels, reducing the risk of missing important features. In addition, features employing domain knowledge are introduced into the feature-based kernel, which contributes to its performance improvement. More specifically, a weighted linear combination of the individual kernels is used. The weight of each kernel is assigned according to its performance, so that each added kernel contributes incrementally to the performance improvement. The experimental results show that the combined kernel achieves an F-score of 69.18%, better than those of other systems in the DDIExtraction 2011 challenge task.
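The weighted linear combination of kernels can be illustrated with a minimal sketch. The abstract does not state how the weights are derived from performance, so normalizing per-kernel scores to sum to 1 is an assumption made here for illustration only. The function names, the toy Gram matrices, and the score values are all hypothetical and do not come from the thesis.

```python
from typing import Dict, List

Matrix = List[List[float]]


def performance_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-kernel scores into weights that sum to 1.

    Assumption: simple normalization. The thesis only says weights are
    assigned "according to performance", not how.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one kernel needs a positive score")
    return {name: score / total for name, score in scores.items()}


def combine_kernels(kernels: Dict[str, Matrix], weights: Dict[str, float]) -> Matrix:
    """Return K = sum_i w_i * K_i for square Gram matrices of equal size."""
    names = list(kernels)
    n = len(kernels[names[0]])
    combined = [[0.0] * n for _ in range(n)]
    for name in names:
        gram = kernels[name]
        if len(gram) != n or any(len(row) != n for row in gram):
            raise ValueError(f"kernel {name!r} is not a {n}x{n} matrix")
        w = weights[name]
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * gram[i][j]
    return combined


# Hypothetical 2x2 Gram matrices over two candidate drug pairs.
kernels = {
    "feature": [[1.0, 0.2], [0.2, 1.0]],
    "graph": [[1.0, 0.5], [0.5, 1.0]],
    "tree": [[1.0, 0.8], [0.8, 1.0]],
}
# Made-up development-set scores, used only to derive example weights.
dev_scores = {"feature": 0.6, "graph": 0.3, "tree": 0.1}

weights = performance_weights(dev_scores)
combined = combine_kernels(kernels, weights)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, a matrix such as `combined` could be passed to any classifier that accepts a precomputed Gram matrix.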
Keywords/Search Tags: Drug Name Recognition, Drug-Drug Interaction Extraction, Supervised Learning, Semi-supervised Learning, Feature Coupling Generalization