
Information Resource Integration And Data Mining Of Adverse Drug Events

Posted on: 2015-02-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L W Wang
GTID: 1264330428483924
Subject: Social Medicine and Health Management
Abstract/Summary:
At present, drug-related adverse events are becoming a serious public health problem. Although rigorous experimental studies are conducted before drugs are put on the market, not all potential adverse reactions can be found. After the thalidomide tragedy in the 1960s, many countries introduced pharmacovigilance systems for monitoring marketed drugs. The Adverse Event Reporting System (AERS) of the United States Food and Drug Administration (FDA) is mainly used to find safety signals, namely new adverse drug events (ADEs) or rare serious adverse events that were not identified in clinical trials because of their low frequency. If a potential safety problem with a drug is found in the AERS, the FDA conducts an epidemiological study to further evaluate the adverse event and determine the causal relationship between the drug and the event. Based on the safety assessment of adverse drug events, the FDA may take a series of regulatory actions to improve product safety and protect public health, such as updating drug labeling information, restricting the use of the drug, communicating new safety-related information to the public, or, in a few cases, withdrawing the drug from the market.

Most research on ADE data mining has focused on small-scale data and avoided large-scale data, and has therefore been unable to mine ADEs deeply in terms of the mechanism of action, pharmacodynamics and physiological effect of drugs. However, such large-scale, deep-level data mining is of great significance for revealing the different features of ADEs among drug categories, the etiology of ADEs, and their genetic aspects, and it is an important direction for ADE monitoring and clinical drug safety studies.
The lack of knowledge integration across drug-related adverse event resources greatly limits the above-mentioned studies.

Knowledge integration of adverse drug events is not only a practical requirement for using the large amount of medical information resources efficiently, but also a key issue that must be studied and solved in order to improve the efficiency of ADE data mining. In recent years the development of drug ontologies has offered an opportunity for integrating information resources, but an ideal solution for knowledge integration and deep aggregation of adverse drug event data has not yet been achieved, owing to the complexity of drug ontologies, the lack of data normalization, and the technical difficulty of ontology mapping. As a result, data mining of adverse drug events has not expanded to the use and analysis of large-scale data.

Domain ontologies can provide knowledge for decision making and reasoning support, promoting large-scale drug safety signal detection and deep mining of ADEs. This study used biomedical ontologies to integrate AERS-related information resources, achieving knowledge integration, information aggregation, and interoperability with other medical data resources, thereby enriching the resources available for ADE mining and promoting drug safety signal detection.

The main contents of this study include:

(1) Proposing a theoretical model for mapping between multiple domain ontologies

Ontology mapping, together with drug classification and aggregation, not only provides the preconditions for drug-related knowledge-based decision making and reasoning support, but is also an important foundation for building a knowledge base in the field, and it is significant for deep mining in terms of the mechanism of action, pharmacokinetics and physiological effect of drugs. Because of the complexity within a domain ontology and the heterogeneity between domain ontologies, the mapping method is one of the main difficulties in ontology mapping.
This study proposed a theoretical model for mapping and aggregation between multiple domain ontologies. Guided by this model, an example mapping between RxNorm and NDF-RT (the National Drug File-Reference Terminology) was carried out with a new mapping approach, and the drug information in RxNorm was classified and aggregated using the classification mechanism provided by NDF-RT.

The results show that the model is not only feasible but also of practical value for fully reusing multiple ontologies; the theoretical model will also further deepen knowledge organization methods for information resources at the semantic level and promote the construction of digital resource systems. One limitation is that the empirical use of the model is based on existing ontologies, so deficiencies in their concepts and classifications may influence the results of classification and aggregation obtained through ontology mapping. In addition, other characteristics of domain ontologies may also help improve knowledge organization methods; future research should therefore study domain ontologies more comprehensively, extract more effective common features, and refine the model.

(2) Evaluation of RxNorm for Covering Drug Names in AERS

Investigating how well RxNorm covers AERS drug names is the first, and a crucial, step in exploring the role RxNorm can play in AERS data mining. Using the AERS “DRUG” data from the first quarter of 2004 through the end of 2010, we calculated the coverage of unique AERS drug names and of all drug occurrences by RxNorm and UMLS with data mining techniques. The results showed that RxNorm and UMLS covered 13,565 (4.8%) and 21,272 (7.5%) of the unique AERS drug names, respectively.
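
The two coverage figures described above (over unique drug names and over all drug occurrences) can be illustrated with a minimal sketch. It assumes that matching is done by simple case and whitespace normalization, which is a simplification and not necessarily the matching procedure used in the study. The function names and the toy data are hypothetical and are not taken from AERS, RxNorm or UMLS.

```python
from collections import Counter


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def coverage(drug_mentions, vocabulary):
    """Return (unique-name coverage, occurrence coverage) as fractions in [0, 1]."""
    vocab = {normalize(v) for v in vocabulary}
    counts = Counter(normalize(m) for m in drug_mentions)
    if not counts:
        return 0.0, 0.0
    covered_names = [n for n in counts if n in vocab]
    # Share of distinct name strings found in the vocabulary.
    unique_cov = len(covered_names) / len(counts)
    # Share of all reported mentions whose name is found in the vocabulary.
    occurrence_cov = sum(counts[n] for n in covered_names) / sum(counts.values())
    return unique_cov, occurrence_cov


# Hypothetical toy data: six reported drug mentions, two vocabulary entries.
mentions = ["Aspirin", "aspirin", "ASPIRIN", "Lipitor", "asprin", "vit c"]
vocabulary = ["aspirin", "lipitor"]
unique_cov, occurrence_cov = coverage(mentions, vocabulary)
```

The two ratios answer different questions: the first treats every distinct name string equally, while the second weights each string by how often it is reported.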
We then manually analyzed 200 AERS drug names not covered by RxNorm with a frequency of more than 1000, and 388 sampled names with a frequency of less than 1000, to investigate the reasons for non-coverage, and proposed some ways of enhancing RxNorm. Although AERS reports are contributed by different sources, including health care professionals and consumers, and their drug name entries may vary greatly and even include typos, the frequencies of high-frequency drug names can still reflect clinical usage habits in a specific domain. This study provides a foundation for improving RxNorm and for choosing the natural language processing tool MedEx, which is based on RxNorm.

(3) Building AERS-DM

On the basis of the investigation of AERS drug name normalization, the drug information in the AERS was normalized to RxNorm, a standard terminology source for medication, using MedEx, a natural language processing (NLP) medication extraction tool. Drug class information was then obtained from the National Drug File-Reference Terminology (NDF-RT) using a greedy algorithm, following the theoretical model for mapping and aggregation between multiple domain ontologies. Adverse events were aggregated by mapping to the Preferred Term (PT) and System Organ Class (SOC) codes of MedDRA. The result is an aggregated, knowledge-enhanced AERS data mining set (AERS-DM). Case studies were performed to demonstrate the usefulness of our approach.

We have built an open-source Drug-ADE knowledge resource that is normalized and aggregated using standard biomedical ontologies. The resource could provide more perspectives for mining the AERS for ADE detection and could be used by the data mining research community. It consists of two tables: one stores the normalized Drug-ADE information and the other stores the aggregated Drug-ADE information. The data in the two tables can be connected through the RxNorm codes. In total, the AERS-DM contains 37,029,228 Drug-ADE records.
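
As a rough illustration of how two tables connected through RxNorm codes could be queried, the sketch below uses an in-memory SQLite database. The table names, column names, codes, class labels and rows are all hypothetical; the actual AERS-DM schema is not given in this abstract.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: one row per normalized drug-ADE record.
CREATE TABLE drug_ade_normalized (
    report_id   INTEGER,
    rxnorm_code TEXT,
    meddra_pt   TEXT
);
-- Hypothetical schema: drug class assigned to each RxNorm code.
CREATE TABLE drug_ade_aggregated (
    rxnorm_code TEXT,
    drug_class  TEXT
);
""")

# Made-up placeholder codes and labels, not real RxNorm or NDF-RT content.
conn.executemany(
    "INSERT INTO drug_ade_normalized VALUES (?, ?, ?)",
    [(1, "RX001", "Nausea"), (2, "RX001", "Rash"),
     (3, "RX002", "Nausea"), (4, "RX003", "Nausea")],
)
conn.executemany(
    "INSERT INTO drug_ade_aggregated VALUES (?, ?)",
    [("RX001", "Class A"), ("RX002", "Class A"), ("RX003", "Class B")],
)

# Join the two tables on the RxNorm code and count records per drug class and ADE term.
query = """
SELECT a.drug_class, n.meddra_pt, COUNT(*) AS n_records
FROM drug_ade_normalized AS n
JOIN drug_ade_aggregated AS a ON a.rxnorm_code = n.rxnorm_code
GROUP BY a.drug_class, n.meddra_pt
ORDER BY a.drug_class, n.meddra_pt
"""
class_level_counts = conn.execute(query).fetchall()
```

Keeping the class assignment in a separate table means that records for individual drugs can be rolled up to the class level at query time without rewriting the normalized records.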
Seventy-one percent (10,221/14,490) of the normalized drug concepts in the AERS were classified into 9 classes in NDF-RT. After ADE aggregation, there were 4,639,613 unique pairs between RxNorm concepts and MedDRA PT codes and 205,725 unique pairs between RxNorm concepts and SOC codes.

(4) Empirical Study on Data Mining in AERS-DM

AERS-DM is a normalized and aggregated data set for data mining in AERS. Its advantage is that the drug and ADE classes used for normalization and aggregation all come from the knowledge structure asserted in biomedical ontologies. Traditionally, ADE detection studies with the AERS were carried out for only a small number of drugs, and few studies focused on large-scale mining [1]. In this study we demonstrated the semantic mining potential of AERS-DM by using the information on popular cancer drug ingredients to conduct a systematic analysis of drug clusters in terms of mechanism of action, physiologic effect, treatment intention and ADEs, as well as ADE differences by age and sex.

Traditional ADE detection relies on disproportionality measures, which attempt to quantify the degree of “unexpectedness” of a drug-ADE association and to compensate for the lack of ADE incidence information in spontaneous reporting systems such as AERS. In this study we demonstrated a novel ADE detection method in which the incidence information of ADEs is obtained by connecting AERS data with electronic health records (EHRs), enabling comparative research on ADEs across a large number of drugs. As an advanced version of AERS, AERS-DM may serve as a substantial resource for data mining, as shown in this study.

Innovations in this study include:

(1) Theoretical innovation

The theoretical model for mapping between multiple domain ontologies was proposed. Currently each domain ontology has distinct features due to limitations in ontology development.
For example, some domain ontologies provide classification and aggregation information, while others provide no such information but complement them in coverage and content. The theoretical model proposed in this study makes full use of the different features of domain ontologies to achieve classification and aggregation through ontology mapping, thus saving ontology development costs and enabling ontology reuse.

(2) Method innovation

(i) Based on the theoretical model for mapping between multiple domain ontologies, a systematic algorithm was developed that classifies and aggregates, with NDF-RT, the RxNorm codes normalized in AERS. The method is innovative in two respects: ① the rich semantic connections within RxNorm were fully used to infer the related concepts that can be used for drug classification in NDF-RT; ② both clinical drug names and generic drug names were used to find multi-axis classifications in NDF-RT, thus avoiding the classifications that would be missed by using generic drug names only. Compared with existing methods, this method is suitable for more complex situations.

(ii) Natural language processing methods and biomedical ontologies were used for large-scale data normalization and information aggregation in AERS, making massive signal detection of adverse drug events possible. On that basis, a novel ADE detection method was proposed in which the incidence information of ADEs is obtained by connecting AERS data with EHRs, enabling comparative research on ADEs across a large number of drugs.
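
For readers unfamiliar with the disproportionality measures mentioned in part (4) as the traditional approach, the sketch below computes two commonly used ones, the proportional reporting ratio (PRR) and the reporting odds ratio (ROR), from a 2x2 table of report counts. The abstract does not say which measures were used in the dissertation, so this is only an illustration; the function names and the counts are hypothetical.

```python
def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    return (a / (a + b)) / (c / (c + d))


def ror(a: int, b: int, c: int, d: int) -> float:
    """Reporting odds ratio for the same 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical counts, not taken from AERS.
a, b, c, d = 20, 80, 100, 9800
prr_value = prr(a, b, c, d)
ror_value = ror(a, b, c, d)
```

Both measures compare how often the event is reported with the drug against how often it is reported with all other drugs. Neither is an incidence rate, because spontaneous reports do not record how many patients were exposed to the drug.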
Keywords/Search Tags: Drug Ontologies, Adverse Drug Events, Data Mining, Mapping, Aggregation, Theoretical Model, Cancer Drug, Classification