
Biomedical literature mining for pharmacokinetics numerical parameter collection

Posted on: 2014-11-15
Degree: Ph.D
Type: Thesis
University: Indiana University
Candidate: Wang, Zhiping
Full Text: PDF
GTID: 2458390005494365
Subject: Computer Science
Abstract/Summary:
Model-based drug studies have advanced rapidly in recent years. They require high-quality numerical pharmacokinetics (PK) parameter data. However, most parameter measurements remain buried in the scientific literature, and traditional manual data extraction is too expensive to keep pace with the exponentially growing number of publications. This thesis focuses on the application of text mining (TM) and machine learning (ML) to the collection of drug PK parameter data from the published literature. First, we explore the feasibility of TM for extracting drug PK parameter data from PubMed abstracts. Our method achieves higher precision and yields rich information content. For the test drug midazolam, it extracts 10 times more PK clearance data than the manually constructed commercial Drug Interaction Database (DiDB). Similar performance is obtained on additional test drugs. Following the success of TM on abstracts, we extended the methodology to full-text articles and developed a literature mining pipeline for PK parameter data extraction. It is the first working approach to extract numerical data from full-text articles and can process both plain text and tabular data. The specific contributions of this thesis are: 1) a new PK ontology for entity template construction; 2) a comparison of NLP and machine learning algorithms for PK information retrieval; 3) tabular data extraction; 4) PK information extraction from full-text literature; 5) a multivariate nonlinear mixed model for PK parameter transformation.
Keywords/Search Tags:Parameter, Data, Literature, Numerical, Pharmacokinetics, Full text, Drug, Mining