
ChartIndex: A contextual approach to automated standards-based encoding of clinical documents

Posted on: 2008-06-29
Degree: Ph.D
Type: Dissertation
University: Stanford University
Candidate: Huang, Yang
Full Text: PDF
GTID: 1448390005465266
Subject: Engineering
Abstract/Summary:
Introduction. Structured and encoded clinical information is critical to implementing intelligent clinical applications and to clinical research. However, large amounts of important clinical information still reside in unstructured free-text clinical documents, which are difficult to retrieve and exchange and need to be encoded into structured form. This dissertation presents new methods for improving the automated encoding of narrative clinical documents using controlled terminologies, with a focus on encoding precision, together with the corresponding findings.

Methods. I implemented a scalable mechanism to reliably convert semi-structured clinical documents into a standard document model, the HL7 Clinical Document Architecture (CDA), with document sections properly represented according to their canonical types. I also carried out a pilot study on a contextual encoding method that leveraged the information on document section types.

As an approach to improving encoding precision, I explored a new method of improving general-purpose medical text processing: parsing each sentence with a high-performance statistical natural language parser augmented with a comprehensive biomedical lexicon.

To further improve encoding precision, I devised a novel hybrid approach to detecting negations in clinical documents. This approach first classified a sentence according to a syntactical negation categorization using regular expression matching, and then located negated phrases in parse trees using a grammatical approach.

Results. The pilot study on contextual indexing showed significant improvements in indexing precision, with limited negative impact on indexing recall, for most types of radiology reports and report sections. After the general-purpose statistical parser was augmented with a standard biomedical lexicon, the F-1 measure for base noun phrases improved from 86.7% to 92.8%. The hybrid approach for negation detection achieved a sensitivity of 92.6% (95% CI 90.9-93.4%), a positive predictive value (PPV) of 98.6% (95% CI 96.9-99.4%), and a specificity of 99.8% (95% CI 99.7-99.9%).

Conclusions. The standards-based contextual indexing approach, together with the new approach to medical text processing and the new method of negation detection, has been shown to be promising for improving indexing precision. The structural information in sentence parse trees enables precise information extraction, such as detecting negated biomedical terms.
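For readers less familiar with the evaluation measures named in the Results, the following is a minimal sketch of their standard confusion-matrix definitions. It is not the dissertation's evaluation code: the function name `negation_metrics` and the counts in the usage example are hypothetical and chosen only for illustration, and the F-1 value here is computed for the same binary task, whereas the abstract reports F-1 for base noun phrase parsing.

```python
def negation_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification measures from confusion-matrix counts.

    tp: negated terms correctly flagged as negated
    fp: affirmed terms wrongly flagged as negated
    fn: negated terms that were missed
    tn: affirmed terms correctly left unflagged
    (Hypothetical helper; the names are assumptions, not from the source.)
    """
    sensitivity = tp / (tp + fn)   # also called recall
    ppv = tp / (tp + fp)           # positive predictive value, also called precision
    specificity = tn / (tn + fp)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "sensitivity": sensitivity,
        "ppv": ppv,
        "specificity": specificity,
        "f1": f1,
    }


# Illustrative counts only; these are NOT results from the dissertation.
metrics = negation_metrics(tp=90, fp=10, fn=30, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because affirmed terms usually far outnumber negated ones in clinical text, specificity tends to sit close to 100%, which is why sensitivity and PPV are the more informative measures for a negation detector.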
Keywords/Search Tags:Clinical documents, Approach, Information, Encoding, 95% CI, Contextual, Indexing, Precision