CRFs-based Medical Text And Image Labeling Model Construction And Applications

Posted on: 2016-02-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Yan
GTID: 1228330467995511
Subject: Communication and Information System
Abstract/Summary:
Computer-assisted medical information processing is an interdisciplinary research field involving both computer science and medicine. High-precision text and image labeling is a technical problem that constrains the development of medical information processing. One of the key techniques for solving this problem is to exploit the spatial contextual relations and semantic information of the text and images. Conditional Random Fields (CRFs) are a probabilistic framework for labeling and classifying structured data. They have distinct advantages in expressing large-scale spatial contextual relations and semantic features of images, and in modeling the posterior probability. Working within this probabilistic framework can effectively reduce the uncertainty of text and image labeling and therefore improve precision.

Building on these advantages of CRF theory, this thesis studies conditional random fields for modeling contextual characteristics and semantic features, aiming to solve existing problems in recognizing entities in medical record text and in labeling medical images. The main research content and innovations are as follows:

1. Based on the characteristics of Chinese-language medical records, this thesis presents a named entity recognition method for Chinese-language medical records based on a cascaded conditional random field model. The first level of the cascaded CRF model identifies two types of simple named entities: basic body parts or components, and basic diseases. The recognition result is passed to the higher level of the cascade, together with a combined feature defined in this thesis. In this way, the input variables of the higher-level model include not only the observations but also the recognition result from the lower-level model, which supports the recognition of complex disease names and clinical symptoms. In closed and open tests on a real corpus, the F-score of the proposed model is 3% higher than that of a cascaded CRF model without the self-defined combined feature, and 7% higher than that of a single-level CRF model, so overall performance is clearly improved. In addition, the experiments show that entities that do not appear in the training corpus are also accurately identified in testing, which effectively addresses the nested structure and ambiguity of named entities in Chinese-language medical records.

2. To handle the complexity of osteosarcoma MRI images, this thesis proposes a method for classifying and identifying osteosarcoma MRI images based on a CRF model. The probability that a pixel in an osteosarcoma MRI image belongs to a given class depends not only on the pixel's own features but also on the distribution of information in its surrounding pixels. As a result, when modeling the features of the various target classes in an osteosarcoma MRI image (muscle, bone, fat, tumor, etc.), it is also necessary to model the mutual constraints among the classes. The Joint-boost method is used to train on annotated samples. The test results show that the Joint-boost method outperforms the other methods compared in identifying and segmenting the various target classes in osteosarcoma MRI. The improvement is most notable for tumors, which have disordered structure, variable shapes, and few samples, and which other methods segment with low accuracy. This provides highly reproducible and reliable support for clinical decisions.

3. Considering the high time complexity of graph-model-based medical image labeling, we propose a region-based conditional random field method for labeling medical images. First, the image is split into smaller, evenly distributed regions. Each region is then taken as a node, which reduces the size of the problem. Spatially neighboring nodes are connected by edges to form a graph model, on which the CRF is defined and the estimation and inference of the model parameters are carried out. The experimental results show that the region-based CRF model achieves better classification results while greatly shortening running time and improving efficiency.

The textual medical records, diagnosis reports, medical images, and other data used in this thesis were all provided by XX Tumor Hospital and The XX Hospital of XX University. Each group of data was consulted on and reviewed by medical experts before being used.
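The graph construction described in item 3 can be illustrated with a minimal sketch. It assumes the image has already been split into regions and is given as a 2-D map of region ids. The function name `region_adjacency`, the 4-connectivity rule, and the toy label map are assumptions made for illustration, not details taken from the thesis.

```python
from typing import List, Set, Tuple


def region_adjacency(labels: List[List[int]]) -> Tuple[Set[int], Set[Tuple[int, int]]]:
    """Build a region graph from a 2-D map of region ids.

    Each distinct id becomes one node. Two regions share an edge when
    any of their pixels are 4-connected neighbours (an assumed rule).
    """
    nodes: Set[int] = set()
    edges: Set[Tuple[int, int]] = set()
    height = len(labels)
    width = len(labels[0]) if height else 0
    for r in range(height):
        for c in range(width):
            a = labels[r][c]
            nodes.add(a)
            # Looking only right and down visits every neighbouring pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < height and cc < width:
                    b = labels[rr][cc]
                    if a != b:
                        # Store each undirected edge with the smaller id first.
                        edges.add((min(a, b), max(a, b)))
    return nodes, edges


# Hypothetical 4x4 over-segmentation into four regions (ids 0..3).
toy_labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
nodes, edges = region_adjacency(toy_labels)
```

In this toy case the 16 pixels collapse to 4 nodes, and the 24 pixel-level neighbour links of a 4x4 grid collapse to 4 region-level edges. A CRF defined over such a graph has far fewer variables and pairwise terms than a pixel-level one, which is the source of the shorter running time reported above.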
Keywords/Search Tags: medical named entity recognition, medical image labeling, conditional random fields, computer-assisted therapy