Optimized decision fusion of heterogeneous data for breast cancer diagnosis

Posted on:2008-08-17

Degree:Ph.D

Type:Dissertation

University:Duke University

Candidate:Jesneck, Jonathan Lee

GTID:1444390005969824

Subject:Engineering

Abstract/Summary:

As more diagnostic testing options become available to physicians, it becomes more difficult to combine the various types of medical information in a way that optimizes the overall diagnosis. To improve diagnostic performance, we introduce an approach that optimizes a decision-fusion technique for combining heterogeneous information, such as data from different modalities, feature categories, or institutions. This dissertation presents a computer aid known as optimized decision fusion and explores both its underlying theory and its practical application.

The purpose of this work was (1) to present optimized decision fusion, a classification algorithm designed for noisy, heterogeneous data sets with few samples, and (2) to evaluate decision fusion's classification ability on clinical, heterogeneous breast cancer data sets. This study used three clinical data sets: heterogeneous breast mass lesions, heterogeneous breast microcalcification lesions, and breast blood serum protein levels. In addition to these clinical data sets, we used various simulated data sets.

We used two variants of our decision fusion algorithm: (1) DF-A, which optimized the area under the receiver operating characteristic (ROC) curve (AUC), and (2) DF-P, which optimized the high-sensitivity partial area under the curve (pAUC). We compared decision fusion's classification performance to that of the following classifiers: linear discriminant analysis (LDA), an artificial neural network (ANN), classical regression models (linear, logistic, and probit), Bayesian model averaging of these regression models, least angle regression, and a support vector machine.

The simulation studies showed that decision fusion maintains high classification performance on data sets with many weak features and few samples, although feature correlations lowered its performance.

For the calcification data set, DF-A outperformed the other classifiers in terms of AUC (p < 0.02) and achieved AUC = 0.85 +/- 0.01. DF-P surpassed the other classifiers in terms of pAUC (p < 0.01) and reached pAUC = 0.38 +/- 0.02. For the mass data set, DF-A outperformed both the ANN and the LDA (p < 0.04) and achieved AUC = 0.94 +/- 0.01. Although there were no statistically significant differences among the classifiers' pAUC values for this data set (pAUC = 0.57 +/- 0.07 to 0.67 +/- 0.05, p > 0.10), DF-P significantly improved specificity versus the LDA at both 98% and 100% sensitivity (p < 0.04).

For the data set of blood serum proteins, there were no statistically significant differences among the classifiers for distinguishing normal tissue from malignant lesions (AUC = 0.79 to 0.84, p > 0.12), but decision fusion achieved significantly higher specificity, 60%, at 90% sensitivity (p < 0.02). For the task of distinguishing benign from malignant lesions, the other classifiers performed very poorly (AUC = 0.50 to 0.57), and decision fusion achieved the best performance at AUC = 0.64 (p < 0.05). The proteins were probably indicative of secondary effects, such as inflammatory response, rather than specific for cancer.

In conclusion, decision fusion directly optimized clinically significant performance measures such as AUC and pAUC, and sometimes outperformed other machine-learning techniques when applied to three different breast cancer data sets. By testing on a wide variety of simulated and clinical data sets, we show that decision fusion is robust to noisy data and can handle heterogeneous data structures when given relatively few observations.
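The three performance measures reported above (AUC, high-sensitivity pAUC, and specificity at a fixed sensitivity) can be illustrated with a minimal Python sketch. This is not the dissertation's code. The function names, the 0.90 sensitivity cutoff, the normalization of pAUC by (1 - cutoff), and the toy scores are all assumptions made for illustration, and the abstract does not state which pAUC definition or cutoff the author used.

```python
import numpy as np


def roc_points(scores, labels):
    """Empirical ROC operating points (FPR, TPR), one per distinct threshold.

    labels: 1 = malignant (positive), 0 = benign/normal (negative).
    Higher scores are assumed to indicate malignancy.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="mergesort")  # descending, stable
    s, y = scores[order], labels[order]
    tp = np.cumsum(y)        # true positives as the threshold is lowered
    fp = np.cumsum(1 - y)    # false positives as the threshold is lowered
    # Keep only the last index of each group of tied scores.
    last = np.r_[np.nonzero(np.diff(s))[0], len(s) - 1]
    tpr = np.r_[0.0, tp[last] / tp[-1]]
    fpr = np.r_[0.0, fp[last] / fp[-1]]
    return fpr, tpr


def auc(fpr, tpr):
    """Full area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


def partial_auc_high_sensitivity(fpr, tpr, min_tpr=0.90):
    """Area between the ROC curve and the line TPR = min_tpr.

    Assumed normalization: divide by (1 - min_tpr), so a perfect
    classifier scores 1.0.
    """
    area = 0.0
    for i in range(len(fpr) - 1):
        x0, x1, y0, y1 = fpr[i], fpr[i + 1], tpr[i], tpr[i + 1]
        if y1 <= min_tpr:
            continue  # segment lies entirely below the sensitivity cutoff
        if y0 < min_tpr:
            # Segment crosses the cutoff: start at the crossing point.
            frac = (min_tpr - y0) / (y1 - y0)
            x0, y0 = x0 + frac * (x1 - x0), min_tpr
        area += (x1 - x0) * ((y0 + y1) / 2.0 - min_tpr)
    return area / (1.0 - min_tpr)


def specificity_at_sensitivity(fpr, tpr, target_tpr):
    """Specificity at the first operating point reaching target_tpr."""
    idx = int(np.searchsorted(tpr, target_tpr, side="left"))
    return float(1.0 - fpr[idx])


# Toy classifier outputs, NOT data from the dissertation.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_fpr, toy_tpr = roc_points(toy_scores, toy_labels)
print("AUC:", auc(toy_fpr, toy_tpr))
print("pAUC (TPR >= 0.90):", partial_auc_high_sensitivity(toy_fpr, toy_tpr))
print("Specificity at 100% sensitivity:",
      specificity_at_sensitivity(toy_fpr, toy_tpr, 1.0))
```

The pAUC restricts attention to the high-sensitivity region of the ROC curve, which is the clinically relevant region when missed cancers are costly. That is why two classifiers with similar AUC can still differ in pAUC or in specificity at a fixed high sensitivity.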

Keywords/Search Tags:

Decision fusion, Data, Breast cancer, AUC

Related items

1	Research And Application Of Medical Image Assisted Decision Based On Information Fusion
2	Research On Sleep Staging Technology Based On Multi-sensor Data Fusion
3	Research On Fatigue Decision Based On Multi-dimension Data Fusion In Human Computer Interaction
4	Research On Breast Cancer Survival Prediction Based On Deep Learning And Omics Data Fusion
5	Research On Breast Cancer Subtypes Clustering Model Based On Multi-omics Data Fusion
6	Research On Multi-source Information Joint Decision Algorithm In Assistant Diagnosis Of Depression
7	Analysis And Design Of Breast Cancer Classification System With Multimodal Features
8	Surgery Decision Support Intervention For Early Breast Cancer Patients
9	Applicability Study Of Artificial Intelligence Clinical Decision Support System In Postoperative Adjuvant Treatment Of Stage ?-? Breast Cancer
10	Application Of Decision Tree Model In Predicting 5-year Survival Of Breast Cancer