Research On Aided Diagnosis Method Of Lung Cancer Based On Medical Image And Genetic Data

Posted on: 2021-10-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Y Dong
GTID: 1484306542973569
Subject: Computer application technology
Abstract/Summary:
Lung cancer has the highest morbidity and mortality among malignant tumors. Intelligent diagnosis of lung cancer through the analysis of image and gene data is an essential means of improving patients' survival rate. In medical imaging, early lung cancer mainly manifests as various types of lung nodules. Among them, difficult lung nodules suspected of being cancerous are hard to diagnose accurately, and ground glass nodules are representative of this group. The thesis analyzes PET/CT images to study auxiliary diagnosis methods for ground glass nodules. Subtype classification and staging of lung cancer are of great significance to personalized treatment. Based on an in-depth analysis of genetic data, new machine learning algorithms are proposed for the subtype classification and staging of lung cancer. In addition, because the relation between images and genes remains relatively unexplored in current lung cancer research, we carry out a correlation analysis of lung cancer CT images and key pathogenic genes and explore the potential of using CT images to predict mutations in those genes. Such techniques can achieve non-invasive prediction of gene mutations. In the thesis, a series of algorithms and models are proposed through the analysis of lung cancer image and genetic data and of the existing key technologies for auxiliary diagnosis. The main research contributions are as follows:

(1) Ground glass nodules (GGNs) are difficult to segment accurately because of their complex morphological characteristics. Moreover, GGNs are associated with a higher probability of malignancy. We propose an improved supervoxel-based 3D region growing algorithm on PET/CT for segmenting GGNs. First, the seed points in the CT image are located automatically according to the PET image. Then we construct a 3D mask as the constraint for region growing and a fuzzy connectivity map as a measure of connectivity between supervoxels. Finally, 3D region growing is performed on the fuzzy connectivity map, with the supervoxel as the basic unit, to complete the nodule segmentation. No seed points or thresholds need to be set manually during the entire region growing process, which avoids the instability in segmentation results caused by their selection. Experimental results show that the proposed method obtains more accurate GGN segmentation results.

(2) Lung cancer gene data have small sample sizes, high dimensionality, and unbalanced categories, so accurately classifying lung cancer subtypes with traditional machine learning methods remains challenging. We propose a multi-level weighted deep forest model (MLW-gcForest) based on methylation data to classify lung adenocarcinoma subtypes. The proposed MLW-gcForest model improves the standard gcForest model in two main aspects: (i) different weights are assigned to the random forests according to their classification ability, so as to exploit the differences and synergy among the forests; (ii) a sorting optimization algorithm assigns different weights to the feature vectors generated under different sliding windows, so as to exploit the complementarity of those feature vectors. This multi-level weighting strategy helps the random forests extract richer multi-level features, thereby improving the standard gcForest model's ability to classify small-sample, high-dimensional genetic data. The experimental results show that the proposed MLW-gcForest algorithm performs well in the classification of lung adenocarcinoma subtypes.

(3) Because of the complex pathogenesis of lung adenocarcinoma, it is difficult to obtain satisfactory staging results using only a single type of genetic data. We propose IMLW-gcForest, a lung adenocarcinoma staging model based on multi-omics genetic data (gene expression, methylation, and copy number variation). For the collected lung adenocarcinoma samples in three stages, we modify the way the MLW-gcForest model assigns weights to the random forests so that the weights are determined by the hypervolume under the manifold. Then we use the multi-omics genetic data as input to train three IMLW-gcForest models separately, making full use of the complementarity between the multi-omics data. Finally, the three trained IMLW-gcForest models are fused at the decision level to achieve accurate staging of lung adenocarcinoma. Experimental results show that the IMLW-gcForest model based on multi-omics genetic data significantly improves the accuracy of lung adenocarcinoma staging.

(4) Targeted therapy is one of the main treatments for advanced lung cancer. Because gene mutation detection is invasive, time-consuming, costly, and difficult to repeat for continuous monitoring, we propose a multi-channel and multi-task deep learning model (MMDL) that uses CT images of non-small cell lung cancer (NSCLC) to predict mutations of the key pathogenic genes EGFR and KRAS. (i) Each 3D lung nodule is decomposed into nine views, and the complementarity of the views is used to characterize the nodule comprehensively. (ii) A pre-trained Inception-attention-resnet model is constructed for each view to learn the features of the nodules. (iii) Multi-task learning is used to predict EGFR and KRAS mutations simultaneously, so that the prediction tasks promote each other. In addition, the patients' medical record information is embedded into the model, adding prior knowledge about the mutations to improve overall prediction performance. Finally, an adaptive weighting scheme performs decision-level fusion of the models to obtain the final prediction result. The experimental results demonstrate that the proposed MMDL model outperforms the latest methods in predicting EGFR and KRAS mutations in NSCLC.

In summary, we focus on four problems in the auxiliary diagnosis of lung cancer and construct related algorithms and models. The research work has theoretical significance and prospects for clinical application.
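
To illustrate the forest-weighting idea in contribution (2), the following is a minimal sketch and not the thesis's actual scheme. It assumes that each forest's "classification ability" is measured by a validation accuracy and that weights are simply those accuracies normalized to sum to one; the abstract does not give the formula, and the function names here are hypothetical.

```python
from typing import List, Sequence


def accuracy_weights(accuracies: Sequence[float]) -> List[float]:
    """Turn per-forest validation accuracies into weights summing to 1.

    Assumption: weight is proportional to accuracy. The thesis abstract
    does not specify the actual weighting rule.
    """
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one accuracy must be positive")
    return [a / total for a in accuracies]


def weighted_class_vector(
    prob_vectors: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-forest class-probability vectors by a weighted sum."""
    n_classes = len(prob_vectors[0])
    combined = [0.0] * n_classes
    for probs, w in zip(prob_vectors, weights):
        for k, p in enumerate(probs):
            combined[k] += w * p
    return combined


def predict(
    prob_vectors: Sequence[Sequence[float]], accuracies: Sequence[float]
) -> int:
    """Return the index of the class with the largest weighted probability."""
    combined = weighted_class_vector(prob_vectors, accuracies_to_weights(accuracies))
    return max(range(len(combined)), key=combined.__getitem__)


def accuracies_to_weights(accuracies: Sequence[float]) -> List[float]:
    """Alias kept separate so the weighting rule can be swapped out."""
    return accuracy_weights(accuracies)


# Three hypothetical forests voting on one sample with two classes.
forest_probs = [[0.1, 0.9], [0.7, 0.3], [0.75, 0.25]]
forest_accuracies = [0.9, 0.6, 0.5]
label = predict(forest_probs, forest_accuracies)
```

With these made-up numbers, a plain average of the three vectors would favour class 0, while the accuracy-weighted combination favours class 1 because the most accurate forest is given more influence. The same pattern could be reused for the decision-level fusion steps described in contributions (3) and (4), again only as an assumption about their form.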
Keywords/Search Tags: Machine learning, 3D Region Growing, Subtype Classification, MLW-gcForest, Deep Learning, Gene Mutation Prediction