
Multi-classification Discriminant Analysis Of Serum Vis-NIR Spectroscopy For Breast Cancer Screening

Posted on: 2023-11-03  Degree: Master  Type: Thesis
Country: China  Candidate: J Q Li  Full Text: PDF
GTID: 2544307046993139  Subject: Optical engineering
Abstract/Summary:
Breast cancer is a serious, highly prevalent disease with a high recurrence rate after treatment. In clinical practice it is usually necessary to distinguish three groups: breast cancer, benign breast disease, and normal control. Biopsy histopathology and other adjuvant non-invasive examination methods are variously invasive, radioactive, time-consuming, or expensive, which makes them unsuitable for frequent screening of large populations. In recent years, researchers have explored the feasibility of applying serum spectra to binary discriminant analysis of breast cancer versus normal control. Three-class discriminant analysis of breast cancer, benign breast disease, and normal control is a more challenging problem that requires further in-depth research, and it is of great significance for the early diagnosis of breast cancer and for clear classification of the screening population. So far, no relevant research has been reported.

In this thesis, visible and near-infrared (Vis-NIR) spectroscopy combined with chemometric methods was used to establish serum-based models for pairwise discrimination and three-class discrimination of breast cancer, benign breast disease, and normal control. The feasibility of breast cancer screening by reagent-free serum spectroscopy was explored, and spectral analysis modeling strategies were studied. A rigorous calibration-prediction-validation experimental system was established, and evaluation index systems for the two-class and three-class discriminant models were proposed. An integrated optimization method was proposed based on standard normal variate (SNV) correction, equidistant combination-partial least squares discriminant analysis (EC-PLS-DA), and wavelength step-by-step phase-out-partial least squares discriminant analysis (WSP-PLS-DA). Two-class discriminant models were established for breast cancer-normal control, benign breast disease-normal control, and breast cancer-benign breast disease. Furthermore, a three-class discrimination strategy based on comprehensive voting over the pairwise discriminations was proposed, and a three-class discriminant model of breast cancer-benign breast disease-normal control based on serum spectra was established. For comparison, three-class discriminant models were also established using k-nearest neighbor (kNN), k-correlation coefficient (kCC), and Bayes classification combined with equidistant combination (EC) wavelength screening, denoted EC-kNN, EC-kCC, and EC-Bayes, respectively.

The modeling and validation results were as follows:

(1) Pairwise discriminant analysis models: Using the EC-WSP-PLS-DA method, the optimal numbers of wavelengths (N) for the binary discriminant models of breast cancer-normal control, benign breast disease-normal control, and breast cancer-benign breast disease were 19, 20, and 58, and the total modeling accuracies (RAR_Total) were 98.0%, 97.7%, and 89.3%, respectively. A model fusion method based on three-model joint voting was also established to improve the breast cancer-benign breast disease model, whose initial modeling performance was poor. In independent validation, the overall validation accuracies (RAR_V) of the three binary discriminant models were 84.5%, 93.0%, and 76.0%, respectively.

(2) Three-class discriminant analysis model: For the PLS-DA model, a three-class discriminant strategy based on comprehensive voting over the pairwise discriminations was proposed, and a three-class discriminant model of breast cancer-benign breast disease-normal control was established. Its RAR_Total and RAR_V were 91.5% and 74.7%, respectively. The RAR_Total and RAR_V were 68.7% and 53.3% for EC-kNN, 73.0% and 58.7% for EC-kCC, and 57.3% and 53.7% for EC-Bayes. The PLS-DA-based three-class model performed significantly better than the other three-class discriminant models.

The results showed that, among the three binary discriminant analyses, the models for breast cancer-normal control and benign breast disease-normal control performed better (RAR_V: 84.5% and 93.0%), which indicates the feasibility of breast cancer screening based on serum Vis-NIR spectra. The accuracy of the three-class discriminant analysis needs further improvement. This study provides a new approach to breast cancer screening, and the wavelength models established can provide a valuable reference for the design of dedicated spectrometers.
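As an illustration only, two of the building blocks named in the abstract can be sketched in a few lines of Python: SNV correction of a single spectrum, and a three-class decision made by majority vote over the three pairwise models. The function names, the class labels, and the tie rule (returning "undetermined" when each class receives one vote) are assumptions made for this sketch; the abstract does not say how the thesis resolves ties or how its voting is implemented.

```python
from collections import Counter
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation.
    A constant spectrum (zero standard deviation) would raise ZeroDivisionError."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]


def vote_three_class(pairwise_predictions, tie_label="undetermined"):
    """Combine the outputs of three pairwise classifiers by majority vote.

    pairwise_predictions maps each class pair to the label that the
    corresponding binary model predicted for one sample.
    tie_label is a hypothetical placeholder used when every class gets one vote.
    """
    votes = Counter(pairwise_predictions.values())
    ranked = votes.most_common()
    top_label, top_count = ranked[0]
    # A tie can only happen when all three classes receive exactly one vote.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return tie_label
    return top_label


# Example: two of the three pairwise models favour "cancer".
sample_votes = {
    ("cancer", "normal"): "cancer",
    ("benign", "normal"): "benign",
    ("cancer", "benign"): "cancer",
}
predicted = vote_three_class(sample_votes)
```

In practice the pairwise predictions would come from fitted binary models (PLS-DA in the thesis) applied to SNV-corrected spectra at the selected wavelengths; a score-based tie-break could replace the placeholder label.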
Keywords/Search Tags:Serum breast cancer screening, Visible-near-infrared Spectroscopy, Multiclass discriminant analysis, Partial least squares discriminant analysis, Equidistant combination, Wavelength step-by-step phase-out