Research And Development Of A Machine Learning Based Model For Non-invasive Prenatal Screening Of Maternal Malignancy

Posted on: 2021-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Yuan
Full Text: PDF
GTID: 2404330611465918
Subject: Biological engineering
Abstract/Summary:
Although non-invasive prenatal testing (NIPT) based on high-throughput sequencing is a mature technology, studies of large clinical datasets show that false-negative and false-positive results still occur. The accuracy of NIPT is about 99.99%, yet the positive predictive value (PPV) for T13, T18 and T21 is only 12%~62%, 47%~85% and 65%~94%, respectively. A sample judged positive by NIPT must be followed up with invasive prenatal diagnosis, which carries a 1% risk of miscarriage. In analyzing the causes, researchers found that maternal malignancy is one cause of false positives. In previous studies, NIPT-based screening for maternal malignancy worked as follows: when NIPT indicated multiple chromosomal aneuploidies but the prenatal diagnosis result disagreed, the discrepancy was explained by confirming a tumor through tumor markers, imaging or pathological sections. However, most of these studies were case studies and did not develop a systematic method for maternal malignancy screening.

Based on 600,864 clinical NIPT samples with returned results from 2015 to 2018, this study develops such a method. After comparing the stability of T values and of autosomal fetal fractions, the fetal fractions of the autosomes were selected as features. To reduce redundancy and weight differences between features, the input data are preprocessed with principal component analysis (PCA) whitening and zero-phase component analysis (ZCA) whitening. To reduce computation time and improve efficiency, the StandardScaler and PCA functions in scikit-learn are also used to preprocess the features. Building on these techniques and the support vector machine (SVM), we develop a method for maternal malignancy screening. The research results are as follows:

1. First, unsupervised anomaly detection is adopted as the detection approach, and One-Class SVM is chosen as the screening method after comparing the applicability of four anomaly detection algorithms: Robust covariance, One-Class SVM, Isolation Forest and Local Outlier Factor. Second, the parameters are trained with cross-validation. The 602,186 returned non-tumor samples, comprising 600,235 negative samples and 1,951 samples with high fetal fraction on multiple chromosomes, are randomly assigned to a training set (481,749 samples), a validation set (60,218 samples) and a test set (60,219 samples) in an 8:1:1 ratio, and the 159 returned tumor samples are randomly assigned to the validation set (79 samples) and the test set (80 samples). The One-Class SVM model is trained on the training set, and the parameter of the Gaussian kernel function and the fault-tolerance rate nu are tuned using a Gaussian function and grid search. The optimal values of the kernel parameter and nu are 0.007335354540793596 and 0.0012244251272095876, respectively.

2. The performance of the model is evaluated on the test set by accuracy, area under the curve (AUC), sensitivity and specificity. The test set includes 60,142 clinical samples with returned results, which contribute 60,299 effective data: 24 positive samples with 80 effective data and 60,118 negative samples with 60,219 effective data. The test results are analyzed at three levels: by effective data, by sample, and by sample combined with tumor markers.
1) By effective data, the sensitivity is 83.750%, PPV is 80.723%, specificity is 99.973% and accuracy is 99.952%, which shows that One-Class SVM can be used for maternal malignancy screening.
2) By sample, the sensitivity is 79.167%, PPV is 61.290%, specificity is 99.980% and accuracy is 99.972%, which shows that One-Class SVM has high sensitivity and specificity.
3) By sample combined with tumor markers, the sensitivity is 60.870%, PPV is 100.000%, specificity is 100.000% and accuracy is 99.985%, which shows that One-Class SVM combined with tumor markers performs better.

In summary, the study developed an SVM-based method for maternal malignancy screening. Samples flagged as positive by the method are further tested with tumor markers, which improves the PPV of the method. The screening result can guide clinical decision-making, and the method is compatible with NIPT without increasing the cost of experiments or testing. In short, the method developed in this study is applicable to clinical practice.
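
The reported percentages can be checked against the test-set sizes with a little arithmetic. The sketch below is a minimal illustration, not the author's code. The positive and negative totals (80 and 60,219 effective data; 24 and 60,118 samples) come from the abstract, but the true-positive and false-positive counts are not stated there: they are assumptions, back-calculated as the only integer counts that reproduce the reported sensitivity and PPV. The class name `Confusion` and helper `pct` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary screening test."""
    tp: int  # tumor cases flagged as positive
    fn: int  # tumor cases missed
    fp: int  # non-tumor cases flagged as positive
    tn: int  # non-tumor cases correctly passed

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total


def pct(value: float) -> float:
    """Express a ratio as a percentage rounded to three decimals."""
    return round(100 * value, 3)


# Effective-data level: 80 positive and 60,219 negative data (from the abstract).
# ASSUMPTION: tp=67 and fp=16 are inferred from the reported 83.750% / 80.723%.
by_data = Confusion(tp=67, fn=80 - 67, fp=16, tn=60_219 - 16)

# Sample level: 24 positive and 60,118 negative samples (from the abstract).
# ASSUMPTION: tp=19 and fp=12 are inferred from the reported 79.167% / 61.290%.
by_sample = Confusion(tp=19, fn=24 - 19, fp=12, tn=60_118 - 12)

for name, cm in (("effective data", by_data), ("sample", by_sample)):
    print(name, pct(cm.sensitivity()), pct(cm.ppv()),
          pct(cm.specificity()), pct(cm.accuracy()))
```

Under these inferred counts, the specificity and accuracy that fall out of the formulas agree with the reported figures at both levels, which suggests the four metrics are mutually consistent. The large gap between sample-level PPV and specificity reflects how rare tumor cases are in the test set: a dozen false positives among roughly 60,000 negatives is enough to pull PPV down to about 61%.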
Keywords/Search Tags:Non-invasive prenatal genetic testing, Maternal malignancy, Machine learning