
Association Between High Risk HPV Infection Status And Risk Factors/biomarkers Of Cervical Cancer And The Screening Performance Based On Machine Learning

Posted on: 2020-01-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z N Wu
GTID: 1364330578483801
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objectives: This study aimed to analyze differences in the risk factors and biomarkers of cervical cancer between single and multiple high-risk HPV infections, and to explore the association between HPV type and cervical cancer. These indicators were then used with machine learning methods to construct comprehensive cervical cancer screening models, and the screening performance of the models was evaluated.

Materials and Methods: We conducted a multicenter, cross-sectional study in 7 hospitals from April 2014 to August 2017. It recruited women who either participated in a routine cervical cancer screening program or were diagnosed with precancer or cancer at a gynecologic clinic. One risk-factor questionnaire and two cervical exfoliated cell samples were collected from each woman. One sample was kept on a Dacron swab for HPV16/18 E6 protein detection; the other was kept in ThinPrep solution for HPV DNA testing, p16/Ki-67 dual staining and liquid-based cytology. Women with a positive result on any of the four screening tests were recalled for colposcopy and biopsy. For women attending the gynecologic clinic, cervical samples were collected and tested with the four screening tests before treatment. Residual samples were stored in an ultra-low-temperature freezer and tested between July 2016 and September 2017 for high-risk HPV DNA genotyping, high-risk HPV E6/E7 mRNA and HPV16/18/45 E6/E7 mRNA.

The HPV DNA genotyping test targeted 14 high-risk HPV types. HPV16, 18, 31, 45, 51 and 52 were reported individually, and the remaining 8 types were reported in 3 pooled groups (HPV33/58, HPV59/56/66 and HPV35/39/68). Based on the genotyping results, HPV DNA-positive samples were divided into 3 groups:
1. Identified Single Infection: samples positive in a single channel among HPV16, 18, 45, 31, 51 and 52.
2. Possible Single Infection: samples positive in a single channel among HPV33/58, HPV59/56/66 and HPV35/39/68.
3. Multiple Infections: samples positive in two or more channels.
Using the risk-factor questionnaire and the test results, we analyzed the expression of each biomarker across HPV infection statuses and its cervical cancer screening performance. Comprehensive cervical cancer screening models were constructed using logistic regression, random forest and support vector machine methods.

Results:
1. The proportions of Identified Single Infection, Possible Single Infection and Multiple Infections were similar in normal/CIN1 women: 33.7%, 35.4% and 30.9%, respectively. In CIN2/3 patients, the proportions of Identified Single Infection (46.8%) and Multiple Infections (37.2%) increased, and the proportion of Possible Single Infection decreased (16.1%). In SCC/ADC patients, the proportion of Identified Single Infection increased significantly (70.3%), Possible Single Infection decreased significantly (5.4%), and Multiple Infections decreased slightly (24.3%). The overall expression rate of HPV E6/E7 mRNA was higher in multiple infections than in single infection (Identified Single Infection vs. Possible Single Infection vs. Multiple Infections: 87.0% vs. 76.2% vs. 92.5%, χ²=37.865, P=0.001). The positivity rates of nuclear p16 single staining, cytoplasmic p16 single staining, nuclear Ki-67 single staining, any p16 protein staining, any Ki-67 protein staining and p16/Ki-67 dual staining were all slightly higher in Multiple Infections than in Identified and Possible Single Infection. The positivity rate of p16/Ki-67 dual staining differed most significantly among infection statuses (Identified Single Infection vs. Possible Single Infection vs. Multiple Infections: 61.2% vs. 38.7% vs. 54.4%, χ²=56.669, P<0.001).
2. The positivity rates of HPV16/18/45 E6/E7 mRNA and HPV16/18 E6 oncoprotein varied across HPV16 and HPV18 multiple-infection clusters. For example, the cluster most likely to express HPV16 E6/E7 mRNA and E6 oncoprotein was HPV16/52 (mRNA vs. E6 oncoprotein: 87.0% vs. 71.0%), and the least likely was HPV16/45 (mRNA vs. E6 oncoprotein: 66.7% vs. 14.3%). The positivity rates of E6/E7 mRNA were higher in multiple infections than in single infection in all grades of lesions. The positivity rates of E6 oncoprotein in single infection were comparable to those in multiple infections in normal and precancerous lesions, whereas in cancer patients the positivity rate of E6 oncoprotein was higher in single infection than in multiple infections. In women coinfected with HPV16 and HPV18, the positivity rate of HPV16 E6/E7 mRNA (59.4%) was similar to that of HPV18/45 E6/E7 mRNA (62.5%), and the combined positivity rate (100.0%) was close to that of HPV16 (98.7%) and HPV18 (100.0%) single infection. The positivity rate of HPV16 E6 oncoprotein (46.4%) was similar to that of HPV18 E6 oncoprotein (39.3%), and the combined positivity rate (78.6%) was also close to that of HPV16 (76.4%) and HPV18 (74.4%) single infection.
3. High-risk HPV DNA, E6/E7 mRNA, and p16/Ki-67 protein single or dual staining showed high screening sensitivity and low specificity, while HPV16/18 DNA, HPV16/18/45 E6/E7 mRNA and HPV16/18 E6 oncoprotein showed high screening specificity and low sensitivity. The clinical performance of high-risk HPV E6/E7 mRNA was the best in this study population (sensitivity: 94.1%, specificity: 76.4%), followed by p16/Ki-67 dual staining (sensitivity: 85.1%, specificity: 79.3%) and high-risk HPV DNA (sensitivity: 91.9%, specificity: 66.8%). p16 single staining (sensitivity: 92.2%, specificity: 12.7%) or Ki-67 single staining (sensitivity: 94.7%, specificity: 10.6%) alone was not effective in this population.
4. Screening models based on 4 variables (high-risk HPV E6/E7 mRNA, HPV16/18 E6 oncoprotein, p16/Ki-67 dual staining, and age) achieved both high sensitivity and high specificity in this population with logistic regression (sensitivity: 90.0%, specificity: 93.2%), random forest (sensitivity: 91.5%, specificity: 89.9%) and support vector machine (sensitivity: 92.0%, specificity: 90.9%). Their screening performance was better than that of any of the above biomarkers used alone.

Conclusions:
1. The risk factors and biomarkers of cervical cancer differ between single and multiple HPV infections. In multiple infections, cervical cancer may be caused by the genotype that expresses E6 protein.
2. Each cervical cancer biomarker has its own screening performance characteristics, and it is difficult for a single biomarker to achieve high sensitivity and high specificity simultaneously. The cervical cancer screening model based on high-risk HPV E6/E7 mRNA, HPV16/18 E6 oncoprotein, p16/Ki-67 dual staining, and age yielded both high sensitivity and high specificity.
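
The three-way grouping of HPV DNA-positive samples described in Materials and Methods can be written as a small rule. The following is a minimal sketch, not the author's code: the channel names come from the abstract, while the function name, the string labels and the handling of negative or unrecognized input are assumptions made for illustration.

```python
# Channel layout as described in the abstract: six single-type channels and
# three pooled channels that cover the remaining eight high-risk types.
SINGLE_TYPE_CHANNELS = {"HPV16", "HPV18", "HPV31", "HPV45", "HPV51", "HPV52"}
POOLED_CHANNELS = {"HPV33/58", "HPV59/56/66", "HPV35/39/68"}


def classify_infection(positive_channels):
    """Map the set of positive genotyping channels to an infection-status group.

    Hypothetical helper: the name and return labels are illustrative only.
    """
    positives = set(positive_channels)
    unknown = positives - SINGLE_TYPE_CHANNELS - POOLED_CHANNELS
    if unknown:
        raise ValueError(f"unrecognized channel(s): {sorted(unknown)}")
    if not positives:
        # Assumption: samples with no positive channel fall outside the 3 groups.
        return "HPV DNA negative"
    if len(positives) >= 2:
        return "Multiple Infections"
    (channel,) = positives  # exactly one positive channel remains
    if channel in SINGLE_TYPE_CHANNELS:
        return "Identified Single Infection"
    # A pooled channel cannot tell one type from several, hence "possible".
    return "Possible Single Infection"
```

For example, `classify_infection({"HPV16", "HPV52"})` falls in Multiple Infections, while `classify_infection({"HPV33/58"})` is only a Possible Single Infection because that channel pools two types.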
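
The sensitivity and specificity figures reported in the Results follow the standard definitions from a 2x2 table of test result against disease status. The sketch below only illustrates that arithmetic: the counts are hypothetical and are not taken from the study, and the disease threshold used by the author is not stated in the abstract.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of diseased women the test flags.
    specificity = TN / (TN + FP): share of non-diseased women the test clears.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, chosen only to show the calculation.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
# prints: sensitivity=90.0%, specificity=80.0%
```

A test that flags almost every sample scores high on the first ratio and low on the second, which is the pattern the abstract reports for p16 or Ki-67 single staining.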
Keywords/Search Tags:Cervical Cancer, Human Papillomavirus, Risk Factor, Screening, Machine Learning