Establishment Of Risk Factors And Risk Assessment And Prediction Model For Female Breast Cancer In Eastern Henan Province Based On Artificial Neural Network Model

Posted on: 2023-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Li
GTID: 2544306791450834
Subject: Social Medicine and Health Management
Abstract/Summary:
Background: Worldwide, breast cancer is the most common malignancy in women, with more than 26 million cases expected by 2030. The prevalence and mortality of breast cancer are expected to continue to rise over the next 10 to 20 years. This places great pressure on medical resources and imposes a heavy economic burden of disease on society as a whole. Breast cancer is preventable and treatable. Risk assessment and prediction models for breast cancer incidence already exist in China and abroad, but populations in different regions differ in lifestyle, dietary habits, personal circumstances and other respects. In addition, the existing models rely on different methodologies, so no single unified model can be used to predict the risk of breast cancer incidence. A risk assessment and prediction model suited to this region should therefore be established.

Objectives: Building on an epidemiological analysis of social, psychological and environmental risk factors, this study used a case-control design to screen potential risk factors affecting the incidence of breast cancer and to establish the index system for a female breast cancer risk assessment and prediction model in eastern Henan province. An artificial neural network (ANN) and traditional logistic regression were then used to build prediction models, which were compared and validated. Finally, through comprehensive analysis and evaluation, the model with the better predictive performance was selected, in order to support primary prevention of breast cancer among women in this area by identifying high-risk groups and predicting disease trends, thereby helping to reduce incidence and maintain health.

Methods: 1. Using a case-control design, data were collected from the breast departments of two Grade III, Class A hospitals in eastern Henan province from October 2020 to April 2021 (used for screening influencing factors) and from May to July 2021 (used for model validation). Female patients with
primary breast cancer confirmed by clinicopathological diagnosis were selected as cases. Women with no history or symptoms of breast cancer or other malignant tumors were recruited as age-matched controls from the physical examination department, gastroenterology department and other wards of the same hospital. A unified questionnaire was administered through face-to-face interviews, covering demographic characteristics, marital and reproductive history, disease history, psychological and emotional state, diet and lifestyle.
2. Using the data collected in the first stage, univariate analysis of the independent variables that might affect the incidence of breast cancer was performed in IBM SPSS Statistics 25.0. The statistically significant variables were then entered into a multivariate logistic regression (backward elimination based on the likelihood ratio (LR), inclusion criterion α = 0.05, exclusion criterion α = 0.10) to identify risk factors for breast cancer in women. The index system for the model was then established by combining these results with literature research, expert advice and clinical significance.
3. The data collected in the two stages were combined and randomly divided into a modeling population and a validation population in a ratio of 3:1. An artificial neural network and logistic regression were first used to build risk assessment and prediction models for breast cancer. ROC curves were then drawn for each model, the two models were evaluated in terms of discrimination and calibration, and their predictive performance was compared.

Results: 1. Data collection: In the first stage, 151 cases and 302 controls were collected; in the second stage, 85 cases and 216 controls were collected. A total of 754 samples were collected across the two stages. All samples were randomly divided into the
modeling population (565 subjects) and the validation population (189 subjects) in a ratio of 3:1.
2. Establishment of the prediction model index system: Based on the statistical results, literature review, expert advice and clinical significance, 10 indexes were selected as the input variables of the model: age at first delivery (X1), number of abortions (X2), irritable temper (X3), depression (X4), staying up late (X5), indoor smoking (X6), benign breast diseases (X7), occupational exposure to harmful substances (X8), intake of seafood (X9) and intake of coarse grains (X10).
3. Model building and validation: In the modeling population, the ANN model had an AUC of 0.9365, with a sensitivity of 96.21%, a specificity of 85.79% and a Youden index of 0.82 at the optimal cut-off point; the H-L goodness-of-fit test for the ANN model gave χ² = 4.9139, P = 0.7667. The logistic regression model was P = 1/(1 + exp[-0.2524 + 0.823X1 + 0.3X2 + 0.3X5 + 0.146X4 + 0.177X7 + 0.129X6 + 0.108X3 - 0.009X8 - 0.337X10 - 0.337X9]), with an AUC of 0.9101 and a sensitivity of 89.64% at the optimal cut-off point; the H-L goodness-of-fit test for the logistic regression model gave χ² = 4.6837, P > 0.1. The predictive performance of the ANN was significantly better than that of the logistic regression model. In the validation population, the AUC of the ANN model was 0.8356, with a sensitivity of 79.17%, a specificity of 82.05% and a Youden index of 0.6121 at the optimal cut-off point; the AUC of the logistic regression model was 0.8757, with a sensitivity of 76.39%, a specificity of 86.32% and a Youden index of 0.6271 at the optimal cut-off point. The H-L goodness-of-fit test gave χ² = 0.1174, P > 0.1 for the ANN model and χ² = 8.4764, P > 0.1 for the logistic regression model. In the validation population, the sensitivity of the ANN was higher than that of the logistic regression model.

Conclusions: 1. Age at first delivery, total number of abortions, history of benign breast disease, depression, bad temper, passive smoking (indoor smoking), staying up late, occupational exposure to toxic and harmful
substances, intake of coarse grains, and intake of seafood were suggested to be risk factors for breast cancer in women in eastern Henan province.
2. Compared with the traditional logistic regression method, the breast cancer incidence risk prediction model for women in eastern Henan province established with an artificial neural network (ANN) is more reliable and stable.
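
The Results quote an index "at the best cut point" for each model. This is presumably the Youden index, defined as J = sensitivity + specificity - 1, with the best cut point being the threshold that maximizes J. The sketch below is only an illustration under that assumption, not the thesis's own procedure (the abstract mentions SPSS but gives no code). The names `youden_index` and `best_cut_point` and the toy data are hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's J = sensitivity + specificity - 1 (inputs as fractions in [0, 1])."""
    return sensitivity + specificity - 1.0


def best_cut_point(y_true, scores):
    """Return (threshold, sensitivity, specificity, J) for the threshold maximizing J.

    y_true: 1 for cases, 0 for controls; scores: predicted probabilities.
    A subject is classified as positive when score >= threshold.
    On ties, the lowest such threshold is kept.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = youden_index(sens, spec)
        if best is None or j > best[3]:
            best = (t, sens, spec, j)
    return best


# Validation-population sensitivity and specificity quoted in the abstract.
ann_j = youden_index(0.7917, 0.8205)
lr_j = youden_index(0.7639, 0.8632)
print(f"ANN J = {ann_j:.4f}, logistic regression J = {lr_j:.4f}")

# Hypothetical toy data (NOT from the thesis), only to show the threshold search.
toy_labels = [0, 0, 0, 1, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
print(best_cut_point(toy_labels, toy_scores))
```

Recomputing J from the rounded percentages gives 0.6122 for the ANN and 0.6271 for logistic regression, which matches the reported 0.6121 and 0.6271 up to rounding of the published sensitivity and specificity.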
Keywords: breast cancer, risk factors, risk assessment and prediction, artificial neural network model