| According to cancer data released by the World Health Organization’s International Agency for Research on Cancer, China has become a veritable “major cancer country”. Breast cancer is the “number one enemy” of women’s health worldwide, and ovarian cancer ranks first among gynecological cancers. Studies have shown that the earlier screening distinguishes benign from malignant tumors, the greater the probability of a cure, so such screening has great practical significance and application value. This paper uses penalized logistic regression to classify breast tumors as benign or malignant, and group penalized logistic regression to do the same for ovarian tumors. For the breast tumor prediction problem, this study selected 569 breast tumor samples from the UCI repository, originating from the University of Wisconsin. Ten indicators of the cell nucleus, such as perimeter, radius, area, concave points and fractal dimension, were used as predictor variables, and tumor status (benign or malignant) as the response variable, to build four prediction models: logistic regression and LASSO, ridge and elastic net penalized logistic regression. The models were fitted on a training set of 75% of the samples and evaluated on a test set of the remaining 25%, and the prediction accuracy and performance of the four methods were compared. The penalized logistic regression models predicted better than the unpenalized logistic regression model. In particular, the LASSO penalized logistic regression model performed best, reaching an accuracy of 97.18%, a sensitivity of 94.29% and a specificity of 98.13%. Therefore, LASSO penalized logistic regression can effectively predict whether a breast tumor is benign or malignant. For the ovarian tumor prediction problem, this study selected 349 ovarian tumor 
samples from the Third Affiliated Hospital of Suzhou University. Forty-six explanatory variables, covering routine blood tests, general chemistry tests and tumor markers, were divided into 11 variable groups and used as predictor variables, with tumor status (benign or malignant) as the response variable, to fit group LASSO, group SCAD and group MCP penalized logistic regression models. The models were fitted on a training set of 70% of the data and validated on a test set of the remaining 30%. The optimal tuning parameters were selected by 10-fold cross-validation, and the group estimates and group variable selection were obtained with the group coordinate descent algorithm; the tumor marker group was found to be the key variable group for predicting ovarian tumors. The confusion matrices, overall accuracy and AUC values of the three group penalized logistic regression models were compared with those of a support vector machine and an artificial neural network. The group penalized logistic regression models had better prediction accuracy, with all AUC values exceeding 0.8. In particular, the group MCP penalized logistic regression model had the highest prediction accuracy, precision and specificity, at 93.33%, 84.62% and 95%, respectively. Therefore, the group MCP penalized logistic regression model can be effectively applied to diagnose ovarian tumors as benign or malignant. In summary, the prediction models for benign and malignant tumors based on penalized logistic regression and group penalized logistic regression recognize tumors well, offer a new approach to breast and ovarian cancer diagnosis, and have important clinical significance. |
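
The difference between the LASSO, ridge and elastic net penalties named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a plain proximal gradient solver, a mixing parameter `alpha` (1 for LASSO, 0 for ridge) and a hypothetical eight-sample toy data set, and the function names `fit_penalized_logistic` and `predict` are introduced here for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|; it sets small LASSO coefficients exactly to zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_penalized_logistic(X, y, lam, alpha, step=0.5, n_iter=2000):
    """Proximal gradient descent for
        mean log-loss + lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2).
    alpha=1 is LASSO, alpha=0 is ridge, values in between are elastic net.
    The intercept b0 is not penalized."""
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label.
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, row))) - yi
                 for row, yi in zip(X, y)]
        grad0 = sum(resid) / n
        # Gradient of the smooth part (log-loss plus the ridge term).
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n
                + lam * (1.0 - alpha) * b[j] for j in range(p)]
        b0 -= step * grad0
        # Gradient step followed by soft-thresholding for the L1 term.
        b = [soft_threshold(bj - step * gj, step * lam * alpha)
             for bj, gj in zip(b, grad)]
    return b0, b


def predict(b0, b, row):
    """Return 1 (malignant) when the fitted probability is at least 0.5."""
    return 1 if sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, row))) >= 0.5 else 0


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[-2.0, 0.3], [-1.5, -0.4], [-1.0, 0.1], [-0.5, -0.2],
     [0.5, 0.2], [1.0, -0.1], [1.5, 0.4], [2.0, -0.3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

b0_lasso, b_lasso = fit_penalized_logistic(X, y, lam=0.05, alpha=1.0)
b0_ridge, b_ridge = fit_penalized_logistic(X, y, lam=0.05, alpha=0.0)
b0_enet, b_enet = fit_penalized_logistic(X, y, lam=0.05, alpha=0.5)

accuracy = sum(predict(b0_lasso, b_lasso, r) == t for r, t in zip(X, y)) / len(y)
print("LASSO coefficients:", b_lasso)
print("ridge coefficients:", b_ridge)
print("elastic net coefficients:", b_enet)
print("LASSO training accuracy:", accuracy)
```

On this toy data the LASSO fit drops the noise feature entirely, while the ridge fit only shrinks it. That variable selection behaviour is what the group penalties extend to whole groups of predictors. In practice `lam` would be chosen by cross-validation, as the abstract describes for the ovarian models.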