| In high-dimensional data analysis, variable selection is a critical step: omitting key variables leads to unsatisfactory results, while retaining too many irrelevant variables not only slows down the regression but also increases the model's regression error. The development of SSVS (Spike-and-Slab variable selection) theory has made it possible to solve high-dimensional problems with Bayesian methods both accurately and efficiently. Compared with commonly used variable selection methods, Bayesian variable selection imposes a larger penalty on excluded variables and a smaller penalty on selected variables. However, research on Bayesian variable selection has focused mainly on linear regression and binary logistic regression, and less on multi-category problems. In this paper, we apply SSLasso GLMs (Spike-and-Slab Lasso GLMs) theory to the MNL and MOL models, construct a Bayesian variable selection framework based on the Spike-and-Slab prior distribution, implement the SSLasso-MNL and SSLasso-MOL models using the EM algorithm and the coordinate descent method, and thereby extend Bayesian variable selection to high-dimensional unordered and ordered multi-category models. Regression results on simulated data showed that the SSLasso-MNL model outperformed Lasso in variable selection. It achieved a 10-fold cross-validation prediction accuracy of 92.8% on genetic data of cancer patients and 75.0% in an analysis of the main factors influencing occupational choice. The SSLasso-MOL model was also superior to Lasso in variable selection and was able to screen out explanatory variables that were inconsistent with the proportional odds assumption. Applied to a study of main-effect genes for Alzheimer’s disease severity, it achieved a leave-one-out prediction accuracy of 54.8%; in a subsequent life satisfaction analysis of the CFPS questionnaire, the accuracy of 
the 10-fold cross-validation prediction was 61.0%. The SSLasso-MNL model achieves high prediction accuracy and is simple to set up and easy to implement; the regression procedure of the SSLasso-MOL model is more laborious, but when it yields satisfactory results, its coefficients have stronger explanatory power and are more meaningful in empirical analysis. The paper concludes by summarizing the advantages and disadvantages of the two models and offering suggestions for subsequent research. |