
A Study of Model Selection Based on the Blocked 3×2 Cross-Validated t Test

Posted on: 2016-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Liu
GTID: 2180330482950873
Subject: Probability theory and mathematical statistics
Abstract/Summary:
In traditional model selection, the optimal model is chosen as the one that maximizes or minimizes a performance index. In practice, however, a model chosen this way often has high complexity and lacks stable generalization ability. One possible reason is that the differences between the candidates' performance indexes are not statistically significant and may arise from random error. This paper therefore assumes that the complexity of every candidate model is known and proposes a new model selection algorithm based on a blocked 3×2 cross-validated t test for the significance of the difference between performance indexes. The performance indexes of two models are compared, and when their difference is not statistically significant, the model with the smaller complexity is chosen. Simulation experiments show that, in many cases, this method has better generalization ability.

Further, because the blocked 3×2 cross-validated t test model selection method is built on the blocked 3×2 cross-validated model selection method, this paper proves that the latter is consistent in selection for classification problems. Simulation experiments also show that choosing the optimal model by the mean and by vote are equivalent.

Next, in model selection tasks for regression and classification, the blocked 3×2 cross-validated t test model selection method is compared with AIC, BIC, SRM, MDL, the bootstrap, 5-fold cross-validation, and blocked 3×2 cross-validation under the mean squared error criterion.
The experimental results show that the blocked 3×2 cross-validated t test model selection method always selects the simpler model under the minimization of squared loss (regression) and 0-1 loss (classification), and that in many cases the simpler model has a smaller mean squared error.

The innovation of this paper is that model complexity is introduced into model selection and treated as a selection index: when the performance indexes show no statistically significant difference, the model with the smaller complexity is chosen. The paper proposes a model selection algorithm based on the blocked 3×2 cross-validated t test and demonstrates its advantages through simulation experiments.
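
The selection rule described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact test statistic, its degrees of freedom, or the order in which candidates are compared, so everything below is an assumption made for illustration: the four-block construction of the three 2-fold splits, the plain one-sample t statistic over six fold-level loss differences, the critical value 2.571 (two-sided 5% level, 5 degrees of freedom), and all function names are hypothetical and are not taken from the thesis.

```python
import math
import random
import statistics

# Assumption: two-sided 5% critical value of Student's t with 5 degrees of freedom.
T_CRIT_5DF = 2.571


def blocked_3x2_splits(n, seed=0):
    """Cut n shuffled indices into four blocks and pair them three ways.

    Assumed design: each split halves the data, and any two training
    halves from different splits share exactly one block.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    q = n // 4
    blocks = [idx[0:q], idx[q:2 * q], idx[2 * q:3 * q], idx[3 * q:]]
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    return [(blocks[a] + blocks[b], blocks[c] + blocks[d])
            for (a, b), (c, d) in pairings]


def fold_losses(fit, loss, data, splits):
    """Six held-out losses: each half of each split serves once as the test set."""
    out = []
    for half_a, half_b in splits:
        for train, test in ((half_a, half_b), (half_b, half_a)):
            model = fit([data[i] for i in train])
            out.append(loss(model, [data[i] for i in test]))
    return out


def t_statistic(diffs):
    """Plain one-sample t statistic (a stand-in for the thesis's statistic)."""
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / (sd / math.sqrt(len(diffs)))


def select_model(candidates, t_crit=T_CRIT_5DF):
    """candidates: list of (name, complexity, six_fold_losses).

    The candidate with the lowest mean loss is the reference. Candidates are
    visited from least to most complex, and the first one whose losses do not
    differ significantly from the reference is returned.
    """
    best = min(candidates, key=lambda c: statistics.mean(c[2]))
    for name, _complexity, losses in sorted(candidates, key=lambda c: c[1]):
        if name == best[0]:
            return name
        diffs = [a - b for a, b in zip(losses, best[2])]
        if abs(t_statistic(diffs)) < t_crit:
            return name
    return best[0]
```

Here `fit` and `loss` are caller-supplied callables (for example, a least-squares fit and the squared loss for regression), and `select_model` takes the six fold losses of each candidate together with its known complexity. Keeping the critical value as a parameter makes it easy to substitute whichever statistic and reference distribution the full text actually specifies.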
Keywords/Search Tags: Model selection, Blocked 3×2 cross-validation, t test, Consistency in selection