Research On Structure Variable Selection In Support Vector Machine

Posted on: 2018-03-04 | Degree: Master | Type: Thesis
Country: China | Candidate: X H Cheng | Full Text: PDF
GTID: 2310330515959996 | Subject: Statistics
Abstract/Summary:
When applying the support vector machine (SVM) to high-dimensional data classification, a penalty is often added to the SVM to remove irrelevant predictors. Lasso and other variable selection methods have been successfully applied to the SVM and perform variable selection automatically. In many practical problems, however, a simple linear additive model cannot capture the relationship between the predictors and the response variable, and interactions between variables can increase the predictive power of the model. For example, in disease diagnosis, two symptoms occurring together help doctors reach a clearer judgment; in the search for the cause of a disease, gene-gene and gene-environment interactions are particularly important. When interactions are present, a natural hierarchical structure exists among the variables. Lasso and similar methods, however, do not respect the heredity principle, so the resulting model is difficult to interpret. Therefore, in this thesis we impose sparsity and structural constraints on the SVM simultaneously, so that the hyperplane obeys strong heredity: an interaction term can be included in the model only if both of its corresponding main terms are also included. First, we rewrite each interaction coefficient as a product that includes the coefficients of the two main terms, so that the model itself enforces the heredity constraint. Second, a Lasso penalty is added while minimizing the squared hinge loss. To further improve prediction accuracy and variable selection performance, adaptive weights are introduced so that different coefficients are penalized to different degrees. The two tuning parameters give the model flexibility in selecting variables: even when the coefficients of the main terms are nonzero, the model can still shrink the coefficient of their interaction to zero. Finally, we evaluate the model on simulated data under several settings: whether the predictors are correlated, whether interactions exist, whether the interaction effects are strong, and whether the model satisfies the strong hierarchy constraint. Compared with the L1-SVM, the proposed model follows the heredity principle. It not only improves prediction accuracy but also selects the relevant variables more accurately and eliminates redundant variables efficiently, and it shows certain advantages on real data as well.
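The reparameterization described in the abstract can be illustrated with a small NumPy sketch. It is not the thesis's implementation. Several details are assumptions made for illustration: the names `beta` (main coefficients), `gamma` (interaction scaling factors), `lam_beta` and `lam_gamma` (taken here to be the two tuning parameters), the weight vectors `w_beta` and `w_gamma`, and the choice to average the squared hinge loss over samples. The interaction coefficient for a pair (j, k) is written as gamma_jk * beta_j * beta_k, so it is zero whenever either main coefficient is zero (strong heredity), while a zero gamma_jk can remove an interaction even when both main coefficients are nonzero.

```python
from itertools import combinations

import numpy as np


def interaction_coefs(beta, gamma):
    """Effective interaction coefficients gamma_jk * beta_j * beta_k for j < k.

    `gamma` holds one entry per pair, in the order given by
    itertools.combinations(range(p), 2). This layout is an assumption.
    """
    pairs = list(combinations(range(len(beta)), 2))
    alpha = np.array([gamma[i] * beta[j] * beta[k]
                      for i, (j, k) in enumerate(pairs)])
    return pairs, alpha


def decision_function(X, b0, beta, gamma):
    """Intercept + main effects + pairwise interaction effects."""
    pairs, alpha = interaction_coefs(beta, gamma)
    f = b0 + X @ beta
    for (j, k), a in zip(pairs, alpha):
        f = f + a * X[:, j] * X[:, k]
    return f


def objective(X, y, b0, beta, gamma, lam_beta, lam_gamma,
              w_beta=None, w_gamma=None):
    """Mean squared hinge loss plus (adaptively weighted) L1 penalties.

    Labels y are assumed to be in {-1, +1}. Unit weights give a plain
    Lasso penalty; other weights penalize coefficients to different degrees.
    """
    w_beta = np.ones_like(beta) if w_beta is None else w_beta
    w_gamma = np.ones_like(gamma) if w_gamma is None else w_gamma
    margins = 1.0 - y * decision_function(X, b0, beta, gamma)
    loss = np.mean(np.maximum(0.0, margins) ** 2)
    penalty = (lam_beta * np.sum(w_beta * np.abs(beta))
               + lam_gamma * np.sum(w_gamma * np.abs(gamma)))
    return loss + penalty


# Toy example: the second main coefficient is zero, so every interaction
# involving that variable vanishes no matter what gamma says.
beta = np.array([1.0, 0.0, 2.0])
gamma = np.array([0.5, -1.0, 3.0])   # pairs (0,1), (0,2), (1,2)
pairs, alpha = interaction_coefs(beta, gamma)

X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, -1.0])
value = objective(X, y, 0.0, beta, gamma, lam_beta=0.1, lam_gamma=0.2)
```

In the toy example only the pair (0, 2) keeps a nonzero interaction coefficient, because it is the only pair whose two main coefficients are both nonzero. Setting its `gamma` entry to zero would drop that interaction as well, which is the flexibility the second tuning parameter is meant to provide.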
Keywords/Search Tags:Support Vector Machine, Variable Selection, Strong Heredity Constraint