
Research On Robust Feature Selection For High-dimensional Data

Posted on: 2017-11-06    Degree: Master    Type: Thesis
Country: China    Candidate: G M Lan    Full Text: PDF
GTID: 2428330569499095    Subject: Systems Science
Abstract/Summary:
High dimensionality is one of the key characteristics of big data, and it can prevent data mining methods from building efficient models. Meanwhile, noise and outliers introduce irrelevant and redundant information, resulting in low efficiency. To address the challenges of high-dimensional data and mitigate the effect of noise and outliers, we conduct an in-depth study of dimensionality reduction and robust learning and then propose a new robust feature selection method. Our contributions are threefold:

(1) We analyze and review approaches to dimensionality reduction and the structure of robust learning. Among the dimensionality reduction methods used in data mining, we focus on embedded feature selection; in addition, we propose a robust structure to handle noise and outliers.

(2) We propose a new robust feature selection method based on Simultaneous Capped ℓ2-norm loss and ℓ2,p-norm regularizer Minimization (SCM). The capped ℓ2-norm loss function effectively eliminates the influence of noise and outliers in regression, while the ℓ2,p-norm regularizer selects features across the data set with joint sparsity. An efficient optimization algorithm with a convergence proof is then introduced, together with an analysis of parameter selection.

(3) We evaluate the method thoroughly on real-world data. A 2D toy example verifies that the capped ℓ2-norm is robust to noise and outliers. Extensive experiments on synthetic and real-world data sets demonstrate the effectiveness of our method in comparison with other popular feature selection methods, and the convergence behavior is reported as well. Finally, we study the parameter behavior and conclude that, although the parameter influences the method, its performance remains stable.
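For reference, a minimal sketch in LaTeX of the kind of objective the abstract describes, assuming the standard definitions of the capped ℓ2-norm loss and the ℓ2,p-norm regularizer; the symbols W, x_i, y_i, the capping threshold ε, and the trade-off parameter γ are illustrative, and the exact formulation appears only in the full text:

\min_{W}\ \sum_{i=1}^{n} \min\!\left( \left\| W^{\top} x_i - y_i \right\|_2^{2},\ \varepsilon \right) \;+\; \gamma \,\| W \|_{2,p}^{p},
\qquad
\| W \|_{2,p} = \left( \sum_{j=1}^{d} \left\| w^{j} \right\|_2^{p} \right)^{1/p},\quad 0 < p \le 1.

Under this reading, rows w^j of W with large ℓ2 norm correspond to selected features, while any sample whose residual norm exceeds ε contributes only the constant ε to the loss, so heavily corrupted samples and outliers cannot dominate the regression fit.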
Keywords/Search Tags: feature selection, robust learning, regression analysis, sparsity regularization