Research On Cost-sensitive Feature Selection Problem

Posted on: 2021-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: C J An
Full Text: PDF
GTID: 2518306020967049
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With data dimensionality increasing in many application fields, feature selection, an essential step for avoiding the curse of dimensionality and improving model generalization, is attracting growing research attention. However, most existing feature selection methods focus on the relevance of features to learning performance while neglecting the cost of obtaining them. In medical diagnosis, for example, each feature may carry a different testing cost. To select low-cost subsets of informative features and achieve a trade-off between learning performance and feature cost, we propose two novel algorithms that approach the problem from two directions. First, to solve the cost-sensitive feature selection problem on high-dimensional datasets, we propose a stratified random forest-based cost-sensitive feature selection method. Unlike the commonly used two-step cost-sensitive feature selection approaches, our model incorporates feature cost into the construction of the base decision trees, so that the cost and the performance of each feature are optimized simultaneously. Moreover, we adopt a stratified sampling method to improve the performance of the selected feature subset on high-dimensional data. A series of experimental results validates the effectiveness and stability of the proposed method. Second, we combine the feature acquisition process with model building and formalize cost-sensitive feature selection as a sequential feature acquisition problem. To overcome the drawbacks of existing methods, we propose an adaptive sequential cost-sensitive feature acquisition method. Specifically, we design a reinforcement learning (RL) agent that guides the feature acquisition process and adapts the RNN architecture to each instance. In this way, the performance of the earlier method RADIN can be further improved. The experimental results show that the proposed method achieves comparable accuracy at a lower cost than other methods.
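The idea of folding feature cost into tree construction, rather than filtering by cost in a separate step, can be illustrated with a small sketch. The abstract does not give the thesis's split criterion, so everything here is an assumption made for illustration: the function names, the trade-off weight `lam`, the toy data, and the penalized score (Gini gain minus `lam` times cost).

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gain(column, labels):
    """Largest Gini gain over all threshold splits of one feature column."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Try every observed value except the largest as a threshold.
    for t in sorted(set(column))[:-1]:
        left = [y for x, y in zip(column, labels) if x <= t]
        right = [y for x, y in zip(column, labels) if x > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def pick_feature(X, y, costs, lam=1.0):
    """Pick the split feature by a cost-penalized score.

    Assumed criterion (not taken from the thesis): gain - lam * cost.
    Returns the chosen feature index and the per-feature scores.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        scores.append(best_gain(column, y) - lam * costs[j])
    chosen = max(range(n_features), key=scores.__getitem__)
    return chosen, scores


# Toy data: feature 0 separates the classes perfectly but is expensive;
# feature 1 is less informative but cheap.
X = [[1, 1], [2, 1], [3, 1], [4, 2], [5, 1], [6, 2]]
y = [0, 0, 0, 1, 1, 1]
costs = [0.9, 0.1]

print(pick_feature(X, y, costs, lam=0.0)[0])  # cost ignored
print(pick_feature(X, y, costs, lam=0.5)[0])  # cost penalized
```

With `lam=0.0` the score reduces to plain Gini gain, so the more informative feature wins; raising `lam` shifts the choice toward the cheaper feature. A subtractive penalty is only one possible design; a ratio such as gain divided by cost would be another.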
Keywords/Search Tags: Cost Sensitive, Feature Selection, High Dimensional Data, Reinforcement Learning