Research On Feature Selection Method Based On Random Forest

Posted on: 2022-11-01; Degree: Master; Type: Thesis
Country: China; Candidate: W K Li; Full Text: PDF
GTID: 2518306743474064; Subject: Computer technology
Abstract/Summary:
As science and technology develop, the scale of data has increased exponentially, and irrelevant information degrades model accuracy and raises training cost. Feature selection, a classic dimensionality reduction technique, reduces the computational complexity of model training. Recursive feature elimination (RFE) is a widely used feature selection method: it iteratively deletes uninformative features from the full feature set until it arrives at an optimal feature subset. However, because only one feature is deleted in each iteration, RFE cannot perform feature selection quickly and efficiently on large-scale data. To address this problem, this thesis proposes two feature selection methods. The first, a flexible recursive feature elimination method based on random forest, removes multiple irrelevant features in each iteration by analyzing the structure of the current feature subset and the performance of the classifier. The second, a non-iterative feature selection method based on random forest, incorporates a penalty function so that the optimal feature subset can be obtained quickly. Both methods reduce the number of iterations and thereby compensate for the main shortcoming of RFE. Both methods are evaluated on public data sets. The experimental results show that, compared with traditional RFE, both improve time efficiency by more than 17% while completing the classification task more accurately with fewer features.
Keywords/Search Tags:Feature Selection, Recursive Feature Elimination, Machine Learning, Random Forest
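
To illustrate why deleting one feature per iteration is the bottleneck described in the abstract, the following is a minimal sketch of a generic recursive elimination loop with a configurable step size. It is not the thesis's flexible or non-iterative method, whose details the abstract does not give. All names here (`abs_correlation`, `recursive_elimination`, `make_toy_data`) are hypothetical, and the scoring function is a simple correlation score standing in for a random forest importance, so that the example runs with the standard library alone.

```python
import math
import random


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between one feature and the labels.

    Assumption: this is only a stand-in for a random forest importance score.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def recursive_elimination(X, y, n_keep, step=1, score=abs_correlation):
    """Drop the `step` lowest-scoring features per iteration until n_keep remain.

    Returns (kept column indices, number of iterations performed).
    """
    remaining = list(range(len(X[0])))
    iterations = 0
    while len(remaining) > n_keep:
        # Re-score every remaining feature. With a real model this would be
        # a refit on the current subset, which is the expensive part.
        scores = {j: score([row[j] for row in X], y) for j in remaining}
        n_drop = min(step, len(remaining) - n_keep)
        to_drop = set(sorted(remaining, key=lambda j: scores[j])[:n_drop])
        remaining = [j for j in remaining if j not in to_drop]
        iterations += 1
    return remaining, iterations


def make_toy_data(n_samples=200, n_features=20, n_informative=3, seed=0):
    """Binary labels; only the first n_informative columns depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
kept_one, iters_one = recursive_elimination(X, y, n_keep=3, step=1)
kept_many, iters_many = recursive_elimination(X, y, n_keep=3, step=5)
print("step=1:", kept_one, "iterations:", iters_one)
print("step=5:", kept_many, "iterations:", iters_many)
```

Reducing 20 features to 3 takes 17 iterations with `step=1` but only 4 with `step=5`. Because each iteration of a real recursive elimination refits the model, the iteration count is a rough proxy for running time. Note that the stand-in score is univariate, so re-scoring here returns the same values each round; a random forest importance would change as features are removed.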