
Research on Dynamic Feature Selection Algorithms for Streaming Features

Posted on: 2020-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Qi
Full Text: PDF
GTID: 2438330626453257
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
The popularity of big data poses challenges for traditional feature selection, while new data types also create new opportunities for feature selection research. Streaming features are one such data type: features arrive continuously while the instance space stays fixed. The difficulty lies in the dynamic nature of the high-dimensional feature space, which is not, or cannot be, known in advance. In short, the feature space is unknown and evolving. Many feature selection algorithms for streaming features have been proposed, but they have drawbacks.

First, once an existing streaming feature selection algorithm judges a feature to be redundant, it removes that feature permanently. Because the feature space keeps changing, however, a deleted redundant feature may later be able to improve predictive performance. To solve this problem, this paper proposes an online streaming feature selection algorithm based on a fixed-size buffer pool. Specifically, the algorithm dynamically preserves and retrieves features through the buffer pool to handle the changing feature space, and combines two different types of feature selectors to improve predictive performance and compress the feature space. The proposed algorithm is compared with existing streaming feature selection algorithms on 12 classic datasets, and the experiments show that it achieves better classification accuracy and a better feature-space compression ratio.

Second, the Grafting algorithm is a classical streaming feature selection algorithm based on sparse regularization. Many improved algorithms based on Grafting start from improving the search strategy with different regularization terms, but they do not improve the criteria used to measure features. Existing streaming feature selection algorithms neglect to measure the discriminative ability of the new feature itself. This paper adds a constraint on the representation ability of the new feature to the original objective function of the Grafting algorithm, and derives a new model for streaming features by minimizing the reconstruction residual of each newly added feature. The proposed boosting Grafting algorithm is compared with other typical streaming feature selection algorithms and with the buffer-pool algorithm described above. The experimental results show that the improved algorithm is competitive in accuracy, feature-space compression, and stability.

The source code, datasets, and experimental results of this paper are all open source, and the link is: https://github.com/qixuejun/online_feature_selection.
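The buffer-pool idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not name its two feature selectors, so the sketch assumes absolute Pearson correlation for both the relevance test and the redundancy test, and the class name, thresholds, and swap rule are hypothetical. It only shows how a fixed-size buffer lets a feature that was once judged redundant be retrieved after the selected set changes.

```python
from collections import deque
from math import sqrt


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / sqrt(sxx * syy)


class BufferedStreamSelector:
    """Hypothetical streaming selector that parks redundant features in a fixed-size buffer."""

    def __init__(self, y, buffer_size=3, relevance_min=0.3, redundancy_max=0.9):
        self.y = y
        self.selected = {}                       # feature name -> column
        self.buffer = deque(maxlen=buffer_size)  # oldest entry is evicted when full
        self.relevance_min = relevance_min       # assumed threshold
        self.redundancy_max = redundancy_max     # assumed threshold

    def _redundant_with(self, col):
        """Return the name of a selected feature that makes `col` redundant, or None."""
        for name, other in self.selected.items():
            if abs_corr(col, other) > self.redundancy_max:
                return name
        return None

    def _recheck_buffer(self):
        """Retrieve buffered features that are no longer redundant with the selected set."""
        for name, col in list(self.buffer):
            if self._redundant_with(col) is None:
                self.buffer.remove((name, col))
                self.selected[name] = col

    def add(self, name, col):
        """Process one newly arrived feature column."""
        relevance = abs_corr(col, self.y)
        if relevance < self.relevance_min:
            return  # irrelevant features are dropped outright
        rival = self._redundant_with(col)
        if rival is None:
            self.selected[name] = col
        elif relevance > abs_corr(self.selected[rival], self.y):
            # The newcomer is more relevant: swap it in and buffer the old feature.
            self.buffer.append((rival, self.selected.pop(rival)))
            self.selected[name] = col
        else:
            self.buffer.append((name, col))  # redundant for now, kept for later
        self._recheck_buffer()


# Toy stream: "b" is redundant with "a" at first, but not with "c", which later replaces "a".
y = [0.5, 1.5, -1.5, -0.5]
selector = BufferedStreamSelector(y)
selector.add("a", [1.0, 1.0, -1.0, -1.0])
selector.add("b", [1.3, 0.7, -0.7, -1.3])
selector.add("c", [0.7, 1.3, -1.3, -0.7])
print(list(selector.selected), [name for name, _ in selector.buffer])
# prints ['c', 'b'] ['a']
```

In this toy run, "b" would have been lost for good under a delete-on-redundancy rule; here it waits in the buffer and is retrieved once "c" replaces "a". The bounded deque is one simple way to keep the buffer at a fixed size; the thesis may use a different eviction policy.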
Keywords/Search Tags: Streaming Feature, Feature Selection, Data Mining, Online Learning