| Traditional feature selection techniques, as the main dimensionality reduction methods, have been widely used in machine learning and data mining, demonstrating strong performance and problem-solving ability on traditional data types. However, they face challenges when the number of features keeps growing and the features arrive in a streaming fashion. For example, in many fields, features continue to accumulate over time. In such application scenarios, traditional feature selection methods are unable to extract suitable features. Online feature selection for streaming features has therefore become an important research topic in recent years. As an extension of classical rough set theory, neighborhood rough sets are suitable for high-dimensional, continuous and mixed data and have been extensively used in feature selection. Moreover, neighborhood rough sets do not require prior knowledge and offer clear advantages in solving online streaming feature selection problems. Therefore, studying online streaming feature selection based on neighborhood rough sets is of practical significance. To further improve the performance of online feature selection based on neighborhood rough sets, the main research work of this paper includes: (1) To reduce the time cost and remove the need to manually set the radius parameter when using neighborhood rough sets for feature selection, a novel online streaming feature selection algorithm called "Online Feature Selection Based on Adaptive Neighborhood Radius and Buffer Zone" (OFS-ANR-BZ) is proposed. The algorithm introduces a buffer zone and pre-screens features based on the Fisher score. An adaptive neighborhood radius is defined based on the standard deviation of distances between different feature samples, which improves the theory of adaptive neighborhood rough sets. To validate the effectiveness of the algorithm, experiments were conducted on eleven real-world datasets. The experimental results demonstrated that the feature subset selected by the proposed 
OFS-ANR-BZ algorithm was smaller and achieved superior classification performance. (2) To measure both uncertainty and fuzziness in the neighborhood rough set, a new online streaming feature selection algorithm based on adaptive neighborhood relative dependency complement mutual information and a buffer zone (OFS-ANRDCMI-BZ) is proposed by combining neighborhood feature dependency with neighborhood complement mutual information. The algorithm defines an adaptive neighborhood relative dependency complement mutual information, proposes a new feature importance measure based on it, and designs a feature selection framework around that measure. This framework is combined with the Fisher score based buffer zone pre-screening to optimize the OFS-ANR-BZ algorithm, yielding the OFS-ANRDCMI-BZ algorithm. Experimental results on the datasets show that the feature subset selected by the OFS-ANRDCMI-BZ algorithm is small and achieves excellent classification performance. In addition, the OFS-ANRDCMI-BZ algorithm performs significantly better than the OFS-ANR-BZ algorithm. |
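
The pre-screening and adaptive-radius ideas summarized above can be illustrated with a minimal sketch. This is not the OFS-ANR-BZ implementation: the abstract does not give the exact formulas, so the code assumes that the neighborhood radius equals the standard deviation of all pairwise Euclidean distances, and it reduces the buffer zone to a plain Fisher score threshold. The names `adaptive_radius`, `neighborhood_dependency`, `stream_select` and the parameter `score_threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fisher_score(x, y):
    """Fisher score of one feature column x given class labels y."""
    overall_mean = x.mean()
    between, within = 0.0, 0.0
    for c in np.unique(y):
        xc = x[y == c]
        between += len(xc) * (xc.mean() - overall_mean) ** 2
        within += len(xc) * xc.var()
    if within == 0:
        return np.inf if between > 0 else 0.0
    return between / within


def adaptive_radius(X):
    """Assumed rule: radius = std of all pairwise Euclidean distances."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    upper = dist[np.triu_indices(len(X), k=1)]  # each pair counted once
    return dist, upper.std()


def neighborhood_dependency(X, y):
    """Fraction of samples whose neighborhood contains a single class."""
    dist, delta = adaptive_radius(X)
    consistent = 0
    for i in range(len(X)):
        neighbors = dist[i] <= delta
        if np.all(y[neighbors] == y[i]):
            consistent += 1
    return consistent / len(X)


def stream_select(X, y, score_threshold=0.1):
    """Process features one at a time, keeping those that raise dependency."""
    selected, best = [], 0.0
    for j in range(X.shape[1]):  # column j plays the role of an arriving feature
        if fisher_score(X[:, j], y) < score_threshold:
            continue  # pre-screening: discard weakly discriminative features
        candidate = selected + [j]
        dep = neighborhood_dependency(X[:, candidate], y)
        if dep > best:
            selected, best = candidate, dep
    return selected, best


# Toy data: feature 0 separates the classes, feature 1 does not.
X = np.array([[0.0, 5.0], [0.1, 1.0], [0.2, 4.0],
              [1.0, 2.0], [1.1, 5.0], [1.2, 1.0]])
y = np.array([0, 0, 0, 1, 1, 1])
selected, dep = stream_select(X, y)
print(selected, dep)  # [0] 1.0
```

On the toy data, feature 1 is dropped by the Fisher score threshold before any neighborhood computation takes place, which is the kind of time saving that pre-screening is meant to provide. Because the radius is derived from the data, no radius parameter has to be set by hand.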