
Research On Neighborhood Rough Set Model For Streaming Feature Selection

Posted on: 2024-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Zeng
Full Text: PDF
GTID: 2568307064455824
Subject: Computer application technology
Abstract/Summary:
The neighborhood rough set is a knowledge discovery tool for handling imprecise information. By replacing the equivalence relation of the traditional rough set with neighborhood granulation, it overcomes the traditional model's inability to handle continuous data, and it is widely used in feature selection. In the era of big data, however, data are real-time and dynamic, so features change constantly. Streaming feature selection refers to the setting in which features flow into the feature space at different times and the current optimal feature subset must be selected in real time. The dynamic nature of the feature space poses new problems and challenges for neighborhood rough set models for streaming feature selection: (1) In a high-dimensional dynamic streaming feature environment, noisy data are inevitable. The traditional neighborhood rough set cannot identify noise points in time and therefore cannot obtain a higher-quality feature subset in real time. (2) In a streaming feature environment with a complex class hierarchy, the assumption that categories are relatively independent no longer holds, and the traditional neighborhood rough set cannot handle streaming feature selection problems with complex data structure in real time. To explore neighborhood rough set models for streaming feature selection, this thesis builds on the traditional neighborhood rough set, constructing a dynamic feature space from the concept of streaming features and a complex label space from a tree hierarchy. The main research contents are as follows:

(1) Anti-noise neighborhood rough set for streaming feature selection. To address the noise problem in streaming feature selection, this thesis proposes an online feature selection method based on an anti-noise neighborhood rough set. First, a noise-resistant neighborhood relation is proposed, in which the ability of different samples to distinguish similar samples is calculated to determine the neighborhood size. On this basis, the calculation of the approximations and the dependency degree in the neighborhood rough set is redesigned. Finally, new methods for online correlation analysis and online redundancy analysis are designed based on the information provided by each category. Experimental results on 8 data sets with 3 classifiers show that the algorithm outperforms existing online streaming feature selection algorithms.

(2) Hierarchical class neighborhood rough set for streaming feature selection. To address the class hierarchy problem in streaming feature selection, this thesis proposes an online feature selection method based on a hierarchical class neighborhood rough set. First, a hierarchical nearest neighbor domain relation is proposed that makes full use of the hierarchical structure information of the classes and comprehensively considers the influence of neighborhood granularity at different levels. Second, a hierarchical dependency degree is designed based on the hierarchical nearest neighbor domain relation. Finally, an online streaming feature selection framework based on the hierarchical neighborhood rough set is designed according to the hierarchical dependency degree, yielding a hierarchical class neighborhood rough set for streaming feature selection. Extensive comparative experiments on hierarchical data sets and flat single-label data sets from different fields show that the algorithm outperforms existing online streaming feature selection algorithms.
Keywords/Search Tags: neighborhood rough set, streaming features, feature selection, noise point, hierarchical classification
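
For readers unfamiliar with the baseline model, the following is a minimal sketch of the classical neighborhood rough set dependency degree and a simple streaming selection loop built on it. It is not the anti-noise or hierarchical method proposed in the thesis, whose details the abstract does not give. The function names `neighborhood_dependency` and `stream_select`, the fixed radius `delta`, the Euclidean distance, and the greedy accept-then-prune rule are all assumptions made for illustration.

```python
import numpy as np


def neighborhood_dependency(X, y, delta=0.2):
    """Classical dependency degree: the fraction of samples whose
    delta-neighborhood contains only samples with the same label
    (size of the positive region divided by the number of samples)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    if X.shape[1] == 0:
        return 0.0  # convention assumed here for an empty feature set
    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    neighbors = dist <= delta
    same_label = y[:, None] == y[None, :]
    # A sample is consistent if every neighbor shares its label.
    consistent = np.all(~neighbors | same_label, axis=1)
    return float(consistent.mean())


def stream_select(X, y, delta=0.2):
    """Process features one at a time, in column order, as if they
    arrived in a stream. A feature is kept only if it raises the
    dependency degree; earlier features that become redundant are dropped."""
    X = np.asarray(X, dtype=float)
    selected, best = [], 0.0
    for j in range(X.shape[1]):
        candidate = selected + [j]
        dep = neighborhood_dependency(X[:, candidate], y, delta)
        if dep > best:
            selected, best = candidate, dep
            # Redundancy check: remove an older feature if the
            # dependency degree does not drop without it.
            for k in list(selected[:-1]):
                reduced = [f for f in selected if f != k]
                if neighborhood_dependency(X[:, reduced], y, delta) >= best:
                    selected = reduced
    return selected, best


# Toy data: column 0 separates the two classes, column 1 is constant.
X_demo = np.array([[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]])
y_demo = np.array([0, 0, 1, 1])
selected_demo, dep_demo = stream_select(X_demo, y_demo)
```

In this sketch a single mislabeled sample inside a neighborhood removes the whole neighborhood's center from the positive region, which illustrates why the fixed-radius classical model is sensitive to the noise points the thesis targets.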