Rough Set Based Mechanisms And Algorithms For Incremental Attribute Reduction

Posted on: 2018-06-28 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Y Y Yang | Full Text: PDF
GTID: 1318330518958113 | Subject: System analysis, operations and control
Abstract/Summary:
With the rapid development of computer technology, collected data sets are often large in scale, high in dimension, mixed in type, and dynamic over time. Efficiently selecting informative attributes from such data sets is a hot topic in machine learning, since it mitigates the curse of dimensionality and improves the performance of learning algorithms. Classical rough set based attribute reduction can effectively delete redundant attributes from symbolic data sets by preserving the consistency between condition attributes and decision labels. Owing to its effectiveness, the classical rough set model has been extended to handle high-dimensional data sets with different types of attributes, or even mixed attributes. However, for large-scale and dynamic data sets, existing algorithms often consume considerable runtime and may even be impractical because they run out of memory under some software and hardware environments. To fill this gap, this dissertation investigates incremental mechanisms for attribute reduction on symbolic, real-valued, and mixed dynamic data sets with high dimension and large size. The main innovative work includes the following aspects:

(1) Active sample selection is designed to filter out useless samples efficiently in both space and time. The useful samples that remain are used to update the reduct through an incremental attribute reduction process that determines which attributes should be added to and deleted from the current reduct. Integrating active sample selection into this process yields an incremental algorithm for attribute reduction with active sample selection. Experimental evaluations demonstrate that the incremental algorithm achieves significant savings in memory usage and runtime.

(2) Two Boolean row vectors are introduced to characterize the reduct in variable precision rough sets. Using all minimal elements of the discernibility matrix, an algorithm is developed to perform attribute reduction with variable precision rough sets. When a new sample arrives, the minimal elements are updated, which reveals which attributes should be added to and/or deleted from the current reduct. An incremental algorithm is designed on the basis of this attribute reduction process. Experimental comparisons validate the effectiveness of the proposed incremental algorithm.

(3) The relative discernibility relation is introduced to characterize the core and the reduct with general fuzzy rough sets. By preserving the relative discernibility relation of the attribute set, an algorithm is developed to find a reduct from a real-valued data set. Building on this non-incremental algorithm, the incremental perspective on fuzzy rough set based attribute reduction is investigated under the assumption that data are presented as sample subsets arriving one after another. As sample subsets arrive sequentially, the relative discernibility relation is updated, which leads to strategies for adding and deleting attributes. Using these strategies, two incremental methods for fuzzy rough set based attribute reduction are designed: 1) updating both the relative discernibility relations and the reduct as sample subsets arrive sequentially, and returning the reduct after all sample subsets have been processed; 2) updating only the relative discernibility relations as sample subsets arrive sequentially, and then finding the reduct after all subsets have been added. Experimental comparisons suggest that the incremental algorithms expedite fuzzy rough set based attribute reduction.

(4) A discernibility relation is defined for each symbolic and real-valued condition attribute to characterize its ability to discern samples with respect to the decision labels. With these discernibility relations, a dependence function is defined to measure the inconsistency between heterogeneous attributes and decision labels, and attribute reduction aims to preserve this dependence function up to a small perturbation. The relative discernibility relation is introduced to develop an algorithm for finding a reduct from heterogeneous data sets. The relative discernibility relation is then updated to study strategies for adding and deleting attributes. Based on these strategies, an incremental algorithm is developed to compute a reduct from a heterogeneous data set. Experimental results demonstrate that attribute reduction for heterogeneous data does realize mutual substitution between symbolic and real-valued attributes, and that the incremental algorithm significantly improves the time efficiency of finding a reduct from heterogeneous data.

(5) The relative discernibility relation is introduced to characterize the core and the reduct with covering rough sets. An algorithm is designed to compute a reduct from a mixed data set with symbolic, real-valued, and missing-valued attributes. The relative discernibility relation of each attribute is updated to reveal strategies for adding and deleting attributes. Based on the proposed strategies, two incremental processes are designed: 1) updating both the relative discernibility relation and the reduct whenever a subset arrives, so that the reduct is available once no more subsets arrive; 2) updating only the relative discernibility relation whenever a subset arrives, so that once no more subsets arrive the final relative discernibility relation is used to find the reduct from the whole data set. Experimental results show that both incremental processes greatly reduce the runtime of finding a reduct from a mixed data set, and that the second one is more efficient.
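
As a rough illustration of the classical, non-incremental idea that the dissertation builds on, the sketch below computes a reduct from a small symbolic data set by dropping attributes whose removal leaves the consistency between condition attributes and decision labels unchanged. It is not one of the dissertation's algorithms: the function names `dependency` and `find_reduct`, the backward-elimination strategy, and the toy data are all assumptions made for this example. The last lines show why dynamic data is a problem: one newly arrived sample can invalidate the current reduct, and this naive version simply recomputes from scratch.

```python
from collections import defaultdict


def dependency(rows, labels, attrs):
    """Fraction of samples whose equivalence class under `attrs`
    carries a single decision label (the classical positive region)."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    consistent = sum(
        len(block)
        for block in blocks.values()
        if len({labels[i] for i in block}) == 1
    )
    return consistent / len(rows)


def find_reduct(rows, labels):
    """Hypothetical helper: backward elimination that drops an attribute
    whenever the dependency of the remaining attributes is unchanged."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = list(all_attrs)
    for a in all_attrs:
        trial = [b for b in reduct if b != a]
        if dependency(rows, labels, trial) == target:
            reduct = trial
    return reduct


# Toy symbolic data set (illustrative only): three condition attributes.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = ["no", "no", "yes", "yes"]
reduct_before = find_reduct(rows, labels)  # [2]

# A new sample arrives and conflicts with the current reduct.
rows_new = rows + [(0, 0, 0)]
labels_new = labels + ["no"]
still_valid = dependency(rows_new, labels_new, reduct_before) == dependency(
    rows_new, labels_new, [0, 1, 2]
)  # False: attribute 2 alone no longer separates the labels
reduct_after = find_reduct(rows_new, labels_new)  # [0]
```

Recomputing the reduct over the whole data set after every arrival is exactly the cost that grows with data size. The incremental mechanisms summarized in items (1) to (5) instead update stored structures (useful samples, minimal elements, or relative discernibility relations) to decide which attributes to add or delete.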
Keywords/Search Tags: Rough set, attribute reduction, incremental learning, dynamic dataset, relative discernibility relation, minimal element