Researches Of Rough Set Model And Feature Selection For Numerical Data

Posted on: 2019-04-29  Degree: Master  Type: Thesis
Country: China  Candidate: F Xu  Full Text: PDF
GTID: 2348330542997624  Subject: Computer technology
Abstract/Summary:
Rough set theory, introduced in the 1980s, is an important data processing tool that performs particularly well on uncertain and incomplete data. The equivalence relation is the core of rough set theory: data are granulated and the target concepts are roughly approximated from the resulting granules, which gives a clear understanding of fuzzy, uncertain knowledge. Feature selection for information systems is an important application of rough set theory. Neighborhood rough sets and fuzzy rough sets are two important branches of rough set theory, and both are powerful tools for processing numerical data. Because numerical data is a common data type, feature selection based on these two models is of considerable research significance.

In the neighborhood rough set model, feature selection on complete data does not consider the aggregated distribution of the data, which introduces errors into the selection result. Feature selection on incomplete data mostly relies on constructing a tolerance relation, which has shortcomings in describing the similarity of data. In the fuzzy rough set model, feature selection on complete data does not consider the margin between classes, so high-quality feature subsets cannot be selected. Little research addresses feature selection on incomplete data, so the study of incomplete data with fuzzy rough sets remains a gap. To solve these problems, this thesis improves each model in turn and proposes a corresponding feature selection algorithm. The main work of this thesis is as follows:

(1) To address the defects of the neighborhood rough set model in feature selection for numerical complete information systems, the aggregated distribution of the data is evaluated through the variance, and an adaptive neighborhood granulation is proposed. Combining this granulation with fuzzy entropy yields the adaptive neighborhood fuzzy entropy, which is used to evaluate attribute importance in numerical information systems and to construct a heuristic feature selection algorithm. Simulation experiments show that the proposed algorithm achieves better feature selection performance than other neighborhood feature selection algorithms on numerical complete information systems.

(2) To address the defects of the neighborhood rough set model in feature selection for numerical incomplete information systems, the valued tolerance relation and the neighborhood relation are combined into a neighborhood valued tolerance relation, and a conditional entropy model based on this relation is proposed. This model, called the neighborhood valued tolerance conditional entropy, is used to evaluate the importance of numerical attributes in incomplete information systems, and a heuristic feature selection algorithm is built on it. Simulation experiments show that the proposed algorithm achieves better feature selection performance than other neighborhood feature selection algorithms on incomplete information systems.

(3) To address the defects of the current fuzzy rough set model in feature selection for numerical complete information systems, large margin learning is introduced to evaluate the margin between classes for each attribute, and the learned margin attribute weights are used to construct the fuzzy granulation under the fuzzy rough set model; this is called the large margin fuzzy granulation. Based on the large margin fuzzy granulation, two measures, dependency and knowledge granularity, are defined and combined to evaluate attribute importance in an information system, from which a heuristic feature selection algorithm is constructed. Simulation experiments show that the proposed algorithm achieves better feature selection performance than other fuzzy rough set feature selection algorithms on numerical complete information systems.

(4) Because few fuzzy rough set models address feature selection for numerical incomplete information systems, the tolerance relation is introduced into the fuzzy rough set, and a fuzzy rough set model based on the tolerance relation is put forward. A fuzzy information gain ratio is then defined in this model and used to evaluate attribute importance in the information system, and a heuristic feature selection algorithm is proposed accordingly. Simulation experiments show that the proposed algorithm achieves better feature selection performance than other related feature selection algorithms on numerical incomplete information systems.

In summary, this thesis proposes four feature selection algorithms: neighborhood rough set algorithms for numerical complete and incomplete information systems, and fuzzy rough set algorithms for numerical complete and incomplete information systems. Simulation experiments at the end of the thesis compare the performance of the proposed algorithms.
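
To make the idea of neighborhood granulation and heuristic feature selection more concrete, the sketch below shows a generic neighborhood rough set baseline, not the thesis's own method. It assumes a fixed Euclidean radius `delta` (the thesis instead adapts the granulation using the variance), uses the classical dependency measure rather than the entropy measures proposed in the thesis, and runs on made-up toy data. The function names `dependency` and `greedy_select` are hypothetical.

```python
from math import sqrt


def distance(x, y, features):
    """Euclidean distance between two samples restricted to the given features."""
    return sqrt(sum((x[f] - y[f]) ** 2 for f in features))


def dependency(data, labels, features, delta):
    """Fraction of samples whose delta-neighborhood contains only one class.

    This is the classical neighborhood dependency (size of the positive
    region divided by the number of samples). The fixed radius `delta`
    is an assumption of this sketch.
    """
    if not features:
        return 0.0
    consistent = 0
    for x in data:
        neighbor_labels = {
            labels[j]
            for j, y in enumerate(data)
            if distance(x, y, features) <= delta
        }
        if len(neighbor_labels) == 1:
            consistent += 1
    return consistent / len(data)


def greedy_select(data, labels, delta, tol=1e-9):
    """Forward heuristic search: add the feature that raises dependency most."""
    remaining = list(range(len(data[0])))
    selected, best = [], 0.0
    while remaining:
        scores = [
            (dependency(data, labels, selected + [f], delta), f)
            for f in remaining
        ]
        score, feat = max(scores, key=lambda s: s[0])
        if score - best <= tol:  # no meaningful improvement, stop
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


# Toy data, invented for illustration: feature 0 separates the two
# classes, feature 1 does not.
data = [
    [0.10, 0.50],
    [0.15, 0.90],
    [0.20, 0.10],
    [0.80, 0.55],
    [0.85, 0.95],
    [0.90, 0.15],
]
labels = [0, 0, 0, 1, 1, 1]

selected, score = greedy_select(data, labels, delta=0.2)
print(selected, score)
```

On this toy data the search keeps only feature 0, since every neighborhood built from it is already pure in the decision class. Replacing the fixed `delta` with a radius derived from each attribute's variance, or replacing `dependency` with an entropy-based measure, would move the sketch closer to the kind of algorithm the abstract describes, but the exact definitions are given only in the full thesis.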
Keywords/Search Tags: neighborhood rough sets, fuzzy rough sets, information granulation, feature selection, information entropy