Feature Selection Based On Relation Information Entropy

Posted on: 2017-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: Z Dong
Full Text: PDF
GTID: 2180330485956808
Subject: Applied Mathematics
Abstract/Summary:
In the era of information explosion, the development of information technology has made big data in many fields increasingly complex. Data often contain uncertain or fuzzy information, and accurate judgments must still be made from the information available. Rough set theory and fuzzy set theory are designed to handle uncertain and fuzzy information. In recent years these theories have come to play an important role in data mining, machine learning, pattern recognition and other areas. They have become a research direction for many scholars, have been extended to further areas, and have produced many practical results. The overall idea of this thesis is to combine Shannon entropy theory with fuzzy rough set theory and to propose new definitions of neighborhood relation entropy and fuzzy relation entropy. Their properties are discussed in detail, and experimental results are analyzed. The specific work is as follows:

1. Neighborhood is one of the most important concepts in classification learning and can be used to distinguish samples with different decisions. We propose a neighborhood relation entropy to characterize the discrimination information of a neighborhood relation, which reflects the discriminating ability of a feature subset. The proposed entropy is computed from the cardinality of the neighborhood relation. We also generalize it to describe how the discrimination information changes when several feature subsets are combined; that is, the joint entropy, conditional entropy and mutual information of neighborhood relations are proposed. A parameter, the neighborhood radius, is introduced in these discrimination measures to make them suitable for the analysis of real-valued data sets.
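The abstract does not state the formulas, so the following is only a minimal sketch of how cardinality-based measures of this kind might be computed. It assumes, as an illustration rather than as the thesis's definition, that two samples are neighbors when every selected feature differs by at most the radius, and that the entropy of a relation R on n samples is log(n² / |R|). All function names are hypothetical.

```python
import math

def neighborhood_relation(samples, features, radius):
    """Set of index pairs (i, j) that are neighbors on `features`.

    Assumption: i and j are neighbors when every selected feature
    differs by at most `radius` (a Chebyshev-distance neighborhood).
    With no features selected, every pair is related.
    """
    n = len(samples)
    return {
        (i, j)
        for i in range(n)
        for j in range(n)
        if all(abs(samples[i][f] - samples[j][f]) <= radius for f in features)
    }

def relation_entropy(samples, features, radius):
    """Assumed form: H(B) = log(n^2 / |N_B|); larger means more discriminating."""
    n = len(samples)
    relation = neighborhood_relation(samples, features, radius)
    return math.log(n * n / len(relation))

def joint_entropy(samples, feats_a, feats_b, radius):
    """Entropy of the relation induced by both feature subsets together."""
    combined = sorted(set(feats_a) | set(feats_b))
    return relation_entropy(samples, combined, radius)

def conditional_entropy(samples, feats_b, feats_a, radius):
    """H(B | A) = H(A, B) - H(A): extra discrimination B adds on top of A."""
    return (joint_entropy(samples, feats_a, feats_b, radius)
            - relation_entropy(samples, feats_a, radius))

def mutual_information(samples, feats_a, feats_b, radius):
    """I(A; B) = H(A) + H(B) - H(A, B): discrimination shared by A and B."""
    return (relation_entropy(samples, feats_a, radius)
            + relation_entropy(samples, feats_b, radius)
            - joint_entropy(samples, feats_a, feats_b, radius))

# Toy data (made up for illustration): four samples, two real-valued features.
toy_samples = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.85], [0.9, 0.2]]
toy_radius = 0.2
```

On this toy data each feature alone groups the samples into two different pairs, so combining the two features separates all four samples and the joint entropy exceeds either single-feature entropy. A smaller radius shrinks the relation and raises the entropy, which is why the radius matters for real-valued data.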
Based on the proposed discrimination measures, a significance measure for candidate features is defined and a greedy forward algorithm for feature selection is designed. Data sets selected from the UCI repository are used to compare the proposed algorithm with several existing algorithms, and the experimental results show that the discrimination-index-based algorithms perform better than some classical ones.

2. The fuzzy relation is redefined using a distance function. Definitions of fuzzy relation information entropy, fuzzy relation joint entropy, conditional fuzzy relation entropy and fuzzy relation mutual information are proposed, and their properties are discussed. In addition, the influence of the neighborhood radius and the attribute subset on the fuzzy relation entropy is discussed. Based on this theoretical work, we design a feature selection algorithm based on fuzzy relation entropy and verify it experimentally. Experiments show that the proposed algorithm not only reduces the complexity of sample reduction and improves classification accuracy, but also shortens the reduction time, so it has practical significance.
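As a rough illustration of how a greedy forward selection driven by a fuzzy relation entropy could look, the sketch below makes several assumptions that are not taken from the thesis: the fuzzy similarity of two samples on one feature is max(0, 1 - |x - y| / radius), features are combined with the minimum, the cardinality of a fuzzy relation is the sum of its memberships, and a feature's significance is the drop in the conditional entropy of the decision. The function names and the stopping threshold `min_gain` are hypothetical.

```python
import math

def fuzzy_relation(samples, features, radius):
    """Fuzzy similarity matrix built from a distance function.

    Assumed form: per-feature similarity max(0, 1 - |x - y| / radius),
    combined over features with min. No features gives all ones.
    """
    n = len(samples)
    return [
        [
            min(
                (max(0.0, 1.0 - abs(samples[i][f] - samples[j][f]) / radius)
                 for f in features),
                default=1.0,
            )
            for j in range(n)
        ]
        for i in range(n)
    ]

def decision_conditional_entropy(samples, labels, features, radius):
    """Assumed form: H(D | B) = log(|R_B| / |R_B ∩ R_D|).

    |R_B| is the sum of all memberships; the intersection with the crisp
    decision relation keeps only pairs that share a label.
    """
    rel = fuzzy_relation(samples, features, radius)
    n = len(samples)
    total = sum(rel[i][j] for i in range(n) for j in range(n))
    same = sum(rel[i][j] for i in range(n) for j in range(n)
               if labels[i] == labels[j])
    return math.log(total / same)

def greedy_forward_selection(samples, labels, radius, min_gain=1e-3):
    """Repeatedly add the feature that most reduces H(D | B); stop when the
    best reduction falls below `min_gain` or no features remain."""
    remaining = set(range(len(samples[0])))
    selected = []
    current = decision_conditional_entropy(samples, labels, selected, radius)
    while remaining:
        scores = {
            f: decision_conditional_entropy(samples, labels, selected + [f], radius)
            for f in sorted(remaining)
        }
        best = min(scores, key=scores.get)
        if current - scores[best] < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        current = scores[best]
    return selected

# Toy data (made up): feature 0 separates the classes, feature 1 does not.
toy_samples = [[0.1, 0.5], [0.15, 0.9], [0.9, 0.52], [0.85, 0.88]]
toy_labels = [0, 0, 1, 1]
selected_features = greedy_forward_selection(toy_samples, toy_labels, radius=0.3)
```

In this toy run the first feature already removes all uncertainty about the decision, so the loop stops after one step. Recomputing the full similarity matrix for every candidate keeps the sketch short but costs O(n²) per evaluation; an incremental update of the matrix would be the natural optimization.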
Keywords/Search Tags:information entropy, fuzzy set, rough set, neighborhood relation, fuzzy relation