Research On Dynamic Feature Selection Algorithm Based On Mutual Information

Posted on: 2021-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Wen
Full Text: PDF
GTID: 2428330626462956
Subject: Computer application technology
Abstract/Summary:
Feature selection is the process of choosing, from the original feature set, the features that maximize a feature-evaluation criterion; these features form the optimal feature subset. Feature selection has long been a key issue in pattern recognition and data mining. In classification, it is the feature vector of a sample that plays the decisive role in determining its category. The completeness of the samples, the redundancy among features, and the correlation between features and class labels all profoundly affect the classification performance of a learning model. A large number of irrelevant or redundant features not only reduces the classification ability of the model but also wastes computation time. The purpose of feature selection is therefore to retain the features that are most strongly correlated with the class labels and supply the most new classification information, while eliminating features that are irrelevant to classification or redundant features that provide no new information. In pattern recognition, feature selection as a dimensionality-reduction technique can quickly and effectively identify the features useful for discrimination, greatly simplify the classification model, and improve classification performance.

This thesis first introduces the information-theoretic background used by dynamic feature selection algorithms based on mutual information. It then analyzes several problems that arise in feature selection, such as the high complexity of the algorithms, the complexity of the resulting classification models, and the difficulty of determining how many features the optimal subset should contain. Finally, two new dynamic feature selection algorithms based on mutual information are proposed.

(1) A Dynamic Feature Selection method with Minimum Redundancy Information (MRIDFS) is proposed. A study of the evaluation criterion of the DCSF algorithm shows that it does not distinguish intra-class redundancy from extra-class redundancy, thereby ignoring the effect of extra-class redundancy on classification performance and making its estimate of redundant information inaccurate. Based on this finding, and so that the criterion can accurately describe both the correlation between features and class labels and the redundancy among features, MRIDFS dynamically adjusts the importance of extra-class redundancy through a feature-dependent redundancy ratio. Experimental results on 12 standard data sets show that MRIDFS effectively measures and eliminates redundant features in the feature space, which effectively improves classification performance.

(2) A feature selection algorithm based on dynamic weights and optimized conditional mutual information, namely DOMCMI (Optimizing the Minimum bound of CMI based on Dynamic Weights), is proposed. All feature selection algorithms rest on certain mathematical assumptions; DOMCMI adopts weaker assumptions than existing algorithms to optimize the lower bound of conditional mutual information. This makes it possible to distinguish irrelevant features from redundant ones, so the classification model avoids selecting irrelevant features. On this basis, DOMCMI considers the complementarity between candidate features and the already-selected features: since selecting complementary features brings more new classification information, DOMCMI assigns each candidate feature a dynamic weight, continuously adjusting its importance. Experimental results on 12 standard data sets show that the introduction of dynamic weights effectively improves classification performance.
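To make the general idea behind such criteria concrete, the following is a minimal sketch of greedy, mutual-information-based forward selection that scores each candidate feature as relevance to the class labels minus its average redundancy with the already-selected features (an mRMR-style criterion). This is an illustrative simplification, not the MRIDFS or DOMCMI algorithms themselves; `mutual_info` and `greedy_mi_selection` are hypothetical helper names.

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information (in nats) between two discrete vectors."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))  # joint probability p(x, y)
            if pxy > 0:
                px = np.mean(x == xv)             # marginal p(x)
                py = np.mean(y == yv)             # marginal p(y)
                mi += pxy * np.log(pxy / (px * py))
    return mi

def greedy_mi_selection(X, y, k):
    """Greedily pick k feature columns of X by relevance minus mean redundancy."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < k and remaining:
        def score(f):
            relevance = mutual_info(X[:, f], y)
            if not selected:
                return relevance
            # Average redundancy with the features chosen so far.
            redundancy = np.mean([mutual_info(X[:, f], X[:, s]) for s in selected])
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

The thesis's algorithms refine this template: MRIDFS splits the redundancy term into intra-class and extra-class parts weighted by a redundancy ratio, and DOMCMI replaces the static average with a dynamically weighted conditional-mutual-information bound.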
Keywords/Search Tags:Mutual information, Feature selection, Feature correlation, Feature redundancy