
Relief-based Feature Selection Algorithms

Posted on: 2019-07-15    Degree: Master    Type: Thesis
Country: China    Candidate: X J Huang    Full Text: PDF
GTID: 2428330545951194    Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the “information explosion” era, the emergence of vast amounts of information and the growth of data dimensionality have posed great challenges to pattern recognition on high-dimensional data. Information overload has become an urgent problem. At the same time, the wealth of information contained in high-dimensional data increases the likelihood of solving a given problem. Faced with such “extremely rich data yet relatively scarce information”, how to extract useful information from massive data and grasp the essence of a problem has become a pressing research topic. Data mining can reveal hidden information in large volumes of data and transform massive high-dimensional data into useful information and knowledge. As an important research direction in data mining, feature selection removes irrelevant or redundant features, thereby reducing the number of features, improving model accuracy, and shortening running time. From the perspective of data interpretation, selecting the key features also simplifies the model and makes it easier for researchers to understand how the data were generated.

Among existing feature selection methods, Relief avoids any global or heuristic search and assigns each feature a weight according to its relevance to the class label, making it a simple and effective feature weighting method. These advantages have led to Relief being widely used in processing massive high-dimensional data. This thesis focuses on feature selection based on the Relief algorithm. Its contributions are as follows.

A local hyperplane-based dynamic Relief (LH-DR) feature selection algorithm is proposed. The key idea behind LH-DR is a dynamic feature weight representation framework based on margin maximization, which restates the optimization problem. The framework makes the iterative weight-update process explicit and reveals the relationship between the expected margin and the weight of the target. Moreover, existing Relief-based feature weighting algorithms can be unified within the proposed framework. LH-DR maintains high accuracy while reducing time consumption, making it suitable for feature selection on high-dimensional data. Experimental results indicate that LH-DR speeds up convergence while preserving accuracy and greatly reduces time consumption.

A neighbor sparse reconstruction-based dynamic Relief (NSR-DR) is proposed. Within the dynamic representation framework, NSR-DR adopts a new l1-regularization-based sparse neighbor reconstruction method to represent the nearest neighbors of a given sample. NSR-DR redefines the optimization problem and introduces neighbor sparsity to better represent those neighbors. Experimental results show that NSR-DR is highly effective in dealing with high-dimensional data.

A weight sparsity-based Relief (WS-Relief) is proposed. The algorithm defines a new optimization objective function based on l1 regularization so that the final weight vector is sparse, which suits high-dimensional data. At the same time, gradient descent is used to update the weights, ensuring the convergence of WS-Relief. Experimental results show that WS-Relief not only improves accuracy but also picks out the useful features.

A min-redundancy-based weight sparse Relief (MRWS-Relief) is proposed. By explicitly modeling redundancy, MRWS-Relief selects features with low redundancy and high relevance, avoiding the interference of feature redundancy with the classification results. Experimental results show that MRWS-Relief achieves better accuracy with fewer features.
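All of the variants above build on Relief's basic weight update: features that differ between a sample and its nearest neighbor of another class (the "miss") gain weight, while features that differ from its nearest same-class neighbor (the "hit") lose weight. As a point of reference, here is a minimal sketch of that classic update, not of the thesis's LH-DR/NSR-DR/WS-Relief variants; the function name and interface are illustrative.

```python
import numpy as np

def relief(X, y, n_iter=None, rng=None):
    """Basic Relief: weight each feature by how well it separates
    samples from their nearest miss versus their nearest hit."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    n_iter = n if n_iter is None else n_iter
    # Scale features to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    w = np.zeros(d)
    for i in rng.choice(n, size=n_iter, replace=False):
        diff = np.abs(Xs - Xs[i])   # per-feature distance to sample i
        dist = diff.sum(axis=1)
        dist[i] = np.inf            # exclude the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))   # nearest same-class
        miss = np.argmin(np.where(~same, dist, np.inf)) # nearest other-class
        # Reward features that differ on the miss, penalize the hit.
        w += (diff[miss] - diff[hit]) / n_iter
    return w
```

Informative features end up with large positive weights and noise features near or below zero, so selecting features reduces to thresholding or ranking `w`, which is what makes the method attractive for high-dimensional data.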
Keywords/Search Tags:feature selection, Relief feature selection algorithm, local hyperplane, sparse representation, redundancy