Research On Feature Extraction And Feature Selection Algorithms Based On Effective Distance

Posted on: 2018-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: D Zhang
Full Text: PDF
GTID: 2348330536987928
Subject: Computer Science and Technology
Abstract/Summary:
In machine learning and pattern recognition, feature extraction and feature selection are important approaches for dealing with high-dimensional data, and they have been widely used in information retrieval, text classification and disease diagnosis. Many feature extraction and feature selection algorithms use Euclidean distance to measure the similarity of samples. However, Euclidean distance ignores the influence of other samples and, because it is static, fails to capture the dynamic structure of the data. To reflect this underlying dynamic structure, in this thesis we measure the effective distance between samples by considering the relationship between the target sample and the other samples, which is based on the global topological structure. We then propose a set of feature extraction and feature selection algorithms that use effective distance-based similarity. The main contributions of this thesis are as follows.

On the one hand, we develop two ways to compute the effective distance between samples: k-nearest-neighbor-based effective distance and sparse representation-based effective distance. The computation of effective distance depends on the topological structure of the samples. First, we construct a bilateral network using either the sparse reconstruction relationship or the neighborhood relationship of the samples. Based on this bilateral network, we compute the effective distance between two samples. We then propose effective distance-based feature extraction methods that use the effective distance-based similarity matrix. Experimental results show that effective distance-based feature extraction algorithms preserve the structure of the samples and achieve better classification performance than conventional methods using Euclidean distance.

On the other hand, we first obtain the sparse reconstruction relationship of the samples through sparse representation and use it to construct the global topological structure. The effective distance between samples is then measured on this topological structure. During feature selection, the similarity based on effective distance is used to evaluate the importance of features. In addition, we use an iterative procedure to approach the optimal feature subset gradually. As a result, we develop modified iterative feature selection algorithms based on effective distance. Experiments on a series of UCI data sets indicate that our effective distance-based feature selection methods select better features and improve classification performance.
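
The k-nearest-neighbor route described above can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption: the function name `knn_effective_distance` is hypothetical, the "bilateral network" is read as a symmetrized kNN graph, and the effective distance of a linked pair is taken in the commonly used form D[i, j] = 1 - log P[i, j], where P[i, j] is the share of sample i's links that point to sample j. Only directly linked pairs are handled; this is not the thesis's implementation.

```python
import numpy as np

def knn_effective_distance(X, k=3):
    """Sketch of a kNN-based effective distance (assumed form: 1 - log P)."""
    n = X.shape[0]

    # Pairwise Euclidean distances, used only to find the neighbours.
    diff = X[:, None, :] - X[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(euclid, np.inf)  # a sample is not its own neighbour

    # Directed kNN links: A[i, j] = 1 if j is one of the k nearest to i.
    A = np.zeros((n, n))
    nearest = np.argsort(euclid, axis=1)[:, :k]
    A[np.arange(n)[:, None], nearest] = 1.0

    # Assumption: "bilateral" means a link in either direction counts both ways.
    A = np.maximum(A, A.T)

    # Row-normalise: P[i, j] is the share of i's links that go to j.
    P = A / A.sum(axis=1, keepdims=True)

    # Effective distance for linked pairs; unlinked pairs stay at infinity.
    D = np.full((n, n), np.inf)
    linked = P > 0
    D[linked] = 1.0 - np.log(P[linked])
    np.fill_diagonal(D, 0.0)
    return D

# Four samples on a line; the last one is far from the rest.
X = np.array([[0.0], [1.0], [3.0], [10.0]])
D = knn_effective_distance(X, k=1)
```

Unlike Euclidean distance, this quantity is asymmetric: in the example, `D[0, 1]` differs from `D[1, 0]` because sample 1 has more links than sample 0, which is one way the other samples can influence a pairwise distance. A similarity matrix could then be derived from `D`, for instance as `np.exp(-D)`, though that choice is also an assumption.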
Keywords/Search Tags: feature extraction, feature selection, dynamic structure, topological structure, effective distance, similarity, classification