
Sparse Learning And Its Applications In Data Mining

Posted on: 2017-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: D B Cheng
Full Text: PDF
GTID: 2348330488975449
Subject: Computer application technology
Abstract/Summary:
Data mining must often handle data containing many noisy samples and high-dimensional attributes. Sparse learning captures the relationships within such data well: it assigns large weights to relevant samples or attributes and small or zero weights to irrelevant ones. In this thesis, we therefore study and extend the sparse learning framework for data classification and for feature selection. The main contributions are as follows.

(1) We propose an improved Decision Tree k-Nearest Neighbor Classification algorithm based on sparse learning (DTkNNC). Because the kNNC algorithm is easy to implement and highly effective, it is widely used for data classification. However, kNNC has three drawbacks: (i) a suitable value of k is hard to obtain; (ii) a fixed k cannot guarantee good classification results; and (iii) many improved kNNC algorithms do not fully consider the global information of the data. In Chapter 3, the DTkNNC algorithm therefore integrates sparse learning and sample self-representation into a unified framework and combines it with decision tree techniques. DTkNNC uses sparse learning to resolve the fixed-k problem of kNN, uses sample self-representation to take the global information of the data into account, and uses a decision tree to speed up the method, keeping its time complexity low. In experiments on real datasets, DTkNNC outperforms the common ADkNN, LMNN, and kNNC algorithms. To a certain extent, the proposed sparse-learning objective function both enriches the existing sparse model framework and extends its scope of application to data classification.
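The abstract gives no pseudocode, but the core idea behind DTkNNC, replacing the fixed k with an adaptively sized neighborhood chosen by sparse reconstruction, can be sketched as follows. This is a minimal illustration assuming an L1 (Lasso) reconstruction; the function name sparse_knn_predict and the hyperparameter alpha are ours, not from the thesis, and the sketch omits the decision tree acceleration and the full self-representation objective.

```python
# Minimal sketch (not the thesis's exact DTkNNC formulation): reconstruct a
# test sample sparsely from the training samples, then classify by a weighted
# vote over the samples that receive nonzero coefficients, so the neighborhood
# size is chosen adaptively by the sparsity pattern instead of a fixed k.
import numpy as np
from sklearn.linear_model import Lasso

def sparse_knn_predict(X_train, y_train, x_test, alpha=0.05):
    """Classify x_test via sparse self-representation over X_train.

    alpha controls sparsity; it is an illustrative value, not one
    reported in the thesis.
    """
    # Solve min_w ||x_test - X_train^T w||^2 + alpha * ||w||_1
    lasso = Lasso(alpha=alpha, positive=True, fit_intercept=False,
                  max_iter=10000)
    lasso.fit(X_train.T, x_test)       # columns of X_train.T are training samples
    w = lasso.coef_                    # sparse reconstruction weights
    support = np.flatnonzero(w)        # the "adaptive neighbors"
    if support.size == 0:              # nothing selected: fall back to 1-NN
        return y_train[np.argmin(np.linalg.norm(X_train - x_test, axis=1))]
    # Weighted vote: each selected neighbor votes with its coefficient
    classes = np.unique(y_train[support])
    scores = [w[support][y_train[support] == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]
```

Under this scheme the number of "neighbors" varies per test sample with the support of w, which is precisely how sparse learning sidesteps the fixed-k problem described above.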
(2) We present a novel Graph Sparse learning algorithm for Feature Selection based on subspace learning (GS_FS). Feature selection is a very common data mining and machine learning technique for handling high-dimensional data. However, existing feature selection methods have the following defects: they either simply rank all features according to a single rule, or obtain the importance of features through sparse learning alone, without adequately considering the correlations between features. In Chapter 4, we propose to use two subspace learning algorithms, linear discriminant analysis (LDA) and locality preserving projection (LPP), to capture the global and local structure of the data, and to embed these subspace learning algorithms into a sparse-learning-based feature selection framework. This method not only resolves the above problem; in experiments on real datasets its performance is also better than the NFS, PCA, LDA, LPP, LE, and L21 methods. To a certain extent, the proposed sparse-learning objective function again enriches the existing sparse model framework and extends its scope of application, here to feature selection on high-dimensional data.
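The baseline L2,1-norm feature selection that GS_FS builds on can likewise be sketched briefly. This is a minimal sketch, assuming one-hot label targets and using scikit-learn's MultiTaskLasso, whose row-wise L2,1 penalty matches the "L21 method" the thesis compares against; it does not include the LDA/LPP subspace terms that distinguish GS_FS, and the function name and alpha value are illustrative.

```python
# Minimal sketch of L2,1-norm sparse feature selection: learn a weight
# matrix W with a row-sparsity (L2,1) penalty, then rank features by the
# L2 norms of the rows of W. MultiTaskLasso applies exactly this penalty.
import numpy as np
from sklearn.linear_model import MultiTaskLasso
from sklearn.preprocessing import LabelBinarizer

def l21_feature_ranking(X, y, alpha=0.1):
    """Return feature indices ranked by importance under L2,1 sparse learning.

    alpha is an illustrative regularization strength, not a thesis value.
    """
    Y = LabelBinarizer().fit_transform(y)   # (n_samples, n_classes) one-hot
    if Y.shape[1] == 1:                     # binary case: expand to two columns
        Y = np.hstack([1 - Y, Y])
    model = MultiTaskLasso(alpha=alpha, max_iter=10000).fit(X, Y)
    W = model.coef_.T                       # (n_features, n_classes)
    scores = np.linalg.norm(W, axis=1)      # row norms = feature importance
    return np.argsort(scores)[::-1]         # most important features first
```

Because the L2,1 penalty zeroes out entire rows of W, a feature is either kept or discarded across all classes at once, which is the row-sparsity property GS_FS then augments with global (LDA) and local (LPP) structure terms.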
In summary, this thesis focuses on two problems in data mining: the difficulty of choosing k in the kNNC algorithm, and feature selection on high-dimensional data. We apply sparse learning theory and methods to address the drawbacks in these two areas, yielding two effective data mining algorithms. Each algorithm is verified on public real-world datasets, and under every evaluation metric the proposed algorithms are superior to the classical algorithms.

Keywords/Search Tags: data mining, sparse learning, sample self-representation, decision tree, subspace learning, feature selection