
Research on Dimensionality Reduction and Classification Based on L1-Norm Maximization

Posted on: 2013-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: J J Zhang
Full Text: PDF
GTID: 2218330371464623
Subject: Computer application technology
Abstract/Summary:
In the information age of the 21st century, high-dimensional data have become increasingly common. Such data contain a great deal of redundant information, and often only a few of their features are actually of interest, so dimensionality reduction has attracted considerable research attention. When the data carry class labels, we must also ensure that the classification result is not degraded after the important features have been extracted. Because the acquisition environment is unpredictable, noise inevitably contaminates a dataset to some degree during collection, storage, and distribution. This noise greatly complicates dimensionality reduction, and the results obtained can deviate substantially from the truth. Although various preprocessing methods can remove some of the noise, robustness to noise remains an important criterion for evaluating a dimensionality-reduction or classification algorithm. Classical algorithms such as PCA and LPP measure the similarity of two points by the squared distance, which amplifies the effect of noise on the reduction. In contrast, algorithms based on the L1-norm use the absolute distance, and their noise resistance is markedly better. This thesis first studies the general merits of the L1-norm and then uses it to improve existing algorithms. Finally, a large-scale set of experiments demonstrates that the improved algorithms are competitive with the traditional ones in terms of noise resistance. The influence of noise must be considered not only in dimensionality reduction but also in fuzzy clustering. In fuzzy clustering, the distance between a point and each cluster center usually determines the point's category; however, the Euclidean distance commonly used for this purpose is sensitive to noise.
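The contrast between the squared distance and the absolute distance can be illustrated with a small experiment. The sketch below is a generic illustration, not the thesis's own algorithm: it compares the leading direction found by standard (L2) PCA with one found by a well-known greedy L1-norm-maximization update (iterating w ← Σᵢ sign(wᵀxᵢ) xᵢ, normalized), on data with a single gross outlier. The toy data, the robust median centering, and the initialization are all assumptions made for the demo.

```python
import numpy as np

def pca_l2_direction(X):
    # Leading right-singular vector of the centered data: this maximizes the
    # sum of SQUARED projections, so one large outlier can dominate it.
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[0]

def pca_l1_direction(X, n_iter=100):
    # Greedy L1-norm maximization: w <- sum_i sign(w . x_i) x_i, normalized.
    # Median centering and largest-norm initialization are choices for this demo.
    X = X - np.median(X, axis=0)
    w = X[np.argmax(np.linalg.norm(X, axis=1))]
    w = w / np.linalg.norm(w)
    for _ in range(n_iter):
        s = np.sign(X @ w)
        s[s == 0] = 1.0          # break ties so no point is dropped
        w_new = s @ X
        w_new = w_new / np.linalg.norm(w_new)
        if np.allclose(w_new, w):
            break
        w = w_new
    return w

rng = np.random.default_rng(0)
# Points spread along the x-axis with small y-noise...
X = np.column_stack([rng.normal(0, 3, 200), rng.normal(0, 0.3, 200)])
# ...plus one gross outlier far off on the y-axis.
X_noisy = np.vstack([X, [[0.0, 60.0]]])

true_dir = np.array([1.0, 0.0])
align = lambda w: abs(w @ true_dir)   # |cosine| between w and the true axis
print("L2 alignment:", align(pca_l2_direction(X_noisy)))
print("L1 alignment:", align(pca_l1_direction(X_noisy)))
```

Because the outlier contributes 60² = 3600 to the squared-distance objective but only 60 to the absolute-distance objective, the L2 direction is pulled toward the outlier while the L1 direction stays close to the true axis.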
Instead, we can choose several kinds of distance that are less sensitive to noise and combine them in suitable ways to replace the Euclidean distance. Clustering can then proceed even without knowledge of the dataset's structure, and the result is robust to noise. Noise resistance also matters in classification, especially in transfer learning. A dataset changes gradually over time, and when the changed dataset is reclassified, small changes that happen to fall near the optimal hyperplane can alter the classification result dramatically; such instability is clearly undesirable. This thesis therefore proposes the ESVM algorithm. By taking the probability distribution of the earlier dataset into account, ESVM obtains a good classification result while inheriting the earlier dataset's experience, and to a certain extent it demonstrates the same noise-resistant merits.
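One concrete way to make fuzzy clustering less noise-sensitive is to swap the Euclidean distance for the city-block (L1) distance. The sketch below is only one possible instance of this idea, not the thesis's exact distance composition: it runs fuzzy c-means with the standard membership update but measures distances in the L1 sense, so the cluster centers become coordinate-wise weighted medians (the minimizer of a weighted L1 cost). The data, initialization, and parameter choices are assumptions for the demo.

```python
import numpy as np

def fcm_l1(X, init_centers, m=2.0, n_iter=50):
    """Fuzzy c-means with the city-block (L1) distance.

    Memberships follow the standard FCM update; centers are coordinate-wise
    weighted medians, which minimize the weighted L1 cost and resist outliers.
    """
    centers = np.array(init_centers, dtype=float)
    c = len(centers)
    n, d = X.shape
    for _ in range(n_iter):
        # L1 distance from every point to every center, shape (n, c).
        dist = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        dist = np.maximum(dist, 1e-12)
        # Standard fuzzy membership update: u_ik = 1 / sum_j (d_ik/d_ij)^(2/(m-1)).
        ratio = (dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1.0))
        U = 1.0 / ratio.sum(axis=2)
        # Center update: weighted median of each coordinate, weights u_ik^m.
        W = U ** m
        for k in range(c):
            for j in range(d):
                order = np.argsort(X[:, j])
                cw = np.cumsum(W[order, k])
                centers[k, j] = X[order, j][np.searchsorted(cw, cw[-1] / 2.0)]
    return centers, U

rng = np.random.default_rng(1)
a = rng.normal([0.0, 0.0], 0.3, size=(50, 2))
b = rng.normal([5.0, 5.0], 0.3, size=(50, 2))
X = np.vstack([a, b, [[2.5, 40.0]]])   # two tight clusters plus one outlier
# Initialize with one point drawn from each cluster (an assumption of the demo).
centers, U = fcm_l1(X, init_centers=[X[0], X[60]])
print(np.round(centers, 1))            # one center near each tight cluster
```

With a Euclidean (squared-distance) center update, the outlier at (2.5, 40) would drag both means upward; the weighted-median update leaves the centers essentially at the two tight clusters.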
Keywords/Search Tags: dimensionality reduction, classification, clustering, transfer learning, anti-noise, L1-norm, absolute distance