
Novelty Detection Based On Robust Sparse Coding And Stacked Robust Sparse Autoencoder

Posted on: 2021-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Liu
Full Text: PDF
GTID: 2428330620970563
Subject: Software engineering
Abstract/Summary:
Novelty detection is a research focus in the fields of machine learning and pattern recognition: it aims to identify novel data in the test set that were absent during the training phase. In real-world applications, novel data are usually absent, extremely rare, or not well defined, so one-class classifiers are well suited to novelty detection problems. However, as in two-class and multi-class classification, one-class classifiers must overcome the "curse of dimensionality": as the number of features increases, the number of samples required for a classifier to retain the same generalization performance grows exponentially. One feasible strategy for dealing with this problem is to map high-dimensional samples into a low-dimensional subspace, which is precisely feature extraction in machine learning. The effectiveness of feature extraction is therefore the key to applying novelty detection approaches to high-dimensional samples.

In recent years, sparse coding and the sparse autoencoder have become two popular feature extraction methods and have attracted considerable research attention. Sparse coding can effectively reduce redundancy within the feature set, while the sparse autoencoder can extract abstract features from the given samples. Both methods can improve the classification performance of traditional novelty detection methods and mitigate the curse of dimensionality. In this thesis, conventional sparse coding and the conventional sparse autoencoder are both modified to make them suitable for novelty detection. The main contributions of this thesis include the following two aspects.

1. Robust sparse coding based on correntropy and a logarithmic penalty function is proposed. Conventional sparse coding is only suited to Gaussian noise; when the noise in the training set follows a non-Gaussian distribution, sparse coding cannot obtain accurate coefficient vectors. To make sparse coding robust to non-Gaussian noise while enhancing the sparseness of the coefficient vectors, correntropy is used in place of the squared reconstruction-error term, and a logarithmic penalty function replaces the l1-norm. The resulting coefficient vectors are then used as inputs to the novelty detection method. In addition, a generalization error bound for the proposed robust sparse coding is derived, and the effectiveness of the proposed method is validated on the UCI benchmark data sets.

2. A robust stacked sparse autoencoder based on the Transformed-l1 penalty function and the l2,1-norm is proposed. The traditional sparse autoencoder uses KL divergence as its regularization term, but its sparsity parameter must be set manually. To avoid the uncertainty caused by manual parameter setting, a composite regularization term based on the Transformed-l1 penalty function and the l2,1-norm is introduced to replace the KL divergence in the conventional sparse autoencoder. The Transformed-l1 penalty function can eliminate unnecessary connections between neurons in the autoencoder, while the l2,1-norm can remove redundant neurons; the number of model parameters is thus effectively reduced and training efficiency greatly improved. However, for an autoencoder with only one hidden layer, the low-dimensional features obtained after feature extraction have poor representation ability. Therefore, based on the proposed robust sparse autoencoder, a stacked robust sparse autoencoder is constructed. To fully exploit the features at different levels produced by the proposed stacked robust sparse autoencoder, the idea of ensemble learning is introduced: the features at each level are used to train several one-class classifiers, and the output for a given sample is determined by majority voting. Finally, the performance of the proposed method is validated on the MNIST handwritten digit database and the UCI benchmark data sets.
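As a rough illustration of the first contribution, the sketch below encodes a sample over a dictionary by minimizing a correntropy-induced reconstruction loss plus a logarithmic sparsity penalty via plain gradient descent. The function name, the optimizer, and all hyperparameter values are illustrative assumptions, not the thesis's actual algorithm (which derives a generalization bound and likely uses a more principled solver such as half-quadratic optimization).

```python
import numpy as np

def robust_sparse_code(x, D, sigma=1.0, lam=0.1, eps=0.1, lr=0.01, n_iter=500):
    """Illustrative robust sparse coding (hypothetical implementation).

    Minimizes  sum_i (1 - exp(-r_i^2 / (2 sigma^2)))          # correntropy-induced loss
             + lam * sum_j log(1 + |alpha_j| / eps)           # logarithmic penalty
    where r = x - D @ alpha, by simple (sub)gradient descent.
    """
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        r = x - D @ alpha
        # Gaussian weights: large residuals (outliers / non-Gaussian noise)
        # get exponentially small weight, which is the source of robustness.
        w = np.exp(-r**2 / (2 * sigma**2))
        grad_fit = -(D.T @ (w * r)) / sigma**2
        # Subgradient of the log penalty; its magnitude shrinks as |alpha| grows,
        # so large coefficients are penalized less than under the l1-norm.
        grad_pen = lam * np.sign(alpha) / (eps + np.abs(alpha))
        alpha -= lr * (grad_fit + grad_pen)
    return alpha
```

The key difference from standard sparse coding is the weight vector `w`: samples with large residuals are downweighted rather than dominating the least-squares fit.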
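The composite regularizer of the second contribution can be sketched directly: the Transformed-l1 penalty acts elementwise on a weight matrix (pruning individual connections), while the l2,1-norm acts on whole rows (pruning entire hidden neurons). The function names and the weight coefficients are illustrative assumptions; the thesis's exact penalty parameterization may differ.

```python
import numpy as np

def tl1(W, a=1.0):
    """Transformed-l1 penalty: sum over entries of (a+1)|w| / (a + |w|).

    Each term is 0 at w = 0 and saturates at (a+1) as |w| grows, so it
    approximates the l0-norm more closely than the l1-norm does.
    """
    absW = np.abs(W)
    return np.sum((a + 1) * absW / (a + absW))

def l21(W):
    """l2,1-norm: sum of the l2 norms of the rows (one row per hidden neuron).

    Driving a whole row to zero removes that neuron's outgoing connections,
    i.e., prunes the neuron itself.
    """
    return np.sum(np.linalg.norm(W, axis=1))

def composite_penalty(W, beta1=1e-3, beta2=1e-3):
    """Composite regularizer replacing the KL-divergence sparsity term."""
    return beta1 * tl1(W) + beta2 * l21(W)
```

Unlike the KL-divergence term, this penalty has no target-sparsity parameter to tune by hand; sparsity emerges from the two weights `beta1` and `beta2`.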
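The ensemble step can be illustrated as follows: one one-class classifier is trained per feature level from the stacked autoencoder, and test samples are labeled novel by majority vote. The toy centroid-distance classifier below is a hypothetical stand-in for whatever one-class classifiers the thesis actually uses; the quantile threshold `q` is likewise an assumption.

```python
import numpy as np

class CentroidOCC:
    """Toy one-class classifier (stand-in): a sample is novel if it lies
    farther from the training centroid than the q-quantile of the
    training distances."""

    def fit(self, X, q=0.95):
        self.c = X.mean(axis=0)
        d = np.linalg.norm(X - self.c, axis=1)
        self.thr = np.quantile(d, q)
        return self

    def predict(self, X):
        d = np.linalg.norm(X - self.c, axis=1)
        return (d > self.thr).astype(int)   # 1 = novel, 0 = normal

def majority_vote(feature_sets_train, feature_sets_test):
    """Train one classifier per feature level and combine by majority vote."""
    votes = np.stack([
        CentroidOCC().fit(Ftr).predict(Fte)
        for Ftr, Fte in zip(feature_sets_train, feature_sets_test)
    ])
    # A test sample is declared novel when a strict majority of levels agree.
    return (2 * votes.sum(axis=0) > votes.shape[0]).astype(int)
```

Each entry of `feature_sets_train` / `feature_sets_test` would hold the features extracted at one level of the stacked autoencoder for the training and test sets, respectively.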
Keywords/Search Tags: Novelty detection, Sparse coding, Correntropy, Non-convex regularization term, Sparse autoencoder