
A Study On Image Clustering Algorithms With Deep Neural Networks

Posted on: 2021-06-24    Degree: Doctor    Type: Dissertation
Country: China    Candidate: X F Guo
GTID: 1488306548491784    Subject: Computer Science and Technology
Abstract/Summary:
Data clustering is one of the most fundamental and important tools for data analysis. Clustering massive data quickly and intelligently is of great significance for organizing, summarizing, and storing data. With the development of big data and artificial intelligence, traditional clustering algorithms fail to meet practical requirements, which has made clustering algorithms based on deep neural networks, i.e., deep clustering, a new and active research direction. However, existing deep clustering algorithms suffer from degenerate solutions in their objective functions, poor generalization, unstable training, and the weak representation ability of unsupervised neural networks. How to design deep clustering algorithms with strong representation ability, generalization, and stability has become one of the important open problems in artificial intelligence. This thesis is devoted to solving these problems from the perspectives of objective function design, data augmentation, self-paced learning, equivariant features, and self-supervised learning, in order to improve the quality of the learned features and the final clustering performance. The main contributions are summarized as follows:

1. Based on the local structure preservation ability of under-complete autoencoders, we propose a deep convolutional embedded clustering algorithm with local structure preservation, which solves the degenerate solution problem. Many deep clustering algorithms update the parameters of both the neural network and the clustering model using a single clustering loss, which admits a degenerate solution: all samples are mapped by the neural network to a single point in the feature space, and such a solution minimizes the loss function. The resulting features are meaningless and the clustering result is useless. To solve this problem, we propose to preserve the local structure of the data in the feature space with an under-complete autoencoder. We train the deep clustering model with the reconstruction loss of the autoencoder and a KL divergence clustering loss, where the former preserves the local structure and the latter encourages clustering-oriented features. The clustering performance keeps improving when the model is trained with these two losses jointly. Furthermore, we design a convolutional autoencoder whose layers close to the data are convolutional and whose embedding layer is fully connected; it effectively extracts features from image data while preserving the local structure. Experiments on image and text datasets validate the importance of local structure preservation and the effectiveness of the proposed deep embedded clustering algorithm. (A minimal code sketch of this joint objective is given after contribution 2 below.)

2. We propose a deep embedded clustering framework based on data augmentation, introducing the data augmentation technique into unsupervised learning to address the poor performance of deep clustering on small-scale datasets. Deep neural networks require a large number of samples to learn meaningful and discriminative features; when only a few samples are available, they may fail to learn good features. Deep clustering algorithms cannot bypass this problem since they use deep neural networks as their representation learning models. To solve it, we introduce data augmentation into the unsupervised deep clustering task; to the best of our knowledge, this is the first work to do so. We first summarize existing deep clustering algorithms and propose the deep embedded clustering framework, which consists of an autoencoder and a clustering model attached to the embedding layer of the autoencoder. The framework has two stages: pretraining the autoencoder and finetuning with the joint losses. We give ways of incorporating data augmentation into both stages, and analyze their reasonableness from the perspectives of manifold learning and supervised learning. Five specific deep clustering algorithms are instantiated from the framework of deep embedded clustering with data augmentation. Extensive experiments validate the importance of data augmentation for unsupervised deep clustering, and the five algorithms achieve state-of-the-art clustering performance on four image datasets.
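For concreteness, the sketch below illustrates the kind of joint objective described in contribution 1: an autoencoder reconstruction loss combined with a KL divergence clustering loss computed from soft assignments to learnable cluster centers (a DEC-style target distribution). The helper names, the Student's t soft-assignment kernel, and the weight `gamma` are illustrative assumptions rather than the exact configuration used in the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddedClusteringLoss(nn.Module):
    """Soft assignment to learnable cluster centers plus a DEC-style target distribution."""
    def __init__(self, n_clusters, embed_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_clusters, embed_dim))

    def soft_assign(self, z):
        # q_ij proportional to (1 + ||z_i - mu_j||^2)^-1, normalized over clusters
        dist2 = torch.cdist(z, self.centers).pow(2)
        q = 1.0 / (1.0 + dist2)
        return q / q.sum(dim=1, keepdim=True)

    @staticmethod
    def target_distribution(q):
        # Sharpened targets p_ij = q_ij^2 / f_j, renormalized per sample
        p = q.pow(2) / q.sum(dim=0)
        return p / p.sum(dim=1, keepdim=True)

def joint_loss(encoder, decoder, cluster, x, gamma=0.1):
    """Reconstruction loss preserves local structure; KL loss encourages
    clustering-oriented features. The weight gamma is an assumed value."""
    z = encoder(x)
    recon = F.mse_loss(decoder(z), x)
    q = cluster.soft_assign(z)
    p = cluster.target_distribution(q).detach()  # targets treated as constants
    kl = F.kl_div(q.log(), p, reduction='batchmean')
    return recon + gamma * kl
```

In line with the two-stage framework described in contribution 2, the reconstruction term alone would be used during pretraining and the full joint loss during finetuning.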
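Building on the helper above, the next sketch shows one plausible way to incorporate random augmentation into both stages of the framework in contribution 2: augmented samples are fed to the network, while, as an assumption made here for illustration, the clustering targets are computed from the clean samples. The specific `torchvision` augmentation and its parameters are also assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision import transforms

# Assumed augmentation: small random shifts and rotations (one draw per batch for simplicity).
augment = transforms.RandomAffine(degrees=10, translate=(0.1, 0.1))

def pretrain_step(encoder, decoder, x, optimizer):
    """Stage 1: train the autoencoder on augmented samples (reconstruction only)."""
    x_aug = augment(x)
    loss = F.mse_loss(decoder(encoder(x_aug)), x_aug)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def finetune_step(encoder, decoder, cluster, x, optimizer, gamma=0.1):
    """Stage 2: joint loss on augmented samples; targets from clean samples (assumption)."""
    with torch.no_grad():
        p = cluster.target_distribution(cluster.soft_assign(encoder(x)))
    x_aug = augment(x)
    z = encoder(x_aug)
    loss = F.mse_loss(decoder(z), x_aug) + gamma * F.kl_div(
        cluster.soft_assign(z).log(), p, reduction='batchmean')
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```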
3. We propose a deep clustering algorithm based on adaptive self-paced learning and data augmentation to eliminate the influence of marginal, noisy samples on the stability of model training. Deep clustering algorithms commonly alternate between feature learning and clustering. However, the clustering result cannot be 100% accurate; when the inaccurate result is used as a supervisory signal to tune the feature learning model, the neural network can be misguided, and the misguided features in turn harm the clustering performance, leading to unstable training. To solve this problem, we select only confident samples to train the neural network via self-paced learning, so that unreliable samples never influence the training because they are never chosen. We further design an adaptive self-paced learning algorithm in which the hyper-parameters are replaced by statistics of the data computed during training. In addition, we design a clustering loss without a degenerate solution by fixing the cluster centers of the k-means loss function. Finally, data augmentation is again used to improve the generalization of the model and the robustness of the learned features. Extensive experiments validate the effectiveness of the proposed method, and an ablation study shows the performance improvements contributed by data augmentation and adaptive self-paced learning.
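A minimal sketch of the selection rule described in contribution 3: per-sample losses are computed against fixed cluster centers (removing the degenerate solution of the k-means objective), and only samples whose loss falls below a threshold derived from the data are used to update the network. Using the mean of the current batch losses as the adaptive threshold is an illustrative assumption; the thesis states only that the hyper-parameters are replaced by statistics of the data.

```python
import torch

def self_paced_step(encoder, centers, assignments, x, optimizer):
    """One training step with adaptive self-paced sample selection.

    centers:      fixed cluster centers (not updated by this loss), shape (K, d)
    assignments:  current cluster index of each sample in the batch, shape (N,)
    """
    z = encoder(x)
    # Per-sample clustering loss: squared distance to the assigned, fixed center.
    per_sample = (z - centers[assignments]).pow(2).sum(dim=1)

    # Adaptive threshold from the statistics of the current losses (assumed: the mean).
    lam = per_sample.detach().mean()
    selected = per_sample.detach() <= lam   # keep only the confident samples

    loss = per_sample[selected].mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), int(selected.sum())
```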
4. We propose a clustering algorithm based on an affine equivariant autoencoder to study how discriminative equivariant features are. Deep clustering relies on existing unsupervised deep neural networks to learn and extract features from data, but existing unsupervised neural networks cannot satisfactorily handle image data and all focus on learning transformation-invariant features. We find that equivariant features can better represent the intrinsic properties of data: to preserve equivariance, a neural network has to learn the transformation patterns in the data and understand its content. Therefore, equivariant features can carry more information and be more discriminative than invariant features, and are expected to lead to better clustering performance. We propose an affine equivariant autoencoder that learns, in an unsupervised manner, features that are equivariant to affine transformations. The proposed objective is composed of the reconstruction of the original samples, the reconstruction of the affine-transformed samples, and the approximation of the affine transformation function. The two reconstruction losses ensure that the encoder is a valid feature extractor for the original samples, and the approximation term encourages the encoder to produce equivariant features. We carefully design experiments to validate the equivariance qualitatively and quantitatively: the qualitative validation is achieved by observing the reconstructed images after adding noise to the embedded features, and the quantitative validation follows directly from the definition of equivariance. Experiments show that the proposed model learns equivariant and discriminative features, and performing spectral clustering on the learned features yields strong clustering performance. (A code sketch of this objective is given after contribution 5 below.)

5. To tackle the ineffectiveness of unsupervised deep neural networks on complex natural images, we propose a self-supervised representation learning algorithm based on the image translation transformation. Self-supervised learning constructs supervisory signals by predicting the type of transformation applied to the data; it has been shown in various applications that this kind of supervision can train more complex networks and learn more discriminative features from complex image data. However, existing geometry-based self-supervised learning algorithms suffer from the artifacts generated during transformation. To solve this problem, we design a new self-supervised task that predicts the number of translated pixels, and for each translation direction we apply the same mask to eliminate the effect of the artifacts. Experiments show that the proposed method learns discriminative features from complex image datasets, and the proposed model can serve as the feature learning model for deep clustering algorithms to improve their performance on natural images.
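The sketch below gives a minimal reading of the three-term objective described in contribution 4: the original image and its affine-transformed version are both reconstructed, and a small network conditioned on the affine parameters is trained to map the embedding of the original image to the embedding of the transformed image, which is the term that enforces equivariance. The form of the transformation approximator, the fixed affine parameters, and the use of `torchvision`'s functional affine transform are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms.functional as TF

class AffineApproximator(nn.Module):
    """Maps (embedding of x, affine parameters) -> predicted embedding of t(x)."""
    def __init__(self, embed_dim, n_params=3):   # assumed parameters: angle, shift-x, shift-y
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + n_params, 256), nn.ReLU(),
            nn.Linear(256, embed_dim))

    def forward(self, z, params):
        return self.net(torch.cat([z, params], dim=1))

def equivariant_ae_loss(encoder, decoder, approx, x):
    # Apply an affine transformation to the batch (fixed here for clarity;
    # in training it would be drawn at random).
    angle, dx, dy = 15.0, 2, -2
    x_t = TF.affine(x, angle=angle, translate=[dx, dy], scale=1.0, shear=[0.0])
    params = torch.tensor([[angle, dx, dy]], dtype=torch.float32,
                          device=x.device).expand(x.size(0), -1)

    z, z_t = encoder(x), encoder(x_t)
    loss_rec  = F.mse_loss(decoder(z), x)             # reconstruct original samples
    loss_rect = F.mse_loss(decoder(z_t), x_t)         # reconstruct transformed samples
    loss_eqv  = F.mse_loss(approx(z, params), z_t)    # approximate the transformation
    return loss_rec + loss_rect + loss_eqv
```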
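Finally, a sketch of the self-supervised pretext task in contribution 5: each image is shifted by some number of pixels along a fixed direction, the network predicts how many pixels it was shifted, and a mask shared by all images shifted along that direction hides the border region where shifting creates artifacts. The shift magnitudes, the mask construction, and the use of `torch.roll` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

SHIFTS = [0, 2, 4, 6, 8]        # candidate numbers of translated pixels (assumed)

def make_batch(x, direction=-1):
    """Shift every image horizontally by a random amount; label = index of the shift.

    direction=-1 means the last (width) dimension; the same mask is shared by all
    samples shifted along this direction, hiding the wrap-around border artifacts.
    """
    labels = torch.randint(len(SHIFTS), (x.size(0),), device=x.device)
    shifted = torch.stack([torch.roll(img, SHIFTS[k], dims=direction)
                           for img, k in zip(x, labels)])
    # Shared mask: zero out the widest possible artifact band on the left edge.
    mask = torch.ones_like(shifted)
    mask[..., :max(SHIFTS)] = 0.0
    return shifted * mask, labels

def pretext_step(classifier, x, optimizer):
    """Train the network to predict the number of translated pixels."""
    inputs, labels = make_batch(x)
    loss = F.cross_entropy(classifier(inputs), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```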
Keywords/Search Tags:Deep Clustering, Deep Neural Network, Local Structure Preservation, Data Augmentation, Self-paced Learning