Research On Clustering Methods Based On Unsupervised Deep Feature Learning

Posted on: 2022-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: J C Zhao
GTID: 2518306512961949
Subject: Master of Engineering
Abstract/Summary:
With the arrival of the information age, unlabeled data such as images, speech, and video are growing exponentially, carry a huge amount of information, and have become a common data type in daily life and scientific research. Manual labeling is time-consuming and expensive, so cluster analysis of image data is an important task in data mining and image processing. Unsupervised and self-supervised learning, combined with deep networks, feature representation, optimization strategies, and other techniques that have developed continuously in recent years, can better extract the regularities contained in data and classify the data. This thesis therefore focuses on clustering algorithms based on unsupervised deep visual feature learning. The main research contents are the following three points:

1. To improve the feature representation ability of the autoencoder and capture global features of high-dimensional, low-quality images, a deep clustering method for learning visual features (ASCAE) is proposed. An asymmetric convolutional autoencoder learns the feature representation, and the K-Means algorithm then clusters the learned features. The ASCAE network also includes two improvements: i) a variable-step convolutional layer and ii) the reconstruction error of the fully-connected layer, which is used to constrain the reconstruction error of the network. These further improve the feature representation ability of the network. Experimental results show that the ASCAE network has better feature representation capability, which allows K-Means to produce better clustering results. The clustering accuracy on the MNIST dataset is 0.918, which is 7 percentage points higher than that of the advanced deep clustering method DEC.

2. To further improve the feature representation capability of the clustering network and make the visual features suitable for several image tasks, a deep Softmax clustering method based on a novel convolutional autoencoder (ASCAE-Softmax) is proposed, in which feature learning and cluster discrimination are performed jointly. The asymmetric convolutional autoencoder network (ASCAE) extracts features, Softmax maps the features to a probability distribution from which an auxiliary probability distribution is constructed, and the KL divergence loss is minimized iteratively to obtain a clear cluster partition. Experimental results show that the method learns features that make the data within a cluster more compact while keeping different clusters far apart. ASCAE-Softmax achieves a clustering accuracy of 0.960 on the MNIST dataset, which is better than state-of-the-art deep clustering algorithms, and 0.755 on the COIL-20 dataset, which is 3% higher than that of the DEC and DBC methods.

3. Inspired by the good performance of unsupervised learning based on mutual information, a self-supervised clustering method based on bi-mutual information maximization (bi-MIM-SSC) is proposed. First, the mutual information between an input image and its hidden feature is maximized, which transfers the high-frequency information of the original image to the feature representation in the latent space. This information is then passed on to the final features by maximizing the mutual information between output feature pairs, so that they carry more semantic information. To further improve the disentangling ability of the feature layers, the network is pretrained and an auxiliary over-clustering layer is added. Experimental results show that the proposed bi-MIM-SSC algorithm provides better features and makes the output features approach class-feature vectors, which yields better clustering results than other deep clustering algorithms. The bi-MIM-SSC method achieves a clustering accuracy of 0.719 on the CIFAR10 dataset, which is 12% higher than the strongest competitor, the IIC method. It is also 10% higher on the CIFAR100-20 dataset (0.356 vs. 0.257).
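The Softmax clustering step in point 2 can be illustrated with a small sketch. The abstract does not say how the auxiliary distribution is built, so the code below assumes the sharpening rule used by DEC (square each soft assignment, divide by the soft cluster frequency, renormalize). The scores in `logits` and all function names are hypothetical and are not taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one row of cluster scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def target_distribution(q):
    """Assumed DEC-style auxiliary targets: p_ij proportional to q_ij^2 / f_j,
    where f_j = sum_i q_ij is the soft frequency of cluster j."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q, eps=1e-12):
    """Mean KL(P || Q) over samples; eps guards against log(0)."""
    total = 0.0
    for p_row, q_row in zip(p, q):
        total += sum(
            pj * math.log((pj + eps) / (qj + eps))
            for pj, qj in zip(p_row, q_row)
        )
    return total / len(p)


# Hypothetical feature-to-cluster scores: four samples, two clusters.
logits = [[2.0, 0.5], [1.5, 0.2], [0.1, 1.0], [0.3, 2.2]]

q = [softmax(row) for row in logits]   # soft assignments from Softmax
p = target_distribution(q)             # sharpened auxiliary targets
loss = kl_divergence(p, q)             # quantity a training loop would minimize
```

In a real training loop the targets `p` would be held fixed for some iterations while the network producing `q` is updated to reduce `loss`, then recomputed. For these example scores, each sample's dominant cluster probability is larger in `p` than in `q`, which is consistent with the more compact clusters the abstract describes.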
Keywords/Search Tags: Unsupervised learning, Clustering analysis, Feature representation, Convolutional encoder, Mutual information maximization