
A Research Of Clustering Method Based On Incremental Learning

Posted on: 2022-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: M S Yang
Full Text: PDF
GTID: 2518306764471974
Subject: Automation Technology
Abstract/Summary:
As artificial intelligence is deployed in more and more scenarios, deep learning networks trained on big data have quickly occupied the market and driven industry applications. Most of the massive data samples downloaded from the Internet and used for deep learning are unlabeled, and labeling them manually is extremely time-consuming, labor-intensive and expensive. As a key technology of unsupervised learning, clustering based on data self-expression can group such samples without manual labels, and it has been widely used in fields such as product recommendation, anomaly detection and text analysis.

At present, most studies look for the optimal clustering of static datasets, and few address the clustering of dynamic datasets under incremental conditions. Real-world data, however, keeps growing, and a clustering model that must repeatedly merge all the data and retrain can no longer meet practical constraints such as limited storage resources and frequent database changes. The study of dynamic datasets is therefore both necessary and extremely challenging.

To solve the clustering problem for dynamic datasets, we start from the relevant practical application scenarios and study two ways in which data can grow: in batches and one sample at a time. The purpose is to build two clustering frameworks for these different incremental scenarios and to improve the accuracy of the clustering results. The work of the thesis is summarized as follows:

1. First, we introduce the types of clustering and the basic knowledge of the algorithms, focusing on the principles and technical development of methods based on data blocks and data streams. We then analyze the shortcomings of these algorithms in performance and efficiency.

2. Second, for the case where the number of clusters is known, we propose a deep adaptive batch incremental clustering algorithm based on internal information mining. The method recasts the clustering problem as a binary classification problem by mining several kinds of internal information, both between different samples and between a sample before and after data augmentation, and it combines representation learning and clustering in an end-to-end deep neural network. Multiple sets of experiments show that the proposed algorithm can dynamically adjust the cluster centers and greatly improve clustering accuracy as data is gradually added in batches.

3. Finally, for the case where the number of clusters is unknown and the initial data volume is zero, we propose a real-time clustering method based on single-sample increments. Without initial data, a deep network cannot be trained to extract a good deep representation. To address this, we introduce transfer learning: a deep network model trained on a cleanly annotated source domain, one that is related to the target dataset and has relatively complete labels, is transferred and used as the feature library of the target domain. On this basis, an improved Single-Pass algorithm completes the real-time clustering of the data stream.
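
The single-sample scenario in item 3 can be illustrated with a minimal sketch of the basic Single-Pass idea: each incoming feature vector is compared with the existing cluster centers, joins the most similar one if the similarity reaches a threshold, and otherwise starts a new cluster. This is only an illustration under stated assumptions, not the improved Single-Pass algorithm of the thesis, whose details are not given in the abstract. The class name `SinglePassClusterer`, the cosine similarity measure, the running-mean center update and the threshold value 0.8 are all assumptions, and the short 2-D vectors stand in for deep features that a transferred network would produce.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SinglePassClusterer:
    """Hypothetical basic Single-Pass clusterer; the number of clusters is not fixed."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed similarity cut-off for joining a cluster
        self.centroids = []         # one running-mean center per cluster
        self.counts = []            # number of samples assigned to each cluster

    def partial_fit(self, x):
        """Assign one sample, update the centers, and return its cluster index."""
        best, best_sim = -1, -1.0
        for i, center in enumerate(self.centroids):
            sim = cosine(x, center)
            if sim > best_sim:
                best, best_sim = i, sim

        if best >= 0 and best_sim >= self.threshold:
            # Similar enough: fold the sample into the running mean of that cluster.
            n = self.counts[best]
            self.centroids[best] = [
                (c * n + v) / (n + 1) for c, v in zip(self.centroids[best], x)
            ]
            self.counts[best] = n + 1
            return best

        # No sufficiently similar cluster (or no clusters yet): open a new one.
        self.centroids.append(list(x))
        self.counts.append(1)
        return len(self.centroids) - 1


# Toy stream; in the thesis setting each vector would be a deep feature.
stream = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]]
model = SinglePassClusterer(threshold=0.8)
labels = [model.partial_fit(x) for x in stream]
print(labels)  # [0, 0, 1, 1, 0]
```

The clusterer starts with no data and no preset number of clusters, matching the scenario described in item 3. Each sample is seen once and then discarded, so memory grows with the number of clusters rather than with the length of the stream.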
Keywords/Search Tags:Incremental Clustering, Real-time Clustering, Transfer Learning, Convolutional Neural Networks