Research And Application Of Incremental Clustering Algorithm Based On Auto-Encoder

Posted on:2017-01-27

Degree:Master

Type:Thesis

Country:China

Candidate:Z N Yang

Full Text:PDF

GTID:2348330488959961

Subject:Software engineering

Abstract/Summary:

With the development of sensing technologies and wireless communications, data are continuously generated and accumulate rapidly. Real-time processing of dynamic data and analysis of data availability have attracted widespread attention. How to cluster dynamic data sets incrementally, and how to impute incomplete data efficiently so as to improve the availability of data sets, have become hot topics in academic research. However, most existing incremental clustering algorithms do not learn the main features of the data sets and therefore cannot achieve good performance on high-dimensional data sets. In addition, most existing incomplete data imputation algorithms do not consider the local similarity between samples, so they cannot guarantee the accuracy of imputation. To address these problems, this thesis proposes an incremental clustering algorithm based on auto-encoder, which clusters dynamic data incrementally by learning the main features of the data sets. Building on this algorithm, the thesis then applies the idea of filling by the similarity of local data: incomplete data are filled with weighted values of the other, complete data in the same cluster. The specific work is as follows:

(1) An incremental clustering algorithm based on auto-encoder. First, the auto-encoder is used to learn the main features of the data sets and to obtain a representation of the raw data in a new feature space. The data set is then read only once, and incremental clustering is run on the new data sets on the basis of the original clustering results.

(2) An incomplete data imputation algorithm based on incremental clustering. First, the missing features of the incomplete data set are filled with special values to obtain an initial complete data set. Next, the incremental clustering algorithm based on auto-encoder is used to learn the main features of the data set and to cluster it quickly. In the last phase, top k% nearest-neighbor hybrid distance weighted imputation is used to fill in the missing values within each cluster.

Experimental results show that the proposed incremental clustering algorithm based on auto-encoder achieves good performance on dynamic data sets by adjusting the structure of the clusters dynamically. The proposed incomplete data imputation algorithm based on incremental clustering imputes missing features effectively and efficiently, with good time performance. Moreover, both algorithms are suitable for distributed data processing frameworks and have good scalability.
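
The last phase of algorithm (2), filling missing values from the nearest complete samples in the same cluster, can be illustrated with a minimal sketch. This is not the thesis's implementation. The abstract does not define its "hybrid distance" or its weighting scheme, so the sketch assumes plain Euclidean distance over the observed features and inverse-distance weights. The function name `impute_cluster` and the parameters `k_percent` and `eps` are hypothetical.

```python
import math


def impute_cluster(rows, k_percent=50.0, eps=1e-9):
    """Fill None entries in the rows of a single cluster.

    Assumptions (not taken from the thesis): distance is Euclidean over
    the features observed in the incomplete row, and each neighbor is
    weighted by the inverse of its distance.
    """
    # Only fully observed rows can serve as donors.
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("cluster has no complete rows to impute from")

    # Keep the top k% nearest complete rows, but always at least one.
    n_neighbors = max(1, math.ceil(len(complete) * k_percent / 100.0))

    filled = []
    for row in rows:
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            filled.append(list(row))
            continue

        observed = [j for j, v in enumerate(row) if v is not None]

        # Distance from the incomplete row to every complete row,
        # computed on the observed features only.
        scored = []
        for donor in complete:
            d = math.sqrt(sum((row[j] - donor[j]) ** 2 for j in observed))
            scored.append((d, donor))
        scored.sort(key=lambda pair: pair[0])
        nearest = scored[:n_neighbors]

        # Inverse-distance weights; eps avoids division by zero.
        weights = [1.0 / (d + eps) for d, _ in nearest]
        total = sum(weights)

        new_row = list(row)
        for j in missing:
            new_row[j] = sum(
                w * donor[j] for w, (_, donor) in zip(weights, nearest)
            ) / total
        filled.append(new_row)
    return filled


# One cluster with two complete rows and one row missing its second feature.
cluster = [[1.0, 2.0], [3.0, 6.0], [2.0, None]]
filled = impute_cluster(cluster, k_percent=100.0)
print(filled[2])
```

Restricting the donors to one cluster is what brings in the local similarity that the abstract emphasizes: a row is only ever filled from samples that the clustering step already judged to be close to it.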

Keywords/Search Tags:

Incremental Clustering, Incomplete Data, Data Imputation, Auto-Encoder

Related items

1	Research And Application Of Incomplete Data Imputation Algorithm Based On Subtractive Clustering
2	Research And Application Of Incomplete Data Imputation Algorithm
3	Studies On Missing Data Imputation
4	Research Of Fuzzy Clustering Algorithm For Incomplete Data Based On Improved BP Imputation
5	Research On Incomplete Data Stream Imputation Method For Internet Of Things
6	Research And Implementation Of Incomplete Data Processing Based On AP Clustering
7	Research And Implementation Of Imputation Method For Single-Cell Transcriptome Sequencing Data
8	Research On Incomplete Data Imputation In Sensor Networks
9	Attribute Correlation Modeling And Missing Value Imputation Of Incomplete Data Based On Fuzzy Partition
10	Several Studies To Improve Deep Data Imputation