An Automatic Data Clustering Method Based On The Evolutionary Computation

Posted on:2021-01-07

Degree:Master

Type:Thesis

Country:China

Candidate:J X Chen

GTID:2428330611466933

Subject:Computer Science and Technology

Abstract/Summary:

With the development of science and technology, enormous amounts of data are generated and accumulated in real life, which encourages people to mine the information and value behind them. Through clustering analysis, the data distribution and the underlying structure help people better understand and solve real-world problems. In many practical applications, it is crucial to perform automatic data clustering when the number of clusters is unknown. The evolutionary computation paradigm is well suited to this task, but existing algorithms have several deficiencies. In this paper, we propose a novel elastic differential evolution algorithm, E-DE, to solve the automatic data clustering problem. Unlike traditional methods, the proposed algorithm evolves each clustering layout as a whole. We adopt a variable-length encoding scheme, which encodes the cluster centroids that take effect during the clustering process. Because the encoding has no redundancy, it enhances search efficiency. To enable individuals of different lengths to exchange information properly, we develop a two-phase mutation operator and a subspace crossover. The mutation first determines the cluster number from the differential information among different individuals. Then, a Gaussian disturbance is applied to fine-tune the cluster centroids. In the crossover, a selected chromosome segmentation defines the subspace in which the target and mutant vectors exchange information to generate a new trial vector. The operators employ the basic method of differential evolution and, in addition, consider the spatial information of cluster layouts when generating offspring solutions. In particular, each dimension of the parameter vector interacts with its correlated dimensions, which not only adapts the cluster number but also avoids cross-dimension learning errors. The experimental results show that our algorithm outperforms state-of-the-art algorithms on most real and synthetic datasets. It is able to identify the correct number of clusters and obtain a good cluster validation value. Through sensitivity analysis, we validate the parameter settings, the design of the genetic operators, and the handling of empty clusters. The results demonstrate the effectiveness and robustness of the proposed algorithm.
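
The variable-length encoding and the two-phase mutation described in the abstract can be illustrated with a minimal Python sketch. This is not the E-DE implementation: the abstract does not give the operator formulas, so the length-update rule in `mutate_length`, the `resize` helper, the parameters `f`, `k_min`, `k_max` and `sigma`, and the empty-cluster handling in `drop_empty` are all assumptions chosen for illustration. An individual is simply a list of centroids, so its length equals its cluster number.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def drop_empty(points, centroids):
    """Assumed empty-cluster handling: keep only centroids that attract a point."""
    used = {nearest(p, centroids) for p in points}
    return [c for i, c in enumerate(centroids) if i in used]


def mutate_length(k_target, k_r1, k_r2, f=0.5, k_min=2, k_max=10):
    """Phase 1 (assumed rule): shift the cluster count by a scaled difference
    of the cluster counts of two other individuals, then clamp it."""
    k_new = round(k_target + f * (k_r1 - k_r2))
    return max(k_min, min(k_max, k_new))


def resize(centroids, k_new, points, rng):
    """Hypothetical helper: grow or shrink an individual to k_new centroids."""
    resized = list(centroids)
    while len(resized) > k_new:
        resized.pop(rng.randrange(len(resized)))  # drop a random centroid
    while len(resized) < k_new:
        resized.append(rng.choice(points))        # seed a new one at a data point
    return resized


def gaussian_perturb(centroids, sigma, rng):
    """Phase 2: fine-tune every centroid coordinate with Gaussian noise."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in c) for c in centroids]


# Usage on a toy data set with two obvious groups.
rng = random.Random(0)
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
target = [(0.0, 0.0), (5.0, 5.0), (100.0, 100.0)]
target = drop_empty(points, target)  # the far-away third centroid is removed
k_new = mutate_length(len(target), k_r1=4, k_r2=2)
mutant = gaussian_perturb(resize(target, k_new, points, rng), sigma=0.1, rng=rng)
```

Storing only centroids that own at least one point keeps the encoding free of unused genes, which is the property the abstract credits for the improved search efficiency. The subspace crossover is omitted here because the abstract does not describe how the chromosome segmentation is selected.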

Keywords/Search Tags:

clustering, evolutionary computation, variable length encoding scheme, subspace

Related items

1	A Research Of Genetic K-Means Algorithm Based On Variable Length Encoding
2	Information Core Optimization Based On Evolutionary Algorithm And Clustering In Recommender Systems
3	Research On Two-Dimension XML Encoding Method Based On Variable Length Binary Code
4	Research On High Dimensional Data Clustering Based On Improved Evolutionary Algorithm
5	Research Of Function Clustering And Evolutionary Computation Knowledge Acquisition
6	The Study Of Error Correct Output Codes Algorithm Based On Soft Codes And Variable Length Codes
7	Research And Implementation Of Lightweight Variable Length Block Encryption Scheme In Internet Of Things
8	Research Of Mind Evolutionary Computation Multi-modal Optimization Performance And Of Mind Evolutionary Computation Parameters Effecting Efficiency
9	Research And Implementation Of A High Security DSSS Scheme Based On Variable-length PN
10	XML Encoding Scheme Of Supporting Updating Data Efficiently