The Research Of Ant-Based Clustering Algorithm For Data Sets With Mixed Attribute

Posted on: 2007-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: W L Zhao
Full Text: PDF
GTID: 2178360185477527
Subject: Applied Mathematics
Abstract/Summary:
The purpose of data mining is to extract potentially valuable knowledge and useful information from large volumes of data, and cluster analysis is one of its research areas. Cluster analysis has important applications in many fields, such as business, biology, medicine, geography, and web archiving, and it is an active research topic. This thesis studies cluster analysis methods and makes the following contributions:

1. We apply ant-based clustering to data sets with mixed attributes and put forward an improved ant clustering algorithm (the ILF algorithm) based on the standard ant clustering algorithm (the LF algorithm). By introducing strategies such as an improved formula, an increasing radius, short-term memory, and space partitioning, both the efficiency and the clustering performance are improved to a certain degree. The ant-based clustering algorithm in this thesis draws on adaptive theory, which to some extent speeds up the evolutionary process; moreover, it is in essence a distributed, cooperative algorithm, so it is more efficient and well suited to data sets with mixed attributes. In addition, a new distance measure that combines numeric and categorical values is adopted, so that cluster analysis can be carried out on mixed data sets. Simulation experiments on data sets from the UCI repository indicate that the new ant-based algorithms run faster and are more robust. They are well suited to data sets with mixed numeric and categorical values, and the results are satisfactory.

2. We also improve the ant clustering algorithm based on information entropy (the EAC algorithm) and put forward the IEAC algorithm. It amends the picking and dropping rules of the LF algorithm by computing and comparing information entropy, which reduces the number of parameters. By introducing strategies such as an increasing radius, short-term memory, and forced dropping, the performance is improved. This algorithm is well suited to data sets with mixed attributes, especially those with categorical values.
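
The abstract does not give the thesis's distance measure, its improved formula, or its entropy-based rules, so the following Python is only a minimal sketch of the general ideas under stated assumptions. It assumes a Gower-style mixed distance (range-normalised difference for numeric attributes, simple matching for categorical ones), the commonly cited Lumer-Faieta form of the neighbourhood similarity and pick/drop probabilities, and Shannon entropy over one categorical attribute. All function names, parameters, and default constants (`alpha`, `s`, `k1`, `k2`) are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def mixed_distance(a, b, numeric_idx, categorical_idx, ranges):
    """Gower-style distance in [0, 1] between two records.

    Assumption: this stands in for the thesis's (unspecified) mixed measure.
    `ranges` maps each numeric attribute index to its max - min over the data.
    """
    total = 0.0
    for i in numeric_idx:
        # Range-normalised absolute difference; a constant attribute adds 0.
        total += abs(a[i] - b[i]) / ranges[i] if ranges[i] > 0 else 0.0
    for i in categorical_idx:
        # Simple matching: 0 if the categories agree, 1 otherwise.
        total += 0.0 if a[i] == b[i] else 1.0
    return total / (len(numeric_idx) + len(categorical_idx))


def neighbourhood_similarity(item, neighbours, dist, alpha, s):
    """LF-style similarity f of `item` to the items in its s x s neighbourhood."""
    total = sum(1.0 - dist(item, n) / alpha for n in neighbours)
    return max(0.0, total / (s * s))


def pick_probability(f, k1=0.1):
    """An unloaded ant picks up an item that fits its surroundings poorly."""
    return (k1 / (k1 + f)) ** 2


def drop_probability(f, k2=0.15):
    """A loaded ant drops its item where the surroundings are similar."""
    return 2.0 * f if f < k2 else 1.0


def attribute_entropy(values):
    """Shannon entropy (in bits) of one categorical attribute."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

As a usage illustration, with records such as `(1.0, "red")` and `(3.0, "red")`, `numeric_idx=[0]`, `categorical_idx=[1]`, and `ranges={0: 4.0}`, `mixed_distance` averages a numeric term of 0.5 and a categorical term of 0. An entropy-based variant would presumably compare `attribute_entropy` of a neighbourhood with and without the carried item instead of thresholding `f`, but that rule is not specified in the abstract.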
Keywords/Search Tags: data mining, clustering analysis, numeric data, categorical data, ant colony algorithm, information entropy