Cluster Boundary Point Detection Algorithm

Posted on: 2008-07-21   Degree: Master   Type: Thesis
Country: China   Candidate: F Yue   Full Text: PDF
GTID: 2208360215960845   Subject: Computer software and theory
Abstract/Summary:
Data mining, also known as knowledge discovery in databases (KDD), is the process of discovering interesting, useful, and previously unknown knowledge from very large databases, and it is one of the most active fields in database research. It aims to extract trustworthy, novel, useful, and understandable knowledge, rules, or abstract information from such databases, which gives stored data a significant new role in the information age. With the rapid development of data mining techniques, clustering analysis and outlier detection, two important parts of data mining, have been widely applied in fields such as pattern recognition, data analysis, image processing, and market research, and research on algorithms for both has become a highly active topic. Detecting the boundaries of clusters is sometimes more important than clustering or outlier analysis, yet it has received far less attention. This thesis therefore focuses on algorithms for detecting the boundary points of clusters.
In this thesis, the author introduces the theory of data mining and analyzes clustering and outlier detection algorithms in depth. We describe in detail a classical cluster boundary detection algorithm named BORDER and analyze its advantages and disadvantages through both experiments and theory. To address BORDER's low efficiency and low detection precision, we develop three different boundary detection methods: (1) Detecting Boundary Points of Clusters in Noisy Dataset (BOUND), (2) An Efficient Boundary Points Detecting Algorithm (BRIM), and (3) Gravity-based Boundary Points Detecting Algorithm (GREEN).
We also present a new outlier detection algorithm that makes use of the reverse k nearest neighbours of objects. We conducted a series of experiments on synthetic and real datasets to verify the correctness and efficiency of the algorithms. To assess the efficiency of the boundary detection algorithms, we ran experiments on synthetic datasets of different sizes. The experimental results show that the three newly proposed boundary detection algorithms (BOUND, BRIM, and GREEN) achieve higher precision and efficiency than BORDER, and that the proposed outlier detection algorithm is more efficient than LOF and LSC.
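The abstract does not spell out how reverse k nearest neighbours are used, so the following is only a minimal illustrative sketch of the underlying notion, not the thesis's algorithm. It assumes Euclidean distance and a brute-force search; the function names `knn_indices` and `reverse_knn_counts` are hypothetical. For each point it counts how many other points include it among their k nearest neighbours. Points with a low count tend to lie on cluster boundaries or far from any cluster.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_indices(points, i, k):
    """Return the indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return others[:k]


def reverse_knn_counts(points, k):
    """For each point, count how many other points list it among their k nearest neighbours."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in knn_indices(points, i, k):
            counts[j] += 1  # point i "votes" for each of its k nearest neighbours
    return counts


if __name__ == "__main__":
    # Four points forming a small cluster plus one isolated point (made-up data).
    cluster = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    far_point = (10.0, 10.0)
    counts = reverse_knn_counts(cluster + [far_point], k=2)
    print(counts)  # the isolated point (last entry) receives no votes
```

This brute-force version takes quadratic time in the number of points. It is meant to make the definition concrete, not to reflect the efficiency claims made for the proposed algorithms.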
Keywords/Search Tags:data mining, cluster algorithms, outliers detection, detecting border of clusters