Research And Application Of Clustering Algorithm Based On Feature Point Selection

Posted on: 2011-05-06 | Degree: Master | Type: Thesis
Country: China | Candidate: G H Zhu | Full Text: PDF
GTID: 2178360305951568 | Subject: Computer software and theory
Abstract/Summary:
With the explosive growth of global information, data mining has become a focus of computer science and technology research in the new century. Cluster analysis is one of the most important functions in data mining. Clustering is the process of grouping a set of physical or abstract objects into classes or clusters, so that similar objects fall into the same cluster while dissimilar objects fall into different clusters. Clustering is always carried out without prior knowledge, so the main task is to obtain a good clustering result under that constraint. Many clustering algorithms have been proposed to date, but each is suited only to particular problems and users. Furthermore, they are imperfect both theoretically and methodologically, and some have severe faults. The K-means algorithm has extremely important application value in data mining, but as applications develop and new requirements arise, its limitations become increasingly prominent. First, the number of clusters is usually based on the user's assumption, and users often cannot set it exactly. Once the number of clusters has been set, it cannot be changed during the clustering process, so the final number of clusters equals the initial number. Second, the choice of initial cluster centers affects the quality of the result, so the user generally will not obtain an accurate clustering. These two shortcomings seriously limit the range of problems to which K-means can be applied.
This dissertation studies and analyzes the techniques and methods of cluster analysis systematically and in depth and, in view of these faults of K-means, puts forward an improved Clustering algorithm based on Feature Point Selection (CFPS).
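The two K-means limitations described above (a fixed number of clusters and sensitivity to the initial centers) can be illustrated with a minimal sketch. This is plain textbook K-means (Lloyd's algorithm), not the CFPS algorithm; the four data points, the function names, and the two sets of initial centers are hypothetical values chosen only for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def kmeans(points, centers, iterations=100):
    """Plain K-means with caller-supplied initial centers.

    The number of clusters is len(centers) and never changes.
    """
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, clusters


def sse(centers, clusters):
    """Sum of squared distances from each point to its cluster center."""
    return sum(sq_dist(p, c) for c, pts in zip(centers, clusters) for p in pts)


# Hypothetical toy data: the corners of a wide rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Initialization A: one center on each short side.
centers_a, clusters_a = kmeans(points, [(0, 0), (4, 0)])
sse_a = sse(centers_a, clusters_a)

# Initialization B: both centers on the same short side.
centers_b, clusters_b = kmeans(points, [(0, 0), (0, 1)])
sse_b = sse(centers_b, clusters_b)

print(sse_a, sse_b)
```

With initialization A the points are split into the left and right pairs; with initialization B the run converges to the top and bottom pairs, a local optimum with a much larger squared error. In both runs the number of clusters stays at the two supplied by the caller.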
The CFPS algorithm also belongs to the database segmentation category. It uses a fitness function during clustering: it clusters the data, and adjusts the number of clusters k, according to the distance between clusters and the fitness values of the points. The algorithm does not need to select initial cluster centers, because at the beginning each object forms its own cluster. As a result, the clustering result is stable and the algorithm does not fall into a locally optimal clustering. Experimental results show that, compared with other clustering algorithms, CFPS improves clustering accuracy and efficiency. Users can therefore apply the proposed algorithm without configuring complex parameters and obtain results that are as good as or better than those of other clustering algorithms.
The application of cluster analysis and related techniques to intrusion detection is currently a hot topic. This dissertation applies the CFPS clustering algorithm to intrusion detection systems, using the KDD CUP 1999 data set as experimental data to test both the K-means algorithm and the CFPS algorithm. Algorithm analysis and experimental results show that CFPS has better detection performance, with a higher detection rate and a lower false alarm rate. The method overcomes two drawbacks of traditional K-means: the need to determine the value of k manually, and the influence of the choice of initial cluster centers.
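The abstract does not define the detection rate and false alarm rate it reports. A minimal sketch, assuming the definitions commonly used in intrusion detection work (detection rate = detected attacks / all attacks, false alarm rate = normal records flagged as attacks / all normal records), could look like the following. The function name and the labels are hypothetical and are not taken from the KDD CUP 1999 data or from the dissertation's experiments.

```python
def detection_metrics(true_labels, predicted_labels, normal="normal"):
    """Return (detection_rate, false_alarm_rate) for two parallel label lists.

    Any label other than `normal` is treated as an attack. These are the
    commonly used definitions, assumed here; the abstract does not state them.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        is_attack = truth != normal
        flagged = pred != normal
        if is_attack and flagged:
            tp += 1  # attack correctly detected
        elif is_attack:
            fn += 1  # attack missed
        elif flagged:
            fp += 1  # normal record raised a false alarm
        else:
            tn += 1  # normal record correctly passed
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical labels for eight records (four normal, four attacks).
truth = ["normal", "normal", "normal", "normal",
         "attack", "attack", "attack", "attack"]
predicted = ["normal", "attack", "normal", "normal",
             "attack", "attack", "normal", "attack"]

dr, far = detection_metrics(truth, predicted)
print(dr, far)
```

In this toy case three of the four attacks are flagged and one of the four normal records raises a false alarm. A clustering-based detector would first have to map each cluster to a "normal" or "attack" label before metrics like these can be computed; how CFPS does that is not described in the abstract.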
Keywords/Search Tags: Data Mining, Clustering Analysis, K-means, Intrusion Detection