Research And Application Of Improved Fuzzy C-Means Clustering Algorithm Based On Weka Platform

Posted on:2014-11-30

Degree:Master

Type:Thesis

Country:China

Candidate:W J Zheng

Full Text:PDF

GTID:2268330401977118

Subject:Computer technology

Abstract/Summary:


Data mining is a way of extracting useful information and knowledge from large volumes of data, and clustering is among its most widely used and studied families of algorithms. The fuzzy C-means clustering algorithm applies fuzzy theory: rather than forcing each instance into a single category, it assigns the instance a degree of membership in each category, which makes the analysis of clustered data more objective.

The paper analyzes the fuzzy C-means clustering algorithm. The algorithm is simple and performs well, but it is sensitive to its initial values. A poor choice of initial values can trap it in a local minimum instead of the global optimum, which increases the number of iterations and can ultimately cause the clustering to fail. To address this weakness, the paper proposes a fuzzy C-means clustering algorithm based on the density of instances, so that the cluster centers lie closer to the actual class centers, the number of iterations falls, and the clustering result improves. Experiments on simulated data sets and UCI data sets show that the improved algorithm is effective.

Weka, a Java-based open source data mining tool with rich functionality that is easy to operate, has attracted the attention of data mining researchers. Weka is weak in clustering, however, so the paper studies Weka's development environment structure, its interface specification, and the specific methods and implementation steps for adding new algorithms. It then implements three algorithms: a hierarchical clustering algorithm (SimpleChameleon), the fuzzy C-means clustering algorithm, and the improved fuzzy C-means clustering algorithm.

To further verify its effectiveness, the improved algorithm is applied to social insurance audit data. Analysis shows that this data is large in volume, covers many payment types, and contains redundant records. The paper therefore preprocesses the social insurance audit data through data consolidation and attribute selection. Then, for the four clustering objectives of each area, it runs comparative experiments with the traditional fuzzy C-means clustering algorithm and the improved algorithm. The experimental results show that the improved algorithm reduces the number of iterations while improving the clustering result, which again verifies its effectiveness.
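
The abstract does not give the exact density rule, so the following Python sketch shows only one plausible reading of the idea: choose initial centers from the densest instances, then run the standard fuzzy C-means updates. The function names, the `radius` parameter, and the toy data are all assumptions made for illustration and are not taken from the thesis or from Weka.

```python
import math


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def density_init(points, c, radius):
    """Assumed heuristic: take the densest point, then keep taking the next
    densest point that lies farther than `radius` from every chosen center."""
    density = [sum(1 for q in points if dist(p, q) <= radius) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    chosen = []
    for i in order:
        if all(dist(points[i], points[j]) > radius for j in chosen):
            chosen.append(i)
        if len(chosen) == c:
            break
    # Fall back to the densest unused points if too few were separated.
    for i in order:
        if len(chosen) == c:
            break
        if i not in chosen:
            chosen.append(i)
    return [list(points[i]) for i in chosen]


def memberships(points, centers, m):
    """Standard FCM membership update with fuzzifier m > 1."""
    exp = 2.0 / (m - 1.0)
    u = []
    for p in points:
        d = [dist(p, ctr) for ctr in centers]
        if min(d) == 0.0:
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if x == 0.0 else 0.0 for x in d]
            u.append([h / sum(hits) for h in hits])
        else:
            u.append([1.0 / sum((di / dk) ** exp for dk in d) for di in d])
    return u


def update_centers(points, u, c, m):
    """Standard FCM center update: mean of the points weighted by u ** m."""
    dim = len(points[0])
    centers = []
    for i in range(c):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append(
            [sum(wj * p[k] for wj, p in zip(w, points)) / total for k in range(dim)]
        )
    return centers


def fcm(points, centers, m=2.0, tol=1e-5, max_iter=300):
    """Iterate until no center moves more than tol; return centers, U, iterations."""
    it = 0
    for it in range(1, max_iter + 1):
        u = memberships(points, centers, m)
        new_centers = update_centers(points, u, len(centers), m)
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m), it


# Toy data: two tight groups, each with one straggler.
points = [
    (0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2), (-0.2, -0.1), (0.8, 0.7),
    (5.0, 5.0), (5.2, 5.1), (4.9, 5.2), (5.1, 4.8), (4.8, 4.9), (4.2, 4.3),
]
init = density_init(points, c=2, radius=0.5)
centers, u, iters = fcm(points, init)
print("initial centers:", init)
print("final centers:", centers)
print("iterations:", iters)
```

To compare against the traditional algorithm, one could pass randomly chosen initial centers to `fcm` and compare the returned iteration counts. The sketch itself makes no claim about how large that difference is.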

Keywords/Search Tags:

Weka platform, fuzzy C-means clustering algorithm, density of instances, social insurance audit


Related items

1	Research On Fuzzy Clustering Algorithm Under The Weka Platform
2	Research And Application Of Remote Sensing Image Clustering Based On The Improved Fuzzy C-means Algorithm
3	Research On Intelligent Medical Insurance Audit System Based On BP Neural Network And Association Rules
4	The Application Of Improved Fuzzy C Means Clustering Algorithm In Image Segmentation
5	Research On Fuzzy Clustering Analysis Algorithm Based On Density
6	Fuzzy C-means And K-means Clustering Algorithm And Its Parallel
7	High Dimensional Fuzzy C-Means Clustering Recommendation Algorithm Based On Density Canopy
8	The Application Of Fuzzy C-means Clustering In The Stock Investment
9	Research On Blocking Fuzzy Clustering Algorithm Based On Density Of Samples
10	Density Peak Clustering Algorithm Based On Adaptive Cluster Center