Research Of Data Competition Algorithm Based On Aggregation Field Model

Posted on:2014-03-16

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Q Zhang

Full Text:PDF

GTID:1268330425967049

Subject:Signal and Information Processing

Abstract/Summary:

Cluster analysis, which stems from taxonomy, is a method of statistical analysis for exploring the internal structure of unknown data. Its importance and its interdisciplinary connections with other research directions have been consistently confirmed by many researchers. Clustering organizes a dataset into meaningful or useful groups (clusters) and is an important research topic in areas such as data mining and pattern recognition. It has recently been applied successfully to image segmentation, text clustering, computer vision, speech recognition, character recognition, data compression and information retrieval. Clustering can also be applied to multi-relational data mining, spatio-temporal database applications, sequence and heterogeneous data analysis, bioinformatics and marketing. The stability of existing partitional clustering algorithms is seriously limited by their sensitivity to noise and outliers. Moreover, the increasingly complicated internal structure of datasets demands higher clustering quality from clustering algorithms. This thesis studies the key open problems of partitional clustering. The main work is as follows:

(1) Cluster analysis is a process of exploring the internal structure of a dataset within a reasonable model framework. However, the existing models cannot describe partitional clustering problems well. First, a novel Aggregation Field Model is proposed and the features of different data objects are defined. Based on these, several strategies for denoising and for dealing with outliers are designed. Next, an improved K-Means algorithm based on aggregation energy, AEKMA, is designed. It can provide better initial centers for the K-Means algorithm. The experimental results show that AEKMA initializes the K-Means algorithm well and that its performance is better than that of the K-Means algorithm.

(2) Based on further study of the principle of the Aggregation Field Model, a novel partitional clustering algorithm based on data competition, DCA, is designed. DCA regards all data objects as potential representative points and finds suitable representative points, which are then used to complete the clustering process. The experimental results show that DCA performs well, suppresses the interference caused by outliers, is stable, clearly outperforms some other partitional clustering algorithms, and is an effective way to solve the clustering problem.

(3) Further study of the features of DCA shows that it cannot obtain ideal results when applied directly to document clustering. The reason lies in the complicated, high-dimensional and sparse structure of text datasets and in the curse of dimensionality. Optimizing and improving the internal structure of the document dataset is therefore a novel way of approaching text clustering. Fortunately, a spectral clustering ensemble algorithm can provide simple input for DCA, because the essence of the algorithm is to map high-dimensional data to a low-dimensional space, yielding a simple low-dimensional embedding of the original data. A data competition based text clustering ensemble spectral algorithm, DCCESA, is then designed. The experimental results show that DCCESA obtains better clustering results than the commonly used clustering ensemble algorithms, and that DCCESA is an effective method for document clustering ensembles owing to its high clustering quality and efficiency.

(4) This thesis further studies the possibility of applying DCA to image segmentation. As the time complexity of DCA is O(n^2), it is unsuitable for processing large images. The experiments verify that DCA achieves better segmentation on small images, although it cannot partition large images under given hardware and software constraints. In order to apply DCA to large images, an image segmentation algorithm that combines the Mean Shift algorithm with DCCESA, MS-DCCESA, is designed. MS-DCCESA pre-segments the large image using the Mean Shift algorithm and applies the idea of spectral clustering ensemble to the pre-segmented regions, producing good input for DCA. The experimental results show that MS-DCCESA obtains better segmentation quality than some other commonly used algorithms and that MS-DCCESA is effective.
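
The quadratic cost mentioned in item (4) can be made concrete with a little arithmetic. The sketch below is only an illustration and is not code from the thesis: the helper `pairwise_count`, the image sizes and the figure of 500 pre-segmented regions are all assumptions chosen for the example. It counts unordered pairs of objects, a simple stand-in for work that grows as O(n^2), when the objects are individual pixels versus pre-segmented regions.

```python
def pairwise_count(n: int) -> int:
    """Number of unordered pairs among n objects: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical sizes chosen only for illustration (not taken from the thesis).
small_image_pixels = 64 * 64        # a small image, clustered pixel by pixel
large_image_pixels = 1024 * 1024    # a large image, clustered pixel by pixel
presegmented_regions = 500          # assumed region count after pre-segmentation

cases = [
    ("small image (pixels)", small_image_pixels),
    ("large image (pixels)", large_image_pixels),
    ("large image (regions)", presegmented_regions),
]

for label, n in cases:
    # Pair counts grow quadratically with the number of objects.
    print(f"{label:<22} n = {n:>9,}   pairs = {pairwise_count(n):>18,}")
```

Under these assumed numbers, clustering regions instead of pixels reduces the pair count by several orders of magnitude, which is the motivation the abstract gives for pre-segmenting large images before running DCA.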

Keywords/Search Tags:

aggregation field model, K-Means algorithm, data competition, document clustering, image segmentation

Related items

1	The Research And Application Of Improved Data Competition Clustering Algorithm
2	Research On Robust Image Segmentation Algorithm Based On Neutrosophic Clustering
3	Investigation Of Clustering Algorithm Based On Spatial Domain On Image Segmentation
4	Research On Robust Fuzzy Clustering Segmentation Algorithm With High Performance
5	Research On Technology Of Image Segmentation And Its Application
6	Research Of Image Segmentation Algorithm Based On Clustering Analysis
7	Research Of Algorithm For Image Segmentation Based On The C-means Clustering
8	Research And Comparison Of Several Kinds Of Clustering Algorithm For Image Segmentation
9	Research On Segmentation Algorithm Based On Neutrosophic C-means Clustering
10	Research On Brain MR Image Segmentation Based On FCM Algorithm