Study On Incomplete Data Clustering Method Based On Correlation Of Sample Neighbors

Posted on:2020-08-12

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Cao

Full Text:PDF

GTID:2428330590997014

Subject:Detection Technology and Automation

Abstract/Summary:

Fuzzy C-means (FCM) clustering is widely used in pattern recognition and image processing. In practice, because of data omissions and restrictions on data acquisition, the data sets obtained usually contain a large amount of incomplete data. Traditional clustering methods cannot be applied directly to data sets with incomplete data, and the way missing attributes are treated directly affects the clustering results. Therefore, from the perspective of the correlation between sample neighbors, this thesis proposes two clustering methods for incomplete data. The main research contents are as follows.

First, to address the tendency of the basic fuzzy C-means algorithm to divide a data set into equally sized clusters, a spatial distance based on the class information of neighboring samples is proposed. It builds on the mutual influence value between samples and the class proportions of each sample's neighbors. The class information of the neighbor samples around each sample point is introduced into the original Euclidean distance proportionally, so that the sample distribution information lets the distance measure adapt to the data set. Based on the distance between sample points, a clustering effect value is constructed and introduced into the clustering objective function. On this basis, an incomplete data fuzzy C-means clustering method based on sample space distance is proposed. The experimental results show that, by taking the spatial distribution characteristics of the samples into account in the distance calculation, the proposed algorithm obtains more accurate clustering results on incomplete data.

Second, based on the mutual influence value between samples, an incomplete data clustering method based on sample neighbor membership weighting is proposed. The membership degree of each sample is corrected by the weighted average of the membership degrees of its neighbor samples. To make full use of the distribution information of the samples, the weighting coefficient is the Gaussian kernel function in the similarity function, so that the sample distribution in the neighborhood of each sample point affects the similarity between sample points and improves the clustering of incomplete data sets.
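The neighbor membership weighting described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract gives no formulas, so the function names, the number of neighbors `k`, the kernel width `sigma`, the blending factor `alpha`, and the use of a partial distance over jointly observed attributes (missing values stored as NaN) are all assumptions made here for illustration.

```python
import numpy as np

def partial_sq_dist(X):
    """Pairwise squared distances using only attributes observed in both samples.

    Assumption: missing attributes are NaN; the sum is rescaled by d / (shared count).
    Pairs with no shared observed attribute get distance +inf.
    """
    obs = ~np.isnan(X)                               # (n, d) mask of observed values
    Xz = np.where(obs, X, 0.0)                       # NaN replaced by 0 (masked out below)
    both = obs[:, None, :] & obs[None, :, :]         # (n, n, d) jointly observed
    diff = Xz[:, None, :] - Xz[None, :, :]
    sq = np.where(both, diff ** 2, 0.0).sum(axis=2)
    cnt = both.sum(axis=2)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(cnt > 0, sq * X.shape[1] / cnt, np.inf)

def neighbor_weighted_membership(X, U, k=3, sigma=1.0, alpha=0.5):
    """Correct each row of U with a Gaussian-weighted average of its k neighbors' rows.

    X: (n, d) data with NaN for missing attributes.
    U: (n, c) fuzzy memberships, rows summing to 1.
    k, sigma, alpha: hypothetical parameters, not taken from the thesis.
    """
    D2 = partial_sq_dist(X)
    np.fill_diagonal(D2, np.inf)                     # a sample is not its own neighbor
    nbrs = np.argsort(D2, axis=1)[:, :k]             # (n, k) indices of nearest neighbors
    d2 = np.take_along_axis(D2, nbrs, axis=1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))             # Gaussian kernel weights; exp(-inf) = 0
    W_sum = W.sum(axis=1, keepdims=True)
    safe = np.where(W_sum > 0, W_sum, 1.0)
    nbr_avg = (W[:, :, None] * U[nbrs]).sum(axis=1) / safe
    nbr_avg = np.where(W_sum > 0, nbr_avg, U)        # no usable neighbor: keep own membership
    U_new = (1.0 - alpha) * U + alpha * nbr_avg      # blend own and neighbor memberships
    return U_new / U_new.sum(axis=1, keepdims=True)  # rows sum to 1 again
```

In a full FCM loop, a correction of this kind would presumably be applied after each membership update and before the cluster centers are recomputed; where exactly the thesis applies it is not stated in the abstract.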

Keywords/Search Tags:

Neighbors Correlation, Incomplete Data, Fuzzy C-means, Clustering

Related items

1	Research Of Fuzzy Clustering Algorithm For Incomplete Data Based On Improved BP Imputation
2	Research Of Fuzzy Clustering Algorithm For Optimizing Incomplete Data Based On Extreme Learning Machine
3	Research Of Fuzzy Clustering Algorithm For Incomplete Data Based On Interval Analysis
4	Research Of Fuzzy Clustering Algorithm For Incomplete Data Based On Improved VAEGAN
5	Research Of Fuzzy Clustering Algorithm For Incomplete Data Based On Information Feedback Rbf Network Valuation
6	Incomplete Data Fuzzy Clustering Methods Based On Consistency
7	Research On Incomplete Data FCM Clustering And Outlier Detection
8	Research Of Hybrid Clustering Algorithm For Incomplete Data Based On Local Weighting
9	Research On Clustering Algorithms For Incomplete Data
10	Clustering Incomplete Data Using Pseudo Nearest Neighbor And Interval-valued Distance