Study On Fuzzy Clustering For Incomplete Data Based On Probability Model Of Missing Attribute Values

Posted on:2018-09-09

Degree:Master

Type:Thesis

Country:China

Candidate:G X Liu

Full Text:PDF

GTID:2348330536961576

Subject:Control engineering

Abstract/Summary:

Fuzzy clustering is widely used in image processing, pattern recognition, and other fields. Traditional clustering methods apply only to complete data sets and cannot be applied directly to incomplete ones. In practice, however, data are often incomplete for various reasons, and the way missing attributes are handled has a significant effect on the clustering result. Clustering incomplete data sets is therefore a problem of practical importance.

A nearest-neighbor interval can account for the uncertainty of a missing attribute value, but it does not fully exploit the attribute values of the neighboring samples and cannot reflect how those values are distributed. Building on the nearest-neighbor interval description of a missing attribute value, this thesis uses the distribution of the neighbors' attribute values within the interval to establish a simple and effective probability model (PM) for the missing attribute value. Clustering is then carried out with a genetic algorithm and with a gradient descent method. The genetic algorithm uses the probability values to generate the initial population and to perform mutation. The gradient descent method uses the probabilities of the missing attribute value to determine the search step. Both perform a probability-guided search for the missing attribute estimates within the corresponding nearest-neighbor intervals so as to minimize the clustering objective function. The data set is then restored from the optimized estimates and clustered with FCM, which solves the fuzzy clustering problem for incomplete data.

The probability model of missing attribute values not only introduces nearest-neighbor information into the description of a missing attribute, but also fully exploits the distribution of the corresponding attribute values within the nearest-neighbor range, so it can effectively recover the missing attribute values. The genetic algorithm has good global search ability and better stability, while the gradient descent method searches quickly and can rapidly find a good solution; both give good clustering results. Simulation experiments on multiple UCI data sets show that the probability model is an effective way to describe the missing attribute values of incomplete data and that the clustering results are stable.
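The idea of describing a missing value by the distribution of its neighbors' values can be illustrated with a minimal sketch. This is not the thesis's implementation: the partial-distance neighbor metric, the histogram form of the probability model, the parameters `q` and `bins`, and all function names are assumptions made for illustration only.

```python
import numpy as np


def partial_distance(a, b):
    """Squared distance over attributes observed in both samples, rescaled
    to the full dimension (an assumed neighbor metric)."""
    shared = ~np.isnan(a) & ~np.isnan(b)
    if not shared.any():
        return np.inf
    return a.size / shared.sum() * np.sum((a[shared] - b[shared]) ** 2)


def neighbor_probability_model(X, i, j, q=3, bins=2):
    """Histogram of attribute j over the q nearest neighbors of sample i.

    Returns (edges, probs): edges span the nearest-neighbor interval and
    probs[b] is the share of neighbor values that fall in bin b.
    """
    # Only samples that actually observe attribute j can serve as neighbors.
    candidates = [k for k in range(len(X)) if k != i and not np.isnan(X[k, j])]
    candidates.sort(key=lambda k: partial_distance(X[i], X[k]))
    values = X[candidates[:q], j]
    counts, edges = np.histogram(values, bins=bins)
    return edges, counts / counts.sum()


def sample_estimate(edges, probs, rng):
    """Draw one candidate value: pick a bin by probability, then a point in it."""
    b = rng.choice(len(probs), p=probs)
    return rng.uniform(edges[b], edges[b + 1])


# Toy data (hypothetical): sample 0 is missing its third attribute.
X = np.array([
    [1.0, 2.0, np.nan],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.1],
    [1.0, 2.2, 3.4],
    [8.0, 9.0, 10.0],
])
edges, probs = neighbor_probability_model(X, i=0, j=2)
candidate = sample_estimate(edges, probs, np.random.default_rng(0))
```

In this toy data the three nearest neighbors supply the values 3.0, 3.1 and 3.4, so the interval is [3.0, 3.4] and the lower half carries two thirds of the probability. A sampler like `sample_estimate` is one plausible way to seed a genetic algorithm's initial population or bias its mutation toward likely values; how the thesis actually does this is not specified in the abstract.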

Keywords/Search Tags:

Probability Model, Fuzzy c-means, Genetic Algorithm, Gradient Descent

Related items

1	The Research And Application Of Stochastic Gradient Descent And Dual Coordinate Descent Algorithm
2	Research On Adaptive Peak Matching Algorithm Of Vibrational Spectrum Based On Genetic Algorithm And Conjugate Gradient Descent
3	The Fuzzy Neural Networks Based On SAM Systems
4	The Research Of Distance Metric And Model Selection In K-Means Clustering And L2-SVM Classification
5	A Research Of Stochastic Gradient Descent Algorithm
6	A Study On Gradient Descent Regularized Orthogonal Matching Pursuit
7	A Domestic Soybean Price Forecasting Model Based On Improved Quantile-RBF Neural Network
8	Modeling Of Nonlinear System Based On T-S Fuzzy Model
9	Dynamic Regret Of Online Gradient Descent:Analyses And Applications
10	Imbalanced Stochastic Gradient Descent Online Algorithm For SVM