| In practical data clustering, many factors, such as measurement error or misunderstanding of the data, can cause attribute values to be missing. Proper handling of incomplete data is essential for subsequent cluster analysis of the whole data set. Research on clustering incomplete data is therefore of great significance and practical value, and has attracted wide attention from scholars at home and abroad. First, because the fuzzy c-means clustering algorithm cannot be applied directly to incomplete data, this paper proposes a fuzzy clustering algorithm for incomplete data based on information feedback RBF network estimation (IFRBF-FCM). After analyzing the RBF neural network, and so that the network can draw on more information and thus estimate missing attributes more accurately, this paper borrows the idea of the Kalman filter: the difference between the value predicted by the RBF neural network and the theoretical expected value of the data is fed back to the input layer, which yields the information feedback RBF network (IFRBF) model. In addition, this paper uses the nearest-neighbor rule to select a training sample set for each incomplete data sample, and then uses that nearest-neighbor training set to train the corresponding IFRBF network for each missing attribute, so as to estimate the missing attribute. In this way, a complete data set is recovered from the IFRBF network estimates, and fuzzy cluster analysis is performed on the recovered data set. Second, the estimates of missing attributes produced by the IFRBF network are single numerical values, which contain some error and are not well suited to describing the uncertainty of the missing attributes. This paper therefore proposes a fuzzy clustering algorithm for incomplete data based on IFRBF interval estimation (IFRBF-IFCM). When the IFRBF network is used to estimate a missing attribute, it can
also obtain the estimation error on the complete attributes in the data. The mean absolute value of this error is used to determine the left and right endpoints of the estimation interval of each missing attribute, so that the numerical estimate of the missing attribute is converted into an interval. The complete attributes of the data set must likewise be converted into interval form. The interval fuzzy c-means clustering algorithm is then applied to obtain the clustering results. Finally, simulation experiments are carried out on three data sets from the UCI Machine Learning Repository (Iris, Bupa, and Breast) and on two artificial data sets. The experimental results show that estimating the incomplete data sets with the IFRBF network improves clustering accuracy relative to the comparison method, and that clustering based on interval estimation is more accurate and more robust than clustering based on numerical estimation. |
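
The interval conversion step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and it rests on assumptions the abstract does not confirm: the interval is taken to be symmetric around the numerical estimate, with a half-width equal to the mean absolute estimation error of that attribute, and each complete attribute value is represented as a degenerate interval whose two endpoints coincide. The function name `to_intervals` and its parameters are hypothetical.

```python
def to_intervals(rows, missing, mean_abs_err):
    """Convert a recovered data set into interval form.

    rows         -- list of samples; missing entries already hold estimates
    missing      -- same shape as rows; True where the value was estimated
    mean_abs_err -- per-attribute mean absolute estimation error

    Assumption (not stated in the abstract): estimated values become
    [x - e_j, x + e_j], and complete values become [x, x].
    """
    intervals = []
    for row, row_missing in zip(rows, missing):
        interval_row = []
        for j, (value, is_missing) in enumerate(zip(row, row_missing)):
            # Only estimated values receive a non-zero half-width.
            half_width = mean_abs_err[j] if is_missing else 0.0
            interval_row.append((value - half_width, value + half_width))
        intervals.append(interval_row)
    return intervals


# Hypothetical two-attribute example: the second attribute of the
# first sample was missing and has been estimated as 3.5.
recovered = [[5.1, 3.5], [4.9, 3.0]]
was_missing = [[False, True], [False, False]]
interval_data = to_intervals(recovered, was_missing, [0.2, 0.5])
```

Under these assumptions, the resulting list of (left, right) endpoint pairs is the kind of input an interval fuzzy c-means routine would cluster.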