Research On Fusion Clustering Method Based On Data Feature Selection

Posted on: 2022-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: M M Li
Full Text: PDF
GTID: 2518306605997979
Subject: Control Engineering
Abstract/Summary:
With the rapid development of information technology, extracting effective features from massive data has become essential to accurate clustering. Cluster analysis is an effective way to make use of massive data, but the performance of a clustering method depends mainly on how accurately it mines data features, how finely it measures them, and how well it optimizes their selection. The k-modes method, one of the most representative clustering methods, is an extension and improvement of k-means clustering. However, because it does not mine the features contained in the data accurately, does not measure them at a fine enough granularity, and leaves feature selection unoptimized, it often performs poorly in practical applications. This thesis addresses these problems in k-modes clustering. It focuses on three key techniques (accurate feature mining, fine-grained feature measurement, and optimized feature selection) and studies fusion clustering methods based on data feature selection. The main innovations and research contents are as follows:

(1) A fusion clustering method based on the maximal information coefficient (MIC) is proposed. Firstly, the MIC value of each attribute is obtained by introducing the MIC measure; secondly, the MIC value is fused with the original dissimilarity measure to obtain a new measure; thirdly, a more refined k-modes clustering method is built on the new measure; and finally, the new method is tested in simulation on standard UCI data sets, where it effectively improves clustering accuracy.

(2) A fusion clustering method based on an optimized ReliefF feature selection mechanism is proposed. Firstly, an optimized Renyi entropy mutual information measure is added to ReliefF feature selection to establish a feature screening method that eliminates redundant information in the feature subset; secondly, the resulting optimized feature subset is used as input and combined with the improved k-modes method to establish the optimized MIC-kmodes fusion clustering method; finally, the accuracy and efficiency of the new method are verified by experimental simulation.

(3) A fusion clustering method based on optimized autoencoder feature processing is proposed. Firstly, an L2 norm constraint is introduced into the autoencoder to prevent over-fitting; secondly, a filter method for adaptively solving the autoencoder parameters is established, and training is extended to a federated learning framework; thirdly, the encoded vectors are fed into the clustering model from (1) to establish a MIC-kmodes clustering method optimized in the encoding space; and finally, the effectiveness of the method is verified by simulation.
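
To make the idea in (1) more concrete, the following is a minimal sketch of k-modes with a per-attribute weighted mismatch measure. It is an illustration under stated assumptions, not the thesis's method: the abstract does not give the formula that fuses MIC values with the original measure, so the `weights` list here is a hypothetical stand-in for MIC-derived attribute weights, and the function names, toy data, and initial modes are all invented for the example.

```python
from collections import Counter


def weighted_mismatch(a, b, weights):
    """Weighted simple-matching dissimilarity between two categorical records.

    Each attribute that differs contributes its weight; equal attributes
    contribute nothing. The weights are assumed inputs (e.g. MIC-derived).
    """
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def column_modes(rows):
    """Most frequent category in each attribute (column) of the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(data, init_modes, weights, max_iter=20):
    """Standard k-modes loop: assign to nearest mode, then recompute modes."""
    modes = [tuple(m) for m in init_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under the weighted dissimilarity.
        new_labels = [
            min(range(len(modes)),
                key=lambda j: weighted_mismatch(row, modes[j], weights))
            for row in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Update step: per-attribute mode of each non-empty cluster.
        for j in range(len(modes)):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes


# Toy categorical data (hypothetical): colour, size, tag.
data = [
    ("red", "small", "x"),
    ("red", "small", "y"),
    ("red", "large", "x"),
    ("blue", "large", "y"),
    ("blue", "large", "x"),
    ("blue", "small", "y"),
]
# Hypothetical attribute weights standing in for MIC values.
weights = [1.0, 0.5, 0.1]

labels, modes = k_modes(data, init_modes=[data[0], data[3]], weights=weights)
print(labels)
print(modes)
```

With uniform weights this reduces to the ordinary k-modes simple-matching distance; unequal weights let more informative attributes dominate the cluster assignment, which is the general effect a refined measure aims for.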
Keywords/Search Tags: Cluster analysis, Feature selection, k-modes, ReliefF, Autoencoder