
Clustering Research Based On Cloud Model And Data Field

Posted on: 2018-07-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y Z Feng
GTID: 2348330518453378  Subject: Software engineering
Abstract/Summary:
In recent years, as "Internet Plus" has spread through industry, big-data thinking has gradually changed how people think, and data-driven decision making has become the norm for leaders. Current data, however, has four characteristics: large volume, many types, rapid change, and low value density. Extracting valuable information therefore requires big data analysis, and clustering is one of its core tasks. Clustering partitions data into several classes according to the structure of the data itself, so that similarity within each class is high and similarity between different classes is low. Classical clustering algorithms such as k-means, DBSCAN, and DPC each have shortcomings: they are sensitive to noise or require the user to supply input parameters. To overcome these disadvantages while retaining the advantages of each algorithm, this thesis builds on the traditional clustering algorithms and proposes a new adaptive clustering algorithm that introduces the cloud model and the data field. The algorithm handles data sets of arbitrary shape, is resistant to noise, can process high-dimensional data, and requires no user-supplied parameters at any point in the clustering procedure. The main research contents are as follows:

1. The potential function of the data field is improved: the potential value beyond the 3-sigma range is defined as zero. The improved potential function can accurately identify noise points and reduces the time complexity of the algorithm.
2. Two methods for selecting the data field parameter are given, and a method is proposed that uses the golden section method to find the minimum entropy and thereby obtain the best parameter automatically.
3. Based on the characteristics of cluster centers, a method for automatically detecting the best cluster centers is proposed.
4. Based on the improved data field, an adaptive clustering algorithm is presented. The entire clustering process runs without human intervention, which removes the classical algorithms' dependence on user-supplied parameters. The algorithm computes the potential value of each object to find the cluster centers and the noise, and then assigns each remaining object, according to its potential value, to the cluster of its nearest neighbor, which completes the clustering process.
5. Simulation experiments on classical data sets verify that the algorithm can cluster data sets of arbitrary shape distribution. Compared with k-means, DBSCAN, and DPC, the algorithm achieves better clustering results.
6. For high-dimensional data, a new feature extraction method based on the cloud model is proposed. The method uses the reverse cloud generator to compute three numerical characteristics of the original data and takes them as the main features. Experiments on three high-dimensional data sets show that the method is effective.
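The truncated potential function and the golden-section parameter search mentioned in items 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the exact formulas, so the Gaussian-style potential, the unit mass per point, the Shannon form of the potential entropy, and the names `potentials`, `potential_entropy`, and `golden_section_min` are all assumed here.

```python
import numpy as np

def potentials(points, sigma):
    """Potential of every point, with contributions beyond 3*sigma set to zero.

    Assumes a Gaussian-style kernel exp(-(d/sigma)^2) and unit mass per point.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    contrib = np.exp(-(dist / sigma) ** 2)
    contrib[dist > 3.0 * sigma] = 0.0  # truncation rule described in the abstract
    return contrib.sum(axis=1)         # includes each point's own contribution of 1

def potential_entropy(points, sigma):
    """Shannon entropy of the normalized potentials (assumed definition)."""
    phi = potentials(points, sigma)
    p = phi / phi.sum()
    return float(-(p * np.log(p)).sum())

def golden_section_min(f, lo, hi, tol=1e-4):
    """Golden-section search for a minimum of f on [lo, hi].

    Only guaranteed to find the global minimum if f is unimodal on the interval.
    """
    inv_phi = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
    # The search interval [0.05, 2.0] is an arbitrary choice for this toy data.
    best_sigma = golden_section_min(lambda s: potential_entropy(data, s), 0.05, 2.0)
    print("sigma with (locally) minimal potential entropy:", best_sigma)
```

The truncation makes distant points contribute nothing, which is what allows isolated points to be flagged as noise by their low potential. The entropy curve is not necessarily unimodal in sigma, so the search interval matters in practice.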
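The cloud-model feature extraction in item 6 relies on the reverse (backward) cloud generator, which summarizes a sample by three numerical characteristics: expectation Ex, entropy En, and hyper-entropy He. A minimal sketch follows, assuming the common variant of the generator that does not use certainty degrees, and assuming that each high-dimensional object is reduced to its own (Ex, En, He) triple; the abstract does not specify either detail, and the function names are hypothetical.

```python
import numpy as np

def backward_cloud(samples):
    """Estimate (Ex, En, He) from a 1-D sample without certainty degrees."""
    x = np.asarray(samples, dtype=float)
    ex = x.mean()                                       # expectation
    en = np.sqrt(np.pi / 2.0) * np.abs(x - ex).mean()   # entropy
    s2 = x.var(ddof=1)                                  # sample variance
    # abs() guards against a negative difference on small samples;
    # this guard is an assumption of the sketch.
    he = np.sqrt(abs(s2 - en ** 2))                     # hyper-entropy
    return ex, en, he

def cloud_features(data):
    """Map an (n_objects, n_dims) array to an (n_objects, 3) feature array."""
    rows = np.asarray(data, dtype=float)
    return np.array([backward_cloud(row) for row in rows])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    high_dim = rng.normal(size=(5, 200))   # 5 objects, 200 dimensions each
    print(cloud_features(high_dim).shape)  # three features per object
```

Reducing each object to three values makes distance computations in the subsequent clustering step much cheaper, at the cost of discarding the ordering of the original dimensions.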
Keywords/Search Tags:Cloud model, data field, clustering algorithm