
Research On Multi-view K-means Clustering Algorithm On Large Data

Posted on: 2018-12-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Guo
Full Text: PDF
GTID: 2348330515969913
Subject: Software engineering
Abstract/Summary:
In the past decades, more and more data have been collected from multiple sources or represented by multiple views, where different views describe distinct perspectives of the data. Traditional clustering algorithms have shown their weakness in dealing with such clustering problems, and multi-view clustering algorithms were proposed in response. The existing multi-view clustering algorithms can be divided into three types: co-training clustering algorithms, multiple-kernel-based clustering algorithms, and subspace-based multi-view clustering algorithms. However, with the explosive growth of data, more and more large-scale data with multiple views are emerging that need to be mined and processed. There are four existing approaches to large-scale data: sampling-based methods, methods based on clustering feature selection, semi-supervised clustering algorithms based on constraint information, and clustering algorithms based on distributed platforms. All of these methods are designed for single-view data and cannot be applied directly to large-scale data with multiple views. To overcome this difficulty, this thesis carries out the related research. Its main content and contributions are as follows:
(1) It summarizes the existing multi-view clustering algorithms and the single-view clustering algorithms for large-scale data, and states their principles and ranges of application, especially the shortcomings of the existing multi-view clustering algorithms in dealing with large data.
(2) To address the problem above, it proposes a new clustering algorithm, LKMC (multi-view K-means clustering algorithm on large data). This algorithm uses the sparsity-inducing l1,2 norm in the objective function that is optimized. The data are partitioned chunk by chunk, the multi-view clustering algorithm is run on each chunk to identify its central points, and the multi-view clustering algorithm is then run on all the central points to obtain the final result. The algorithm is not sensitive to different initializations and is able to deal with large multi-view data.
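The chunk-then-merge procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the LKMC algorithm itself: the abstract does not give the l1,2-norm objective or its update rules, so the sketch substitutes plain Lloyd's k-means on concatenated views as a stand-in. The function names `kmeans` and `chunked_multiview_kmeans`, the `chunk_size` parameter, and the fusion of views by concatenation are all assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (a stand-in, not the thesis's objective)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers with k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d.argmin(axis=1)


def chunked_multiview_kmeans(views, k, chunk_size, seed=0):
    """Cluster each chunk, then cluster the collected chunk centers."""
    # Assumption: views are fused by simple feature concatenation.
    X = np.hstack([np.asarray(v, dtype=float) for v in views])
    chunk_centers = []
    for start in range(0, len(X), chunk_size):
        chunk = X[start:start + chunk_size]
        k_chunk = min(k, len(chunk))  # a short last chunk may hold < k points
        centers, _ = kmeans(chunk, k_chunk, seed=seed)
        chunk_centers.append(centers)
    # Second stage: run the same clustering on all chunk centers.
    all_centers = np.vstack(chunk_centers)
    final_centers, _ = kmeans(all_centers, k, seed=seed)
    # Assign every original point to its nearest final center.
    d = ((X[:, None, :] - final_centers[None, :, :]) ** 2).sum(axis=2)
    return final_centers, d.argmin(axis=1)
```

Only one chunk is clustered at a time in the first stage, and the second stage sees at most `k` points per chunk. Note that this sketch still stacks the full data in memory for simplicity; a version for genuinely large data would read the chunks from disk instead.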
Keywords/Search Tags: large multi-view data, multi-view clustering, k-means, l1,2 norm, chunk