
Multi-Manifold Learning Algorithm For Multi-Source Data Aggregation

Posted on: 2019-06-08, Degree: Master, Type: Thesis
Country: China, Candidate: P Zou, Full Text: PDF
GTID: 2428330545951218, Subject: Software engineering
Abstract/Summary:
With the rapid development of information technology and the Internet, people can obtain data from multiple sources, that is, multi-source data. Because multi-source data are of diverse types and different scales, how to aggregate them and extract effective information is a hot topic in machine learning and pattern recognition. Multi-manifold learning can effectively reveal the internal structure of complex data. This thesis therefore applies the multi-manifold model to study the boundary detection, robustness, and adaptivity problems in multi-source data aggregation. The main work of this thesis covers the following three aspects:

First, this thesis proposes a multi-manifold learning algorithm based on boundary detection (MBD). The algorithm distinguishes multiple manifolds by detecting their boundary points. The overall geometric structure of the data is then preserved by using the boundary points and the farthest points between the manifolds. Experiments on synthetic and real datasets show that the MBD algorithm achieves better recognition results on data with well-separated manifolds.

Second, this thesis proposes a multi-source robust spectral multi-manifold clustering algorithm (MRSMMC). The algorithm first purifies the original multi-source data through a noise-reduction projection matrix. Then, the mixed principal component analysis model is used to divide the interconnected multi-manifold into several disjoint blocks. Next, the similarity matrix of each source is constructed from the local tangent space of each data point's neighborhood. Finally, a comprehensive similarity matrix is obtained by merging the similarity matrices of all sources, which is used to recognize the multi-source data. Experiments on single-source and multi-source datasets show that the algorithm achieves better recognition and robustness than other algorithms.

Third, this thesis proposes a nonnegative and adaptive multi-source clustering algorithm (NAMC). Traditional multi-source data learning algorithms learn the weight of each source by introducing a hyperparameter. NAMC treats each source as a manifold and updates the nearest-neighbor matrix and the weight of each source adaptively, so that the data manifold structure is captured more accurately. Finally, the adjacency matrices of all sources are merged into a consistent adaptive similarity matrix. Comprehensive experiments on several real-world datasets show the effectiveness of the proposed approach and demonstrate its advantage over other state-of-the-art methods.
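The abstract does not give NAMC's update rules, so the following is only an illustrative sketch of the general idea of merging per-source similarity matrices with weights that need no user-set hyperparameter. Everything in it is an assumption made for illustration: the function names (`knn_similarity`, `fuse`, `adaptive_fusion`), the Gaussian k-nearest-neighbor similarity, and the inverse-distance reweighting heuristic are not taken from the thesis, and the nearest-neighbor matrices are kept fixed here rather than updated adaptively.

```python
import math

def knn_similarity(points, k=2, sigma=1.0):
    """Symmetric Gaussian k-nearest-neighbor similarity matrix for one source (assumed form)."""
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Squared Euclidean distances from point i to every other point, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n) if j != i
        )
        for d2, j in dists[:k]:
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            # Symmetrize: keep an edge if either endpoint selects the other.
            sim[i][j] = max(sim[i][j], w)
            sim[j][i] = max(sim[j][i], w)
    return sim

def fuse(sims, weights):
    """Weighted sum of the per-source similarity matrices."""
    n = len(sims[0])
    return [[sum(w * s[i][j] for w, s in zip(weights, sims)) for j in range(n)]
            for i in range(n)]

def frobenius_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))

def adaptive_fusion(sims, n_iter=5, eps=1e-12):
    """Alternate between fusing the sources and reweighting them.

    Heuristic (an assumption, not the thesis's rule): a source whose matrix
    is far from the current consensus receives a smaller weight.
    """
    m = len(sims)
    weights = [1.0 / m] * m
    consensus = fuse(sims, weights)
    for _ in range(n_iter):
        raw = [1.0 / (2.0 * frobenius_distance(s, consensus) + eps) for s in sims]
        total = sum(raw)
        weights = [r / total for r in raw]  # nonnegative, sums to 1
        consensus = fuse(sims, weights)
    return consensus, weights

# Toy data: six samples seen by three sources. Sources A and B agree on the
# grouping {0, 1, 2} / {3, 4, 5}; source C is deliberately inconsistent.
source_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
source_b = [(0.0,), (0.2,), (0.1,), (3.0,), (3.2,), (3.1,)]
source_c = [(0.0,), (3.0,), (0.1,), (3.1,), (0.2,), (3.2,)]

sims = [knn_similarity(src) for src in (source_a, source_b, source_c)]
consensus, weights = adaptive_fusion(sims)
print("source weights:", [round(w, 3) for w in weights])
```

In this toy setup the inconsistent source ends up with the smallest weight, so the consensus matrix mostly reflects the two sources that agree. A simple inverse-distance rule like this one can end up favoring a single source if iterated for long, which is one reason a real method would need a more careful objective; the consensus matrix could then be passed to any spectral clustering routine.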
Keywords/Search Tags:multi-source data aggregation, multi-manifold learning, boundary detection, robustness, adaptivity