Cloud-based Platform For The Large-scale Manifold Learning Algorithm Research

Posted on: 2013-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Bian
GTID: 2218330371459827
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
The rapid development of internet technology has brought us a great deal of information, but it has also made it harder to find the required knowledge in the data. To address this problem, data mining technology has developed rapidly and is now widely used in finance, health care, the internet, politics, economics, social areas, and other fields. As data sizes grow exponentially, finding useful information in large-scale data has become a central problem. In this thesis, we try to solve this problem by using cloud computing platforms such as those of Google and Hadoop.

Google's MapReduce programming model is designed for processing massive data. Programmers only need to focus on the parallel computing tasks and do not need to handle data segmentation, task allocation, fault tolerance, and other details. This greatly improves programming efficiency.

In this thesis we propose using L2-norm locality sensitive hashing to find the approximate neighbors of the query data, and we build two algorithms on it: one based on locally linear embedding (LLE) for feature extraction, and the other based on Laplacian eigenmaps (LE) for feature extraction. In both algorithms, the approximate neighbors computed by the L2-norm locality sensitive hashing are used to compute the matrix W, and the Lanczos method is used to compute the eigenvectors of the Laplacian matrix. We improve the performance of the algorithms with different parallel strategies. Through experiments, we analyze the performance of the algorithms on the MapReduce framework and their effects.
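The L2-norm locality sensitive hashing step mentioned in the abstract can be illustrated with a minimal single-machine sketch. It assumes the standard p-stable scheme, h(x) = floor((a · x + b) / w) with Gaussian a and uniform b, which the abstract does not spell out; the class name `L2LSH`, the helper `approx_neighbors`, and the parameters `num_hashes` and `bucket_width` are hypothetical names chosen for illustration, not the thesis's implementation.

```python
import math
import random
from collections import defaultdict


class L2LSH:
    """One hash table of p-stable (Gaussian) LSH for Euclidean distance."""

    def __init__(self, dim, num_hashes=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # Each hash uses a Gaussian projection vector a and an offset b in [0, w).
        self.a = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(num_hashes)]
        self.b = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]
        self.buckets = defaultdict(list)

    def key(self, x):
        # h_i(x) = floor((a_i . x + b_i) / w); the key concatenates all h_i.
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.w)
            for a, b in zip(self.a, self.b)
        )

    def insert(self, idx, x):
        # Store the point's index under its bucket key.
        self.buckets[self.key(x)].append(idx)

    def candidates(self, x):
        # Indices of all stored points that share the query's bucket.
        return list(self.buckets.get(self.key(x), []))


def approx_neighbors(points, query, k, index):
    # Rank only the points sharing the query's bucket by exact L2 distance.
    cands = index.candidates(query)
    cands.sort(key=lambda i: math.dist(points[i], query))
    return cands[:k]
```

In a MapReduce setting, one plausible arrangement (again an assumption, not taken from the thesis) is for the map step to emit `(key(x), idx)` pairs and for each reduce step to rank the points within one bucket, so that the neighbor lists feeding the matrix W are built without comparing all pairs of points.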
Keywords/Search Tags: cloud computing, data mining, manifold learning, Laplacian eigenmap, locally linear embedding, MapReduce, locality sensitive hashing