
The Research Of Locally Linear Embedding Algorithm Based On Hadoop Platform

Posted on: 2012-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2218330335495422
Subject: Computer application technology
Abstract/Summary:
In knowledge discovery, much information is stored in multi-dimensional matrices. These matrices contain not only useful data but also a great deal of redundant information. This redundancy adds to the burden on computing platforms, slows computation, and obscures the basic structure of the extracted data; it also lowers classification accuracy and slows classification. Dimension reduction techniques emerged to address this problem. They can reveal the true structure of high-dimensional data and greatly reduce the load on computing platforms and algorithms, as well as the cost of classification, thereby improving efficiency. The Locally Linear Embedding (LLE) algorithm performs well on many data sets, but it is computationally expensive: it must compute the distance from each sample to every other sample, construct the weights that reconstruct each sample from its neighbors, and then solve for eigenvalues and eigenvectors to obtain the low-dimensional mapping. The Hadoop platform is well suited to parallel processing and distributed computing on large-scale data. It can address the "curse of dimensionality" problem, in which a data set is too large to be processed on a single machine or causes that machine to crash. As data mining develops, it is increasingly applied in biotechnology, electronic commerce, psychology, geology, and heavy industry.
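The three LLE stages named above (sample-to-sample distances, reconstruction weights, and the eigenvalue problem) can be illustrated with a minimal single-machine sketch in Python with NumPy. It is not the thesis's implementation: the function name `lle`, the parameters `n_neighbors`, `n_components`, and `reg`, and the trace-scaled regularization are assumptions made for illustration, and the dense n-by-n matrices are only practical for small data sets.

```python
import numpy as np

def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Illustrative LLE sketch. X has shape (n_samples, n_features)."""
    n = X.shape[0]

    # Stage 1: squared distance from each sample to every other sample,
    # then the indices of the k nearest neighbours of each sample.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Stage 2: weights that best reconstruct each sample from its
    # neighbours, constrained to sum to one.
    W = np.zeros((n, n))
    ones = np.ones(n_neighbors)
    for i in range(n):
        Z = X[neighbors[i]] - X[i]   # neighbours centred on sample i
        C = Z @ Z.T                  # local Gram matrix, shape (k, k)
        # Assumed regularization so that C is safely invertible.
        C += np.eye(n_neighbors) * (reg * np.trace(C) + 1e-12)
        w = np.linalg.solve(C, ones)
        W[i, neighbors[i]] = w / w.sum()

    # Stage 3: eigenvectors of M = (I - W)^T (I - W) with the smallest
    # eigenvalues; the first one (eigenvalue near zero) is skipped.
    I_W = np.eye(n) - W
    M = I_W.T @ I_W
    _, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
    Y = eigvecs[:, 1:n_components + 1]
    return Y, W
```

Stage 1 touches every pair of samples and stage 3 works on an n-by-n matrix, which is why the cost grows quickly with the number of samples and why a distributed platform is attractive for this algorithm.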
Massively parallel processing, distributed data storage, and cloud platform technology have matured into key technologies for overcoming the "curse of dimensionality" and are widely used in economic and social development. Hadoop is an easy-to-use, open-source implementation of a distributed architecture for large-scale batch processing, and it is widely used to solve large computational problems. This thesis presents an efficient implementation of the LLE manifold learning algorithm on the Hadoop platform. The three steps of LLE are redesigned and improved as parallel algorithms and then implemented on Hadoop with the MapReduce programming model, so that LLE runs effectively on Hadoop, accommodates massive data and a variety of practical applications, and supports effective data mining and analysis. Finally, tests verify that the Hadoop platform does reduce the computation time for high-dimensional data.
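To make the MapReduce programming model concrete, the sketch below simulates, in a single Python process, how the neighbor-search step of LLE could be split into a map phase and a reduce phase. The abstract does not describe the thesis's actual job design, so this decomposition is only an assumption; the names `map_pair`, `reduce_neighbors`, and `run_knn_job` are hypothetical, and no Hadoop API is used.

```python
from collections import defaultdict
from itertools import combinations
import heapq

def map_pair(pair):
    """Map: for one pair of samples, emit the squared distance
    under the id of each sample in the pair."""
    (i, xi), (j, xj) = pair
    d2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    yield i, (d2, j)
    yield j, (d2, i)

def reduce_neighbors(key, values, k):
    """Reduce: keep the ids of the k closest samples to sample `key`."""
    return key, [j for _, j in heapq.nsmallest(k, values)]

def run_knn_job(samples, k):
    """Single-process stand-in for map, shuffle, and reduce."""
    groups = defaultdict(list)  # shuffle: group map output by key
    for pair in combinations(enumerate(samples), 2):
        for key, value in map_pair(pair):
            groups[key].append(value)
    return dict(reduce_neighbors(key, vals, k) for key, vals in groups.items())
```

For example, `run_knn_job([(0.0,), (1.0,), (3.0,), (7.0,)], k=2)` returns the two nearest neighbors of each of the four one-dimensional samples. On a real cluster the map calls would run in parallel on different nodes, and the framework would perform the grouping that the `groups` dictionary stands in for here.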
Keywords/Search Tags:Local linear embedding, Dimension reduction, Hadoop platform, Parallel