Parallelization Research On Families Of Gradient Descent And Expectation Maximization Iterative Algorithms

Posted on: 2013-04-17  Degree: Master  Type: Thesis
Country: China  Candidate: A B Luo  Full Text: PDF
GTID: 2268330431962043  Subject: Computer technology
Abstract/Summary:
Data explosion has become an important problem in machine learning and data mining. General data mining algorithms may no longer apply to massive data sets, so parallelized algorithms need to be designed. Iterative algorithms pose additional challenges because they are both compute-intensive and data-intensive. It is therefore important to study effective and reasonable parallel algorithms.

MapReduce is one of the popular parallel technologies. Combined with the Hadoop parallel computing platform, it enables researchers to focus more of their effort on designing the parallel algorithms themselves.

This thesis studies the parallelization of iterative algorithms for massive data mining, employing Hadoop and MapReduce. The main contributions of our work can be summarized as follows:

First, existing parallel data mining algorithms are surveyed and analyzed, and evaluation criteria for parallelized algorithms are presented.

Second, a gradient-based parallel algorithm named P-Pegasos is proposed, which is a parallel solver for SVMs. The idea originates from parallelized stochastic gradient descent. The experiments show that it is fast and effective, and that it can achieve the accuracy of traditional non-parallel methods. They also illustrate that parallelizing gradient-family algorithms in this way can achieve good parallel performance.

Third, a parallelized K-means algorithm called MR-P-Kmeans is presented; K-means is a variant of Expectation Maximization. The algorithm can handle clustering problems on massive data. The experiments show good performance in running time and scalability. We also illustrate how EM-family algorithms can be parallelized indirectly.
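
The abstract does not give the details of P-Pegasos or MR-P-Kmeans, so the following is only a minimal sketch of the two general ideas it names, under stated assumptions. For the gradient family, it assumes the common parallelized-SGD scheme in which each data shard runs Pegasos independently (the map step) and the resulting weight vectors are averaged (the reduce step). For the EM family, it assumes the standard MapReduce form of one K-means iteration: the map step assigns each point to its nearest centroid, and the reduce step averages the points of each cluster. All function names and parameters are hypothetical, and plain Python loops stand in for Hadoop map and reduce tasks.

```python
import random


def pegasos_shard(shard, lam, steps, seed):
    """Map step (assumed): run Pegasos on one shard of (features, label) pairs."""
    rng = random.Random(seed)
    w = [0.0] * len(shard[0][0])
    for t in range(1, steps + 1):
        x, y = shard[rng.randrange(len(shard))]  # sample one example
        eta = 1.0 / (lam * t)                    # Pegasos step size
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        # Gradient step on the regularizer shrinks w by (1 - eta * lam).
        w = [(1.0 - eta * lam) * wj for wj in w]
        if margin < 1.0:
            # Hinge loss is active: move toward the violating example.
            w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def average_weights(weight_vectors):
    """Reduce step (assumed): average the per-shard weight vectors."""
    n = len(weight_vectors)
    return [sum(column) / n for column in zip(*weight_vectors)]


def parallel_pegasos(data, num_shards, lam=0.01, steps=2000):
    """Split the data into shards, train each one, then average."""
    shards = [data[k::num_shards] for k in range(num_shards)]
    local = [pegasos_shard(s, lam, steps, seed=k) for k, s in enumerate(shards)]
    return average_weights(local)


def kmeans_map(point, centroids):
    """Map step (E-step): emit (index of nearest centroid, (point, 1))."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists)), (point, 1)


def kmeans_reduce(values):
    """Reduce step (M-step): mean of all points emitted for one centroid."""
    total = [0.0] * len(values[0][0])
    count = 0
    for point, n in values:
        total = [t + p for t, p in zip(total, point)]
        count += n
    return [t / count for t in total]


def kmeans_iteration(points, centroids):
    """One K-means iteration expressed as map, group by key, reduce."""
    groups = {}
    for point in points:
        key, value = kmeans_map(point, centroids)
        groups.setdefault(key, []).append(value)
    # A centroid that received no points is left unchanged.
    return [
        kmeans_reduce(groups[k]) if k in groups else list(centroids[k])
        for k in range(len(centroids))
    ]
```

In a real Hadoop job each call to `pegasos_shard` or `kmeans_map` would run as a separate map task, and an iterative driver would resubmit the K-means job with the updated centroids until they stop changing.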
Keywords/Search Tags:Massive Data, MapReduce, Iterative Algorithms, Parallel