
Efficient Data Representation Combining With ELM And NMF

Posted on: 2015-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Ceng
Full Text: PDF
GTID: 2268330428464521
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, large volumes of high-dimensional data are constantly being produced. High-dimensional data significantly increase the cost of computation and storage and pose challenges, such as the curse of dimensionality, for machine learning and pattern recognition. Dimension reduction can effectively alleviate the curse of dimensionality and has become a hot topic in image retrieval, pattern recognition, machine learning, and related fields. Non-negative Matrix Factorization (NMF) is a powerful dimension reduction method that is widely used in machine learning tasks such as classification and clustering.

On high-dimensional data, unconstrained NMF requires a large amount of computation and runs slowly. To address this, NMF combined with Extreme Learning Machine (ELM) feature mapping (EFM NMF), proposed by Qing He, can effectively reduce the computational cost of NMF. However, ELM feature mapping is a nonlinear feature mapping technique that uses random parameters, and without sufficient constraints it affects the subspace generated by NMF.

To address the reduced generalization performance of data representations obtained with EFM NMF, this thesis proposes an improved method named EFM GNMF. By combining ELM feature mapping with Graph Regularized Nonnegative Matrix Factorization (GNMF), EFM GNMF effectively reduces the computation time of NMF without weakening the data representation ability of the subspace that NMF generates.

In the current big data environment, Hadoop, an open source project, is the most popular cloud computing platform; it is built on two core technologies, HDFS and MapReduce. Because huge amounts of data cannot be stored and analyzed on a single node, this thesis first analyzes the Hadoop distributed platform and then analyzes and implements a parallel EFM GNMF algorithm under the MapReduce programming framework. Two aspects of the parallel EFM GNMF are examined: (1) matrix multiplication under the MapReduce framework, for which several matrix multiplication methods are introduced; and (2) computation of the K-nearest-neighbor graph used by GNMF under the MapReduce framework. Computing the general K-nearest-neighbor matrix under MapReduce is time-consuming, so this thesis introduces a method that constructs an approximate K-nearest-neighbor graph to save time.

The effectiveness of EFM GNMF is evaluated on the COIL20 image library, the CMU PIE face database, and the TDT2 text database in the MATLAB environment. Its efficiency is also tested in the Hadoop environment.
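
The pipeline described in the abstract can be illustrated with a minimal single-machine sketch in Python with NumPy. It is not the thesis's implementation: the function names (`elm_feature_map`, `knn_graph`, `gnmf`), the sigmoid activation, the binary edge weights, the exact (not approximate) neighbor search, and all parameter values are assumptions made for illustration. The GNMF step uses the standard multiplicative update rules for the Frobenius-norm objective with a graph Laplacian penalty, and nothing here is distributed over MapReduce.

```python
import numpy as np

def elm_feature_map(X, n_hidden, rng):
    """Random ELM feature mapping: (n_samples, n_features) -> (n_samples, n_hidden).

    Assumption: sigmoid activation, so every output lies in (0, 1) and the
    mapped data are nonnegative, as NMF requires.
    """
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random biases
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def knn_graph(H, k):
    """Exact symmetric binary k-nearest-neighbor affinity matrix (O(n^2) memory)."""
    sq = np.sum(H ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (H @ H.T)  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]               # k closest points per row
    n = H.shape[0]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                         # symmetrize

def gnmf(H, A, n_components, lam=1.0, n_iter=200, eps=1e-10, rng=None):
    """Graph regularized NMF: H.T ~= U @ V.T with penalty lam * tr(V.T (D - A) V).

    Returns U (n_hidden, n_components) and V (n_samples, n_components).
    """
    rng = np.random.default_rng() if rng is None else rng
    X = H.T                                   # columns are samples
    m, n = X.shape
    U = rng.random((m, n_components)) + eps
    V = rng.random((n, n_components)) + eps
    D = np.diag(A.sum(axis=1))                # degree matrix of the graph
    for _ in range(n_iter):
        # Multiplicative updates keep U and V nonnegative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (A @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((100, 500))                # toy data: 100 samples, 500 features
    H = elm_feature_map(X, n_hidden=40, rng=rng)
    A = knn_graph(H, k=5)
    U, V = gnmf(H, A, n_components=10, rng=rng)
    print(U.shape, V.shape)
```

In this sketch the ELM mapping shrinks each sample from 500 to 40 dimensions before factorization, which is where the computational saving comes from, while the graph term encourages samples that are neighbors in the mapped space to receive similar low-dimensional representations in `V`.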
Keywords/Search Tags: Non-negative Matrix Factorization (NMF), Extreme Learning Machine (ELM), Feature Mapping, Graph Regularized Nonnegative Matrix Factorization (GNMF), Dimension Reduction, Hadoop Distributed, MapReduce Programming Framework, Parallelization