
Distributed training of very large neural networks

Posted on: 2014-04-18
Degree: M.S
Type: Thesis
University: University of California, Irvine
Candidate: Patel, Vishal R
GTID: 2458390005499731
Subject: Computer Science
Abstract/Summary:
We describe an implementation of deep learning algorithms on shared-nothing machine clusters using a distributed file system and the think-like-a-vertex (Pregel) programming model. While very large neural networks have been successfully distributed across many machines in a cluster, those implementations have been limited in deployability and replicability by proprietary technologies and cluster configurations that are not generally available. Our software, PArallel Neural Distributed Architecture (PANDA), is the first open source implementation that allows training of neural networks with millions or billions of parameters on commodity machine clusters. At its core, forward and backward propagation through the neural network are implemented in a parallel and distributed fashion using neuron-centric views and message-passing algorithms. This flexible and scalable approach allows both the data and the model to be distributed across different machines in a cluster during training and prediction, without requiring a centralized parameter server. The implementation uses the Hadoop Distributed File System and Pregelix, an open source implementation of Pregel.
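To make the neuron-centric formulation concrete, the following is a minimal sketch, not code from PANDA or the Pregelix API; all class and function names here are illustrative. It shows how forward propagation can be expressed in a think-like-a-vertex model: each neuron is a vertex whose incoming weighted activations arrive as messages, and each superstep it sums them, applies an activation function, and sends its own activation along its outgoing edges.

import math

class NeuronVertex:
    """One neuron modeled as a Pregel-style vertex: its state is a bias
    and outgoing edge weights; computation is driven by incoming messages."""
    def __init__(self, vid, bias, out_edges):
        self.vid = vid                  # hypothetical vertex id, e.g. (layer, index)
        self.bias = bias
        self.out_edges = out_edges      # {destination vertex id: weight}
        self.activation = 0.0

    def compute(self, messages, send):
        # Sum the weighted activations delivered as messages from the
        # previous layer, apply a sigmoid, and forward the result.
        z = self.bias + sum(messages)
        self.activation = 1.0 / (1.0 + math.exp(-z))
        for dest, weight in self.out_edges.items():
            send(dest, weight * self.activation)

def superstep(vertices, inbox):
    """Run one synchronous superstep: every vertex with pending messages
    computes, and the messages it sends form the next superstep's inbox."""
    outbox = {}
    def send(dest, value):
        outbox.setdefault(dest, []).append(value)
    for vid, messages in inbox.items():
        vertices[vid].compute(messages, send)
    return outbox

In this scheme the input layer seeds the first inbox with the raw features, and one synchronous superstep advances the signal by one layer, so a forward pass through an L-layer network takes L supersteps; back-propagation can be phrased the same way, with gradient-carrying messages flowing along reversed edges, which is what lets both the data and the model be partitioned across machines without a central parameter server.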
Keywords/Search Tags:Distributed, Implementation, Neural, Training