
Efficient Neural Architecture Search By Network Transformation

Posted on: 2020-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Cai
Full Text: PDF
GTID: 2428330620459990
Subject: Computer Science and Technology
Abstract/Summary:
Designing effective neural network architectures is crucial for the performance of deep learning. Techniques for automatically designing deep neural network architectures, such as reinforcement learning based approaches, have recently shown promising results. However, their computational demand is prohibitive (e.g., on the order of 10^4 GPU hours), which makes them difficult to use widely. A notable limitation is that they still design and train each network from scratch while exploring the architecture space, which is highly inefficient.

In this thesis, we propose a new framework for efficient architecture search that explores the architecture space starting from the current network and reusing its weights. We employ a reinforcement learning agent as the meta-controller, whose actions grow the network's depth or its layers' width through function-preserving transformations. Previously validated networks can thus be reused for further exploration, saving a large amount of computation. Moreover, to further improve performance, we propose path-level transformation operations that address a limitation of existing network transformations: they can only perform layer-level modifications, such as adding (pruning) filters or inserting (removing) a layer, and therefore cannot change the topology of the connection paths. Building on these path-level operations, we further explore a tree-structured architecture space, a generalization of current multi-branch architectures that can embed a rich set of paths within each CNN cell, using a bidirectional tree-structured RL meta-controller.

We apply our method to designing neural network architectures for image classification under restricted computational resources. The resulting networks are highly competitive with both human-designed and automatically designed architectures. On CIFAR-10, our model without skip connections reaches a 4.23% test error rate, outperforming the vast majority of modern architectures. Furthermore, by combining our method with the best human-designed architectures, we achieve a 2.30% test error rate with 14.3M parameters on CIFAR-10 and 74.6% top-1 accuracy on ImageNet in the mobile setting, demonstrating the effectiveness and transferability of the designed architectures.
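The function-preserving transformations referenced above follow the Net2Net idea: a widened or deepened child network is initialized so that it computes exactly the same function as its parent, so training can continue from the reused weights rather than from scratch. Below is a minimal NumPy sketch of Net2Wider-style widening for a two-layer MLP. It is an illustration of the principle under simplified assumptions (fully connected layers, ReLU activations), not the thesis's implementation; the names net2wider, W1, b1, and W2 are ours.

```python
import numpy as np

def net2wider(W1, b1, W2, new_width, rng):
    """Widen the hidden layer of a 2-layer MLP from h to new_width units
    while preserving the network's function (Net2Net-style widening).

    W1: (in_dim, h) weights into the hidden layer
    b1: (h,)        hidden-layer bias
    W2: (h, out)    weights out of the hidden layer
    """
    h = W1.shape[1]
    assert new_width >= h
    # Mapping g: new unit j is a copy of existing unit g[j].
    g = np.concatenate([np.arange(h), rng.integers(0, h, new_width - h)])
    # How many replicas each old unit ends up with.
    counts = np.bincount(g, minlength=h)
    # Duplicate incoming weights and biases.
    W1_new = W1[:, g]
    b1_new = b1[g]
    # Split outgoing weights among replicas so their summed effect is unchanged.
    W2_new = W2[g, :] / counts[g][:, None]
    return W1_new, b1_new, W2_new

# Sanity check: the widened network computes the same function.
rng = np.random.default_rng(42)
x = rng.normal(size=(5, 8))
W1, b1, W2 = rng.normal(size=(8, 16)), rng.normal(size=16), rng.normal(size=(16, 4))
relu = lambda z: np.maximum(z, 0)
y_old = relu(x @ W1 + b1) @ W2
W1w, b1w, W2w = net2wider(W1, b1, W2, new_width=24, rng=rng)
y_new = relu(x @ W1w + b1w) @ W2w
assert np.allclose(y_old, y_new)
```

Because duplicated hidden units receive identical pre-activations, their ReLU outputs are identical, and dividing the outgoing weights by the replica count keeps the downstream sum unchanged; this is what lets the meta-controller reuse validated weights after each action.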
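Path-level transformations extend the same function-preserving principle to topology: a single layer is replaced by several branches whose merged output initially equals the original layer's output, after which each branch can be modified independently to form multi-branch (and, recursively, tree-structured) cells. The following toy sketch shows the replication-plus-averaging case only; it is our simplified illustration, not the thesis's tree-structured operation set.

```python
import numpy as np

def branch_by_replication(W, num_branches):
    """Replace a linear layer W with num_branches replicas merged by
    averaging. Every replica starts as an exact copy, so the merged
    output equals the original x @ W and the topology change is
    function-preserving; the branches may then diverge during search."""
    return [W.copy() for _ in range(num_branches)]

def merged_forward(x, branches):
    # Average the branch outputs (the merge scheme assumed in this sketch).
    return np.mean([x @ Wb for Wb in branches], axis=0)

rng = np.random.default_rng(0)
x, W = rng.normal(size=(5, 8)), rng.normal(size=(8, 4))
branches = branch_by_replication(W, num_branches=3)
assert np.allclose(x @ W, merged_forward(x, branches))
```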
Keywords/Search Tags: Neural Architecture Search, AutoML, Network Transformation