
Efficient Neural Architecture Search By Network Transformation

Posted on: 2020-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Cai
Full Text: PDF
GTID: 2428330620459990
Subject: Computer Science and Technology
Abstract/Summary:
Designing effective neural network architectures is crucial for the performance of deep learning. Techniques for automatically designing deep neural network architectures, such as reinforcement learning based approaches, have recently shown promising results. However, their computational demand is prohibitive (e.g., on the order of 10^4 GPU hours), which makes them difficult to use widely. A notable limitation is that they still design and train each network from scratch while exploring the architecture space, which is highly inefficient.

In this thesis, we propose a new framework for efficient architecture search that explores the architecture space starting from the current network and reusing its weights. We employ a reinforcement learning agent as the meta-controller, whose actions grow the network's depth or its layers' width through function-preserving transformations. Previously validated networks can thus be reused for further exploration, saving a large amount of computation. Moreover, to further improve performance, we propose path-level transformation operations that address a limitation of existing network transformations: they can only perform layer-level modifications, such as adding (pruning) filters or inserting (removing) a layer, and therefore cannot change the topology of the connection paths. Building on these path-level operations, we further explore a tree-structured architecture space, a generalization of current multi-branch architectures that can embed a rich set of paths within each CNN cell, using a bidirectional tree-structured RL meta-controller.

We apply our method to designing neural network architectures for image classification under restricted computational resources. The resulting networks are highly competitive with both human-designed and automatically designed architectures. On CIFAR-10, our model without skip connections reaches a 4.23% test error rate, outperforming the vast majority of modern architectures. Furthermore, by combining our method with the best human-designed architectures, we achieve a 2.30% test error rate with 14.3M parameters on CIFAR-10 and 74.6% top-1 accuracy on ImageNet in the mobile setting, demonstrating the effectiveness and transferability of the designed architectures.
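The function-preserving transformations referenced above follow the Net2Net idea: a widened or deepened child network is initialized so that it computes exactly the same function as its parent, so training can continue from the reused weights rather than from scratch. Below is a minimal NumPy sketch of Net2Wider-style widening for a two-layer MLP. It is an illustration of the principle under simplified assumptions (fully connected layers, ReLU activations), not the thesis's implementation; the names net2wider, W1, b1, and W2 are ours.

```python
import numpy as np

def net2wider(W1, b1, W2, new_width, rng):
    """Widen the hidden layer of a 2-layer MLP from h to new_width units
    while preserving the network's function (Net2Net-style widening).

    W1: (in_dim, h) weights into the hidden layer
    b1: (h,)        hidden-layer bias
    W2: (h, out)    weights out of the hidden layer
    """
    h = W1.shape[1]
    assert new_width >= h
    # Mapping g: new unit j is a copy of existing unit g[j].
    g = np.concatenate([np.arange(h), rng.integers(0, h, new_width - h)])
    # How many replicas each old unit ends up with.
    counts = np.bincount(g, minlength=h)
    # Duplicate incoming weights and biases.
    W1_new = W1[:, g]
    b1_new = b1[g]
    # Split outgoing weights among replicas so their summed effect is unchanged.
    W2_new = W2[g, :] / counts[g][:, None]
    return W1_new, b1_new, W2_new

# Sanity check: the widened network computes the same function.
rng = np.random.default_rng(42)
x = rng.normal(size=(5, 8))
W1, b1, W2 = rng.normal(size=(8, 16)), rng.normal(size=16), rng.normal(size=(16, 4))
relu = lambda z: np.maximum(z, 0)
y_old = relu(x @ W1 + b1) @ W2
W1w, b1w, W2w = net2wider(W1, b1, W2, new_width=24, rng=rng)
y_new = relu(x @ W1w + b1w) @ W2w
assert np.allclose(y_old, y_new)
```

Because duplicated hidden units receive identical pre-activations, their ReLU outputs are identical, and dividing the outgoing weights by the replica count keeps the downstream sum unchanged; this is what lets the meta-controller reuse validated weights after each action.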
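Path-level transformations extend the same function-preserving principle to topology: a single layer is replaced by several branches whose merged output initially equals the original layer's output, after which each branch can be modified independently to form multi-branch (and, recursively, tree-structured) cells. The following toy sketch shows the replication-plus-averaging case only; it is our simplified illustration, not the thesis's tree-structured operation set.

```python
import numpy as np

def branch_by_replication(W, num_branches):
    """Replace a linear layer W with num_branches replicas merged by
    averaging. Every replica starts as an exact copy, so the merged
    output equals the original x @ W and the topology change is
    function-preserving; the branches may then diverge during search."""
    return [W.copy() for _ in range(num_branches)]

def merged_forward(x, branches):
    # Average the branch outputs (the merge scheme assumed in this sketch).
    return np.mean([x @ Wb for Wb in branches], axis=0)

rng = np.random.default_rng(0)
x, W = rng.normal(size=(5, 8)), rng.normal(size=(8, 4))
branches = branch_by_replication(W, num_branches=3)
assert np.allclose(x @ W, merged_forward(x, branches))
```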
Keywords/Search Tags: Neural Architecture Search, AutoML, Network Transformation