
Research On Extended Search Algorithms For Differentiable Architecture Search

Posted on: 2024-03-07    Degree: Doctor    Type: Dissertation
Country: China    Candidate: L F Wang    Full Text: PDF
GTID: 1528306944470204    Subject: Information and Communication Engineering
Abstract/Summary:
Deep neural networks are the cornerstone of deep learning. In recent years, neural architecture search (NAS) has become a hot topic because it can obtain neural networks automatically. Earlier two-stage search methods are extremely expensive because sampling and evaluation are carried out separately. A one-shot search instead merges sampling and evaluation into a single phase and does not require training each sampled architecture from scratch, so its search cost is greatly reduced. In particular, DARTS transforms the search into the optimization of a super network and uses gradients to optimize both the weights and the architectural parameters, making the optimization directional. Because of its low cost and high performance, it has received wide attention. However, some high-performance architectures are buried by the limitations of the search space, and differentiable search still suffers from instability and significant quantization error. To this end, this study constructs several enlarged search spaces and proposes corresponding search algorithms that alleviate these adverse factors as much as possible. The achievements and innovations are summarized as follows.

(1) This study proposes a factorized search algorithm that can search effectively in an enlarged space while avoiding competition among operators that share the same convolution. The research first constructs a search space of 40 operators, formed by the Cartesian product of 9 new activation operators and the original 8 regular operators. The search cost of existing differentiable algorithms increases exponentially with the number of operators, and they also face competition among combined operators containing the same convolution. The proposed factorized algorithm therefore splits the search space into a regular subspace and an activation subspace, relaxes the architectural parameters of each subspace into a continuous representation, and updates the model weights and the corresponding architectural parameters sequentially by gradient descent, in the order that reverses the feature transfer. The architectures obtained by the factorized algorithm achieve 97.62% and 76.2% top-1 accuracy on CIFAR and ImageNet, and 35.8% AP and 55.7% AP50 on COCO when the pre-trained model is used as a backbone.
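To make the factorized relaxation concrete, the following is a minimal PyTorch-style sketch. The class name, the way the two subspaces are composed (an activation mixture followed by a regular-operator mixture), and the parameter initialization are illustrative assumptions, not the dissertation's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedMixedOp(nn.Module):
    """Sketch of a factorized relaxation: the activation operators and the
    regular operators keep separate architectural parameters, so each edge
    needs one small softmax per subspace rather than one softmax over every
    combined operator."""

    def __init__(self, activation_ops, regular_ops):
        super().__init__()
        self.activation_ops = nn.ModuleList(activation_ops)  # e.g. the activation subspace
        self.regular_ops = nn.ModuleList(regular_ops)        # e.g. the regular subspace
        # One architectural parameter vector per subspace.
        self.alpha_act = nn.Parameter(1e-3 * torch.randn(len(activation_ops)))
        self.alpha_reg = nn.Parameter(1e-3 * torch.randn(len(regular_ops)))

    def forward(self, x):
        # Each subspace is relaxed independently with its own softmax.
        w_act = F.softmax(self.alpha_act, dim=-1)
        w_reg = F.softmax(self.alpha_reg, dim=-1)
        x = sum(w * op(x) for w, op in zip(w_act, self.activation_ops))
        x = sum(w * op(x) for w, op in zip(w_reg, self.regular_ops))
        return x

# Hypothetical usage with illustrative operator lists:
# edge = FactorizedMixedOp([nn.ReLU(), nn.GELU()],
#                          [nn.Conv2d(16, 16, 3, padding=1), nn.Identity()])
```

Keeping one softmax per subspace means the number of architectural parameters grows additively rather than multiplicatively with the number of operators, which is what keeps the enlarged space searchable without exponential cost.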
(2) This research proposes a regularized multiple-cell search algorithm that balances search freedom and resource allocation. Most DARTS-based methods use an 8-cell supernet stacked from 2 distinct cells (2c.), which forces the network structure to be identical at different depths. Because cells at different depths play different roles, their structures should also be allowed to differ. This research therefore constructs a super network stacked from 5 distinct cells (5c.), allowing the normal cells at different stages and the two reduction cells to differ. Experiments show that (5c.) outperforms (2c.), but it suffers from an extreme resource-allocation problem. For this reason, a Reg distance is proposed to measure the difference between cells. The architectural parameters are updated with a validation loss composed of the cross-entropy loss and the weighted Reg term, and the regularized algorithm adjusts the weighting coefficient adaptively. The regularized multiple-cell search ultimately achieves a reasonable allocation and reaches 97.64% and 75.8% top-1 accuracy on CIFAR and ImageNet.

(3) This study proposes a progressive pruning algorithm that alleviates the large errors caused by architecture transfer and one-time pruning. The network obtained by the DARTS search differs from the network used on the target data, which causes transfer errors. This study therefore searches directly with a 14-cell network and uses the searched network on the target data without modification. In addition, standard DARTS prunes a large number of operators at the end of the search, and many of the pruned operators still have high strength, which causes large quantization errors. To this end, a progressive pruning search is proposed. The algorithm adopts bi-level optimization: one level updates the weights through the training loss, and the other updates the architectural parameters through a weighted combination of the cross-entropy loss and a pruning loss. After each iteration, operators with tiny strength are removed and the weighting coefficient is adjusted, so the complexity of the network decreases monotonically during the search. Finally, a series of networks is obtained, and the models with high complexity achieve remarkable performance.

(4) This research proposes the M2NAS algorithm to optimize both the macrostructure and the microstructure. Existing differentiable works focus almost exclusively on searching microstructures in a micro space and neglect macrostructures, even though the macrostructures of different networks should also differ. This research therefore constructs a macro-micro search space by relaxing the constraints within each cell and allowing reduction cells to appear anywhere. The algorithm initializes a network with maximum complexity and iteratively optimizes the macrostructure and the microstructure. For the macro search, it proposes methods to generate candidate macrostructures and to transmit parameters to the candidate networks; for the micro search, the progressive pruning method is used. The current and candidate networks are then compared, the better macro-micro structure is preserved, and it becomes the new current network. Finally, a set of excellent macro-micro structures forming the Pareto set is obtained, and the algorithm achieves a better balance between complexity and accuracy. The architecture searched on ImageNet achieves 77.4% and 76.3% top-1 accuracy when trained with and without data augmentation, respectively. The results on COCO further demonstrate its generalization performance.
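The progressive pruning step shared by contributions (3) and (4) could be sketched roughly as follows, again in PyTorch. The entropy-style pruning term, the threshold rule, and the function names are illustrative assumptions; the dissertation's exact loss and schedule may differ.

```python
import torch
import torch.nn.functional as F

def arch_loss(logits, targets, alpha, lam):
    """Architecture-level objective: cross-entropy combined with a pruning term
    (here an entropy penalty, an assumption) that pushes operator strengths
    toward a sparse, near-discrete choice. `lam` is the adaptively adjusted
    weighting coefficient."""
    p = F.softmax(alpha, dim=-1)
    prune_term = -(p * torch.log(p + 1e-8)).sum()
    return F.cross_entropy(logits, targets) + lam * prune_term

def prune_step(alpha, keep, threshold):
    """After each iteration, mask out operators whose strength falls below the
    threshold, so the supernet's complexity decreases monotonically."""
    with torch.no_grad():
        masked = alpha.masked_fill(~keep, float("-inf"))
        strength = F.softmax(masked, dim=-1)
        keep = keep & (strength > threshold)
    return keep
```

In this sketch, each search iteration would alternate a weight update on the training loss with an update of `alpha` via `arch_loss` on validation data, followed by `prune_step` to drop the weakest surviving operators, mirroring the monotonically shrinking supernet described above.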
Keywords/Search Tags: Deep Learning, Neural Architecture Search, Computer Vision, Image Classification, Object Detection