
Research On Neural Architecture Search Based On Gradient Optimization

Posted on: 2022-07-28 | Degree: Master | Type: Thesis
Country: China | Candidate: B F Zhang | Full Text: PDF
GTID: 2518306557467914 | Subject: Computer application technology
Abstract/Summary:
The Convolutional Neural Network (CNN) is an important foundation of artificial intelligence, especially for computer vision tasks, and has achieved major breakthroughs in recent years. As the computing power of hardware has grown, many excellent neural network models have been proposed. However, convolutional neural networks have tended toward deeper layers and more complex structures, so designing an excellent network now demands considerable expertise and prior experience, large amounts of model training time, and a complex, tedious hyperparameter tuning process. To lower the research difficulty and application threshold of neural networks, more and more researchers have turned to Neural Architecture Search (NAS). NAS methods have emerged one after another in recent years, but most of them require enormous computing resources, whereas NAS based on gradient optimization offers a simple procedure, low computational cost, and excellent performance. This dissertation analyzes and summarizes the advantages and disadvantages of existing NAS methods, and proposes several improved techniques and strategies built on the differentiable architecture sampling algorithm in the gradient-optimization family. Experiments on popular datasets verify the effectiveness of the improved method, which derives model architectures for image classification that are comparable to hand-designed neural networks and to the search results of most existing NAS methods.

First, this dissertation points out that the Gradient-based search using Differentiable Architecture Sampler (GDAS) algorithm uses a one-hot Gumbel-Softmax to sample a small sub-network at each training step, which leaves the supernet insufficiently pre-trained: most of its operations receive little training and remain close to their random initialization. Premature convergence of the architecture parameters then biases the search toward operations that have received little training. To address this, the dissertation proposes a warm-up-based NAS algorithm that trains only the operation parameters of the sampled sub-networks in the early phase and delays the start of architecture-parameter training, compensating for GDAS's insufficient pre-training and improving the accuracy of the searched model.

Second, although GDAS improves on DARTS, it is still prone to selecting an excessive number of skip-connection operations. This not only lowers the accuracy of the search results, but also causes many meaningless sub-networks containing too many skip connections to be sampled and trained during the search, wasting computing resources. To solve these problems, the dissertation proposes a skip-controller built on the pre-trained supernet that ensures each sampled sub-network contains an appropriate number of skip connections and discards sub-networks unlikely to contain the optimal solution. This improvement to GDAS shrinks the effective search space by avoiding wasted computation on meaningless sub-networks, mitigates the excessive-skip-connection problem, and improves the accuracy and stability of the derived model.
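To make the sampling, warm-up, and skip-controller ideas above concrete, the following is a minimal PyTorch sketch, not the dissertation's implementation. The operation list, edge count, warm-up length, skip threshold, temperature schedule, and learning rate are all illustrative assumptions; only the overall structure (one-hot Gumbel-Softmax sampling, delayed architecture updates, resampling of skip-heavy candidates) follows the description above.

```python
import torch
import torch.nn.functional as F

# Candidate operations on each edge of a DARTS/GDAS-style cell (names are illustrative).
OPS = ["none", "skip_connect", "sep_conv_3x3", "sep_conv_5x5", "max_pool_3x3"]
NUM_EDGES = 14                       # edges per cell in the usual DARTS-style space (assumption)
SKIP_IDX = OPS.index("skip_connect")

# Architecture parameters: one logit per (edge, operation) pair.
alpha = torch.zeros(NUM_EDGES, len(OPS), requires_grad=True)

def sample_subnetwork(tau: float) -> torch.Tensor:
    """GDAS-style sampling: hard (one-hot) Gumbel-Softmax picks exactly one
    operation per edge in the forward pass, while gradients can still flow to alpha."""
    return F.gumbel_softmax(alpha, tau=tau, hard=True, dim=-1)

def num_skip_connections(one_hot: torch.Tensor) -> int:
    return int(one_hot.argmax(dim=-1).eq(SKIP_IDX).sum())

def sample_with_skip_controller(tau: float, max_skip: int = 2, max_tries: int = 20) -> torch.Tensor:
    """Skip-controller idea: resample until the sub-network contains at most
    `max_skip` skip connections, discarding skip-dominated candidates."""
    one_hot = sample_subnetwork(tau)
    for _ in range(max_tries):
        if num_skip_connections(one_hot) <= max_skip:
            break
        one_hot = sample_subnetwork(tau)
    return one_hot

# Warm-up idea: for the first WARMUP_EPOCHS, only the weights of the sampled
# operations are trained; updates to alpha are postponed so the supernet is
# better pre-trained before the architecture distribution starts to converge.
WARMUP_EPOCHS, TOTAL_EPOCHS = 15, 50              # assumed values, not from the thesis
a_optimizer = torch.optim.Adam([alpha], lr=3e-4)  # architecture optimizer (placeholder)

for epoch in range(TOTAL_EPOCHS):
    tau = 10.0 * (0.1 / 10.0) ** (epoch / (TOTAL_EPOCHS - 1))  # annealed temperature
    one_hot = sample_with_skip_controller(tau)
    # ... forward the supernet using only the sampled operations and update their
    # weights on the training split (omitted: depends on the actual supernet) ...
    if epoch >= WARMUP_EPOCHS:
        # Only after warm-up would the validation loss be backpropagated to alpha:
        # a_optimizer.zero_grad(); val_loss.backward(); a_optimizer.step()
        pass
```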
Finally, the cell-based NAS search space is enormous, yet GDAS samples only one sub-network architecture per iteration, so the number of sampled sub-networks is far smaller than the size of the search space and the space is explored insufficiently. Building on the skip-controller's reduction of the search space, this dissertation adjusts the ratio between the training and validation splits to increase the number of sampled sub-networks by 60%. Compared with GDAS, the improved algorithm searches a smaller space while sampling more sub-networks, which greatly raises the exploration ratio of the search space, helps locate the globally optimal sub-network architecture, and reduces the difficulty of the search.

Experiments on three popular classification datasets, CIFAR-10, CIFAR-100 and ImageNet, verify that the proposed algorithms improve the accuracy of the derived models and the stability of the search results, surpassing three previous Gumbel-Softmax-based gradient-optimization methods: Stochastic Neural Architecture Search (SNAS), Facebook-Berkeley-Nets (FBNet), and GDAS.
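As an illustration of the data-split adjustment described above, the sketch below shows how shifting the train/validation split of CIFAR-10 changes the number of architecture-update steps per epoch, under the simplifying assumption that each such step samples one sub-network. The 0.5 baseline reflects common DARTS/GDAS practice; the 0.8 value is only an assumption chosen because it yields roughly 60% more validation batches than the 1:1 baseline, and the dissertation's actual ratio is not stated in this abstract.

```python
from torch.utils.data import DataLoader, Subset
import torchvision
import torchvision.transforms as T

BATCH_SIZE = 64
# CIFAR-10 training set (50,000 images), split into a weight-update portion
# and an architecture-update (validation) portion.
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True,
                                          transform=T.ToTensor())

def make_loaders(val_fraction: float):
    """Split the training set; the validation portion drives architecture updates."""
    n_val = int(len(train_set) * val_fraction)
    n_train = len(train_set) - n_val
    train_part = Subset(train_set, range(n_train))
    val_part = Subset(train_set, range(n_train, len(train_set)))
    return (DataLoader(train_part, batch_size=BATCH_SIZE, shuffle=True),
            DataLoader(val_part, batch_size=BATCH_SIZE, shuffle=True))

for frac in (0.5, 0.8):   # 1:1 baseline vs. an illustrative adjusted split
    _, val_loader = make_loaders(frac)
    print(f"validation fraction {frac}: {len(val_loader)} architecture steps "
          f"(sampled sub-networks) per epoch")
```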
Keywords/Search Tags:Neural Architecture Search, AutoML, Gradient Optimization, Image Classification