Over the past few decades, Deep Neural Networks (DNNs) have shown significant potential in computer vision tasks such as image classification, object detection, and image segmentation. When training a DNN, the backpropagation algorithm computes the gradient of the loss function and minimizes the loss by adjusting the weights of the network. Many gradient-based optimization strategies have been developed to refine backpropagation, among which Stochastic Gradient Descent (SGD) is the most typical. However, when a DNN is optimized with SGD, its loss function often still fails to converge to a satisfactory minimum, which seriously limits the performance of the DNN. Many efforts have therefore been made to develop variants of SGD. To alleviate the endless hyperparameter tuning involved in designing neural networks, researchers proposed Neural Architecture Search (NAS), a promising approach that treats the design of a neural network architecture as an optimization problem. This automatic and efficient way of designing neural network architectures has had a major impact on the development of neural networks. Although significant progress has been made on deep network training and architecture optimization, major problems remain to be solved, especially improving the performance of SGD and making neural network architectures lightweight. Based on the theory of gradient-based and evolutionary computation, this thesis conducts in-depth research on both the weights and the architectures of neural networks and proposes a series of novel methods to address these problems. The main research work and contributions of this thesis are as follows:

(1) In existing studies on optimizing deep neural networks with neuroevolution, the optimization of the loss function only considers the selection of the optimizer by neuroevolution, which makes it difficult for the loss function to converge to a satisfactory minimum. This thesis combines the advantages of gradient-free neuroevolution and SGD and proposes Neuro-Evolution Stochastic Gradient Descent (NE-SGD) to optimize the training process of deep neural networks. The method optimizes the computation pipeline of a deep neural network from the perspective of weight optimization. In addition, to reduce the excessive similarity between solutions in NE-SGD, a suppression method based on hierarchical clustering is proposed to further improve performance; a toy sketch of this alternation is given below.

(2) To address the fact that gradient-based differentiable architecture search considers neither the architecture complexity nor the structural redundancy between the shared block and the target network, a multi-objective differentiable neural architecture search method based on a continuously relaxed search space is proposed. This method builds low-complexity, high-performance convolutional neural network blocks. Meanwhile, an evolutionary optimization method based on non-dominated sorting is introduced to build a low-complexity, high-performance target network from the shared blocks (see the sorting sketch below). The proposed method narrows the gap between the proxy network and the target network and helps designers build complete neural network architectures for deploying neural networks on devices with limited computing resources.
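To make the NE-SGD idea in contribution (1) concrete, the following is a minimal toy sketch, not the thesis implementation: a population of weight vectors alternates between SGD refinement and evolutionary mutation/selection, while hierarchical clustering (here via SciPy's `linkage`/`fcluster`) suppresses near-duplicate solutions. The loss function, population size, mutation scale, and clustering threshold are all illustrative assumptions.

```python
# Hypothetical toy sketch of the NE-SGD idea: SGD steps refine a population of
# weight vectors, then evolutionary mutation/selection explores around them.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

def loss(w):
    # Stand-in for a DNN loss: a simple non-convex function.
    return np.sum(w ** 2) + np.sin(5 * w).sum()

def grad(w):
    return 2 * w + 5 * np.cos(5 * w)

def sgd_step(w, lr=0.01):
    return w - lr * grad(w)

def suppress_similar(pop, threshold=0.5):
    # Hierarchical-clustering suppression: keep the best member of each
    # cluster so the population does not collapse onto near-identical points.
    if len(pop) < 2:
        return pop
    labels = fcluster(linkage(pop, method="average"),
                      t=threshold, criterion="distance")
    keep = {}
    for i, lab in enumerate(labels):
        if lab not in keep or loss(pop[i]) < loss(pop[keep[lab]]):
            keep[lab] = i
    return pop[sorted(keep.values())]

pop = rng.normal(size=(8, 10))                        # 8 candidate weight vectors
for gen in range(50):
    pop = np.array([sgd_step(w) for w in pop])        # gradient refinement
    children = pop + rng.normal(scale=0.1, size=pop.shape)  # mutation
    pool = suppress_similar(np.vstack([pop, children]))
    pool = pool[np.argsort([loss(w) for w in pool])]  # select the fittest
    pop = pool[:8] if len(pool) >= 8 else np.vstack(
        [pool, rng.normal(size=(8 - len(pool), 10))])
print("best loss:", min(loss(w) for w in pop))
```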
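Contribution (2) relies on non-dominated sorting to trade accuracy off against complexity. The sketch below shows standard NSGA-II-style Pareto-front extraction over hypothetical (validation error, FLOPs) scores; how the thesis encodes architectures and measures their complexity is not given here, so the scoring is an assumption.

```python
# Hypothetical sketch of non-dominated sorting over candidate architectures,
# each scored as (validation_error, flops); both objectives are minimized.
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(scores):
    # Returns a list of Pareto fronts (index lists); front 0 holds the
    # architectures that no other candidate dominates.
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(scores[j], scores[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining -= set(front)
    return fronts

scores = [(0.08, 600), (0.10, 300), (0.07, 900), (0.12, 250), (0.09, 650)]
print(non_dominated_sort(scores))  # [[0, 1, 2, 3], [4]]
```

Front 0 contains the best error/complexity trade-offs; later fronts are dominated and would be discarded when selecting shared blocks for the target network.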
(3) Several studies have found that the Vision Transformer (ViT) lacks robustness for object recognition in complex scenes. One possible reason is that the Multi-headed Self-attention (MHSA) in ViT focuses on the relationships between pixels, so it lacks an understanding of multi-dimensional features such as invariance to the translation and scaling of objects. This thesis therefore draws on the gradient-optimization ideas of the first two parts and proposes the Vision Transformer with Convolution Architecture Search (VTCAS) algorithm. The resulting neural network block architecture combines the convolution (Conv) operator with the MHSA operator. Based on this algorithm, a backbone network with multi-scale output and good classification performance is obtained. The proposed proxy network built from the MHSA and Conv operators can perform neural architecture search directly on the ImageNet dataset. At the same time, the algorithm effectively resolves the gradient discontinuity caused by the feature-map dimension conversion between the MHSA operator and the Conv operator; a sketch of this conversion is given below. The network obtained by VTCAS achieves excellent performance in image classification and object detection, and is especially strong at recognizing objects in low-light scenes.

(4) Finally, this thesis explores the object detection task more deeply based on the evolutionary optimization ideas of the previous two parts. It revisits Single Path One-Shot Neural Architecture Search (SPOS) for mobile backbone network optimization and designs an evolutionary optimization method based on non-dominated sorting for this NAS setting. The thesis proposes Mobile Non-dominated Sorting Genetic Algorithm Neural Architecture Search (MNSGA-NAS), which uses a YOLOX-based search framework to search for the backbone network architecture of the object detection task. The method also establishes a family of search spaces built from GhostNet operators according to the characteristics of mobile devices. Finally, a supernet weight-mapping method that combines the properties of pruning and architecture search is proposed (sketched below). The proposed supernet mapping algorithm can optimize the computational performance and computational complexity of the backbone network from three aspects: network depth, network architecture, and network output channels. The model architectures found by the proposed algorithm consume fewer computing resources, so these models can be quickly deployed on mobile devices such as the Jetson.
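To illustrate the feature-map/token conversion mentioned in contribution (3), here is a hedged PyTorch sketch of a search cell that mixes a Conv operator and an MHSA operator under a DARTS-style continuous relaxation. The class name, the softmax-weighted mixing, and the architecture parameter `alpha` are illustrative assumptions, not the VTCAS implementation; the key point is that `flatten`/`transpose`/`reshape` keep the path between the two operator families differentiable.

```python
# Hypothetical sketch of a cell mixing Conv and MHSA on one feature map.
import torch
import torch.nn as nn

class ConvOrAttention(nn.Module):
    def __init__(self, channels, heads=4):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        # Continuous relaxation: a softmax weight over the two operator
        # choices, trained jointly with the network weights.
        self.alpha = nn.Parameter(torch.zeros(2))

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, h, w = x.shape
        conv_out = self.conv(x)
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C): map -> tokens
        attn_out, _ = self.attn(tokens, tokens, tokens)
        attn_out = attn_out.transpose(1, 2).reshape(b, c, h, w)  # tokens -> map
        wgt = torch.softmax(self.alpha, dim=0)
        return wgt[0] * conv_out + wgt[1] * attn_out

x = torch.randn(2, 64, 14, 14)
print(ConvOrAttention(64)(x).shape)  # torch.Size([2, 64, 14, 14])
```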
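Finally, the supernet weight mapping in contribution (4) can be pictured as structured-pruning-style slicing of the supernet's kernels along network depth and output channels. The sketch below is only an assumption about how such a mapping might look, not the thesis algorithm: it keeps the first `depth` layers and the first `out_ch` filters of each kept layer.

```python
# Hypothetical sketch of supernet-to-subnet weight mapping by channel slicing.
import torch
import torch.nn as nn

def map_supernet_weights(super_convs, depth, channel_choices):
    # super_convs: max-width Conv2d layers of the supernet.
    # depth: how many layers the searched architecture keeps.
    # channel_choices: searched output width for each kept layer.
    layers, in_ch = [], super_convs[0].in_channels
    for conv, out_ch in zip(super_convs[:depth], channel_choices):
        small = nn.Conv2d(in_ch, out_ch, conv.kernel_size, padding=conv.padding)
        with torch.no_grad():
            # Inherit the first out_ch filters and first in_ch input planes.
            small.weight.copy_(conv.weight[:out_ch, :in_ch])
            small.bias.copy_(conv.bias[:out_ch])
        layers.append(small)
        in_ch = out_ch
    return nn.Sequential(*layers)

supernet = [nn.Conv2d(3, 64, 3, padding=1)] + \
           [nn.Conv2d(64, 64, 3, padding=1) for _ in range(3)]
subnet = map_supernet_weights(supernet, depth=2, channel_choices=[32, 48])
print(subnet(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 48, 32, 32])
```

Slicing inherited weights this way lets each sampled sub-network be evaluated without retraining from scratch, which is what makes one-shot search over depth, architecture, and output channels affordable.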