As a critical supporting technology in the field of computer vision, convolutional neural networks (CNNs) continue to achieve innovations and breakthroughs. Because the network structure is essential to the performance of CNNs, researchers have constantly proposed new network structures to improve performance, and testing accuracy on many public datasets keeps setting new records and even approaches saturation. In addition, robustness, a key indicator of the reliability and stability of a model, is of great importance in the practical application of CNNs, so the robustness of CNNs has attracted widespread attention in academia and industry. This thesis focuses on optimizing and improving the network structure of CNNs to enhance model generalization and robustness. The main contributions are as follows:

(1) Based on the random graph model, a new network construction pattern is explored by combining the depth-first traversal algorithm with network construction. Based on this pattern, the depth-first neural network (DFNN) is proposed. Specifically, a random graph is generated using the random graph model, and a depth-first traversal is performed on it. The edges and nodes visited during the traversal constitute the backbone of the DFNN, while the remaining edges serve as skip connections that transmit historical features. The nodes of the random graph perform specific convolution operations. The random graph is thus transformed into stages, and the stages connect to each other to form the DFNN. This network structure retains the characteristics of the random graph model while avoiding complicated manual network design. Trainable weights on the skip connections enable the model to adapt to different image classification tasks. In classification tasks on the CIFAR and SVHN datasets, the DFNN achieved higher testing accuracy than several classical CNN models. Furthermore, experimental results on a noisy dataset validated that the DFNN is more robust than networks designed with a hierarchical structure.

(2) The sharpness-aware minimization (SAM) optimizer has achieved significant success in improving model generalization by seeking flat minima. This thesis proposes a new CNN model, SAMNet, which transfers the optimization process of the SAM optimizer from backward propagation into forward propagation. Mathematical proofs and numerical simulation experiments demonstrate that SAMNet is more robust than the residual network (ResNet). Moreover, the performance of SAMNet is evaluated on the CIFAR and SVHN datasets by adding noise to the test set and by conducting adversarial training. The experimental data show that, with comparable numbers of model parameters, SAMNet exhibits better robustness.

(3) The stochastic gradient descent (SGD) algorithm gradually reduces the learning rate during training, enabling the model to converge more smoothly in the later stages. Inspired by the learning rate decay strategy of SGD, this thesis proposes a dynamic step size adjustment strategy and applies it to the optimization of ResNet, based on the explicit Euler interpretation of ResNet. The thesis theoretically demonstrates that small step sizes have a positive impact on the robustness of ResNet. Additionally, the step size in ResNet is optimized with dynamic step size adjustment strategies derived from different learning rate decay schedules. Experimental results show that the dynamic step size adjustment strategy significantly improves the generalization, robustness, and training robustness of ResNet.
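The construction pattern in (1) can be sketched in plain Python: sample a random graph, run a depth-first traversal, and split the edges into a backbone (the DFS tree edges) and skip connections (the remaining edges). The function name, the Erdős–Rényi sampler, and all parameter values here are illustrative assumptions, not the thesis implementation:

```python
import random

def build_dfnn_topology(num_nodes, edge_prob, seed=0):
    """Sketch of the DFNN construction pattern: sample a random graph,
    perform a depth-first traversal, and divide the edges into a
    backbone (DFS tree edges) and skip connections (all other edges,
    which would carry historical features between stages)."""
    rng = random.Random(seed)
    # Sample an undirected Erdos-Renyi random graph G(n, p);
    # each edge is stored as (u, v) with u < v.
    edges = [(u, v) for u in range(num_nodes) for v in range(u + 1, num_nodes)
             if rng.random() < edge_prob]
    adj = {u: [] for u in range(num_nodes)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Depth-first traversal from node 0; tree edges form the backbone.
    visited, backbone = set(), []
    def dfs(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                backbone.append((min(u, v), max(u, v)))
                dfs(v)
    dfs(0)
    # Edges not on the DFS tree become skip connections.
    backbone_set = set(backbone)
    skips = [e for e in edges if e not in backbone_set]
    return backbone, skips

backbone, skips = build_dfnn_topology(8, 0.4)
```

In the actual network, each node would map to a convolution operation and each skip connection would carry a trainable weight; this sketch only covers the graph-to-topology step.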
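For context on (2), the SAM update that SAMNet draws on first ascends to a nearby worst-case point within a small radius and then descends using the gradient evaluated there. A minimal scalar sketch of that two-step update (the toy objective and the `rho`/`lr` values are illustrative; SAMNet's forward-propagation analogue of this process is not reproduced here):

```python
def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM step on a scalar parameter: ascend to the worst-case
    point in a rho-ball around w, then descend with the gradient
    computed at that perturbed point."""
    g = grad_fn(w)
    norm = abs(g) or 1.0          # scalar toy; vectors would use the L2 norm
    w_adv = w + rho * g / norm    # inner ascent to the perturbed weights
    g_adv = grad_fn(w_adv)        # gradient at the perturbed weights
    return w - lr * g_adv         # outer descent step

# Toy usage: minimize f(w) = w**2, whose gradient is 2*w.
w = 1.0
for _ in range(100):
    w = sam_step(w, lambda t: 2 * t)
```

The ascent-then-descent structure biases the iterates toward flat minima, which is the property the thesis transplants into the forward pass of SAMNet.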
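The explicit Euler interpretation in (3) treats each residual block as one Euler step, x_{n+1} = x_n + h_n f_n(x_n), and the dynamic step size strategy shrinks h_n in deeper blocks the way a learning-rate schedule shrinks the learning rate over epochs. A sketch assuming a cosine-style schedule (the schedule choice and the `h_max`/`h_min` bounds are illustrative assumptions, not the thesis configuration):

```python
import math

def cosine_step_size(n, num_blocks, h_max=1.0, h_min=0.1):
    """Hypothetical per-block step size mirroring cosine learning-rate
    decay: h_max at the first block, decaying to h_min at the last."""
    t = n / max(num_blocks - 1, 1)
    return h_min + 0.5 * (h_max - h_min) * (1 + math.cos(math.pi * t))

def euler_resnet_forward(x, blocks):
    """Explicit-Euler view of a ResNet forward pass:
    x_{n+1} = x_n + h_n * f_n(x_n), with h_n decaying with depth."""
    for n, f in enumerate(blocks):
        h = cosine_step_size(n, len(blocks))
        x = x + h * f(x)
    return x
```

With identity-preserving blocks (f(x) = 0) the forward pass leaves the input unchanged, which makes the Euler correspondence easy to check; other decay schedules (step, exponential) drop in by replacing `cosine_step_size`.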