| Deep convolutional neural networks (DCNNs) are prominent deep learning models that achieve high accuracy in many real-time computer vision and image processing tasks. However, advanced DCNN models, such as the ResNet and VGG families, are not memory friendly: they contain many millions of learnable weights and require billions of floating-point operations (FLOPs). These unwieldy models incur a considerable inference cost, especially when embedded in battery-constrained systems, where power usage is high and computational capability may be insufficient. The real-world application of DCNNs is therefore limited by model size, memory footprint, and FLOP count. Most of a DCNN's storage cost comes from its learnable weights, which number in the millions; these weights, together with the model structure, must be stored on disk and loaded into RAM at inference time. Moreover, the intermediate activation layers can consume even more RAM than the model weights during inference, even for a single input image. In addition, convolution on high-definition images is computationally expensive, which hinders practical deployment. Hence, it is important to shrink DCNN models so that power consumption and compute requirements drop substantially while accuracy on real-world problems is preserved.

In this thesis, we introduce several solutions to the above problems. The main contributions and research of the thesis are summarized as follows:

1. We show that the capped L1-norm can be combined with a regular L1-norm, so that the blend of the two norms controls the tradeoff between filter selection and regularization. In nearly all layers, filters with a small L1-norm are selected and set to zero, which assists channel-level pruning in the next step; the regularization parameters hardly affect performance. Pruning insignificant channels can
temporarily degrade performance, but this effect can be compensated by fine-tuning the pruned network. After pruning, the resulting slimmer network is far more compact than the original wide architecture in terms of runtime memory, model size, and computational cost, and the process can be applied iteratively. Our method prunes 56.4% of the parameters with no loss of accuracy, outperforming state-of-the-art approaches such as the entropy-based approach, which prunes 34.4% of the parameters with no accuracy loss. Experiments on multiple datasets show that these modules reduce model size while achieving better accuracy.

2. A modified version of the LASSO penalty, known as the group LASSO penalty in the linear regression setting, is well suited to filter pruning. The group LASSO penalty induces sparsity at the group level: entire sparse groups are selected and set to zero. A further modification, known as the sparse group LASSO, can additionally enforce sparsity within the non-sparse groups. In this thesis, we treat a single feature map as a single group and the weights of that feature map as the group's members. In this way, the optimization can eliminate complete feature maps as well as individual weights within a feature map. The proposed method produces a slim architecture with faster inference and a compact model size. Experimental results show that these mechanisms effectively reduce model size while obtaining accuracy similar to the unpruned model.

3. We present a correlation-based filter pruning (CFP) approach to train more reliable CNN models. Unlike several existing filter pruning methods, our approach eliminates useless filters according to the amount of information carried in their associated feature maps. We use correlation to measure the duplication of information across feature maps and design a feature selection
scheme to guide pruning. Pruning and fine-tuning are repeated over several cycles, producing slim, dense networks with accuracy similar to the original unpruned model. We empirically evaluate our technique with various state-of-the-art CNN models on several standard datasets. Specifically, for ResNet-50 on ImageNet, our approach eliminates 44.6% of the filter weights and saves 51.6% of the FLOPs with a 0.5% accuracy gain, achieving state-of-the-art performance. |
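The blended-norm idea in contribution 1 can be illustrated with a minimal NumPy sketch. The function names, the `cap` value, the `alpha` mixing weight, and the additive form of the blend are illustrative assumptions, not the thesis's actual formulation:

```python
import numpy as np

def blended_penalty(filter_weights, cap=1.0, alpha=0.5):
    """Blend of capped L1 and plain L1 over per-filter L1 norms.

    alpha trades off filter selection (the capped term saturates for
    large filters, so it mainly distinguishes small ones) against
    ordinary L1 regularization. Hypothetical form for illustration.
    """
    # L1 norm of each filter (sum of |w| over all axes but the first)
    norms = np.abs(filter_weights).sum(axis=tuple(range(1, filter_weights.ndim)))
    capped = np.minimum(norms, cap)  # capped L1: clipped at `cap`
    return alpha * capped.sum() + (1 - alpha) * norms.sum()

def prune_small_filters(filter_weights, threshold):
    """Zero out whole filters whose L1 norm falls below `threshold`."""
    norms = np.abs(filter_weights).sum(axis=tuple(range(1, filter_weights.ndim)))
    mask = norms >= threshold
    # Broadcast the per-filter mask over the remaining axes
    pruned = filter_weights * mask.reshape((-1,) + (1,) * (filter_weights.ndim - 1))
    return pruned, mask
```

After training with such a penalty, filters driven toward a small L1 norm are zeroed by `prune_small_filters`, which corresponds to the channel-level pruning step described above.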
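The group-level sparsity in contribution 2 can be sketched in NumPy by treating each filter (feature map) as one group: the group LASSO term sums per-group L2 norms, the sparse group LASSO adds a within-group L1 term, and a group soft-thresholding (proximal) step shrinks entire filters to zero. Function names and penalty weights below are illustrative assumptions:

```python
import numpy as np

def sparse_group_lasso_penalty(W, lam_group=0.01, lam_l1=0.001):
    """W: (num_filters, ...); each filter is treated as one group."""
    flat = W.reshape(W.shape[0], -1)
    group_term = np.sqrt((flat ** 2).sum(axis=1)).sum()  # group LASSO: sum of group L2 norms
    l1_term = np.abs(flat).sum()                          # extra within-group sparsity
    return lam_group * group_term + lam_l1 * l1_term

def group_soft_threshold(W, lam):
    """Proximal step for the group LASSO: shrink whole filters toward zero.

    Groups whose L2 norm is below `lam` are set exactly to zero, i.e.
    the corresponding feature map is eliminated.
    """
    flat = W.reshape(W.shape[0], -1)
    norms = np.sqrt((flat ** 2).sum(axis=1, keepdims=True))
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return (flat * scale).reshape(W.shape)
```

The key design point is that the L2 norm couples all weights of a filter, so the penalty either keeps or removes the filter as a unit, which is exactly what structured (filter-level) pruning needs.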
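The redundancy criterion in contribution 3 can be sketched as follows: compute the Pearson correlation between flattened feature maps and mark a filter as prunable when its map is highly correlated with (i.e. duplicates) the map of an already-kept filter. The greedy keep/prune loop and the threshold value are simplifying assumptions for illustration, not the thesis's exact selection scheme:

```python
import numpy as np

def redundant_filters(feature_maps, corr_threshold=0.9):
    """feature_maps: (num_filters, H, W) activations for one input.

    Returns (prune, kept) index lists; a filter is pruned when its map's
    absolute correlation with any kept map exceeds corr_threshold.
    """
    flat = feature_maps.reshape(feature_maps.shape[0], -1)
    flat = flat - flat.mean(axis=1, keepdims=True)        # center each map
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    normed = flat / np.maximum(norms, 1e-12)              # unit-normalize
    corr = normed @ normed.T                              # Pearson correlation matrix
    prune, kept = [], []
    for i in range(corr.shape[0]):
        if any(abs(corr[i, j]) > corr_threshold for j in kept):
            prune.append(i)   # duplicates an already-kept feature map
        else:
            kept.append(i)
    return prune, kept
```

In practice the correlations would be averaged over a batch of inputs before deciding which filters carry duplicated information; pruning and fine-tuning then alternate as described above.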