In recent years, deep learning has become a widely used technology in the field of artificial intelligence, but its high performance comes at the price of high memory and computational cost. Researchers have proposed model compression algorithms to speed up model inference and reduce the memory footprint of the model. The compression strategy studied in this paper is convolutional kernel pruning, in which the design of the kernel importance evaluation criterion is the core phase of the algorithm and is crucial for rebuilding the model after pruning. The design of the importance evaluation criterion needs to take into account the interpretability of the model and the correlation between convolutional kernels; otherwise, important convolutional kernels may be deleted by mistake and the compressed model faces serious safety problems in deployment. In this paper, the design of the importance evaluation criterion therefore focuses on interpretability and on the correlation between convolutional kernels. The main work is as follows:

(1) An interpretable pruning method for convolutional neural networks is proposed. The purpose of convolutional kernel pruning is to filter out redundant kernels so as to reduce the number of parameters and the computational cost of the model. In this work, redundant convolutional kernels are defined as kernels that are not fitted to task-relevant features. To remove them, interpretability methods are introduced to quantify the extent to which each kernel has learned task-relevant features. In addition, a gradient flow strategy is introduced to prune the kernels: in each training iteration, the method splits the kernels into a significant part and a redundant part according to their importance and the target compression rate, and applies a distinct update rule to each part. For the unimportant parameters, the strategy truncates the back-propagation gradient of the objective function and applies only weight decay, so that their weights shrink gradually until they reach zero. Experiments demonstrate that the compressed model obtained by this method achieves significant improvements in performance, compression ratio, and other metrics, and effectively retains the task-relevant features fitted by the original model.

(2) A graph-based pruning algorithm for convolutional neural networks is proposed. When evaluating the importance of convolutional kernels, existing methods consider only the contribution of each kernel itself and ignore the overlap between the features fitted by different kernels, which can bias the importance evaluation. Since the relationships among the features fitted by convolutional kernels are complex and cannot be well represented in a traditional Euclidean space, this paper uses graphs to describe the correlation between the features fitted by different kernels. A novel graph construction algorithm is proposed that builds a graph from the similarity of the features fitted by kernels in the same layer. A redundancy evaluation criterion is then applied to each layer to derive its degree of redundancy. Finally, the algorithm uses a graph convolutional network to incorporate this correlation information into the selection of redundant kernels, and each redundant kernel is selected from the most redundant layer. Experiments show that this method effectively improves both the performance and the compression rate of the compressed model.
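The gradient flow strategy in (1) can be illustrated with a minimal sketch. Kernel weights are flattened to scalars for clarity, and the importance scores, learning rate, and decay factor below are hypothetical choices, not values from the thesis:

```python
def gradient_flow_step(weights, grads, importance, compress_rate,
                       lr=0.1, decay=0.5):
    """One training iteration: significant kernels receive a normal
    gradient step; redundant kernels have their gradient truncated
    and only their weights decayed toward zero."""
    n_keep = max(1, int(len(weights) * (1.0 - compress_rate)))
    # Rank kernels by importance; the top n_keep form the significant part.
    order = sorted(range(len(weights)), key=lambda i: -importance[i])
    keep = set(order[:n_keep])
    new_weights = []
    for i, (w, g) in enumerate(zip(weights, grads)):
        if i in keep:
            new_weights.append(w - lr * g)   # normal SGD update
        else:
            new_weights.append(w * decay)    # gradient truncated, decay only
    return new_weights

weights = [1.0, -0.8, 0.5, 0.2]
grads = [0.2, -0.1, 0.3, 0.4]
importance = [0.9, 0.7, 0.1, 0.05]  # last two kernels deemed redundant
w1 = gradient_flow_step(weights, grads, importance, compress_rate=0.5)
```

Repeating the step drives the redundant weights geometrically toward zero while the significant part continues to train on the task objective.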
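The per-layer redundancy evaluation in (2) can likewise be sketched. Here kernels of a layer are graph nodes, cosine similarity between their fitted-feature vectors gives the edge weights, and, as one hypothetical criterion (the thesis's exact criterion and its GCN-based selection are not reproduced here), a layer's redundancy is taken to be the mean pairwise similarity:

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similarity_graph(features):
    """Adjacency matrix over the kernels of one layer (no self-loops)."""
    n = len(features)
    return [[cosine(features[i], features[j]) if i != j else 0.0
             for j in range(n)] for i in range(n)]

def layer_redundancy(features):
    # Mean off-diagonal similarity: high when kernels fit similar features.
    adj = similarity_graph(features)
    n = len(adj)
    return sum(sum(row) for row in adj) / (n * (n - 1))

# Two toy layers: layer_a's kernels fit nearly identical features,
# layer_b's are orthogonal, so redundant kernels are picked from layer_a.
layer_a = [[1.0, 0.0], [0.9, 0.1]]
layer_b = [[1.0, 0.0], [0.0, 1.0]]
most_redundant = max([layer_a, layer_b], key=layer_redundancy)
```

Selecting each kernel to prune from the currently most redundant layer, as the algorithm does, keeps the compression budget focused where fitted features overlap the most.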