High-efficiency Deep Neural Networks For Object Recognition And Detection

Posted on: 2021-02-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X T Zhu
GTID: 1368330602994261
Subject: Information and Communication Engineering

Abstract/Summary:
Object recognition and object detection are two fundamental tasks in computer vision. Object recognition predicts the correct object class for a given image, while object detection predicts both the category and the bounding box of each object. With the rapid development of deep learning in recent years, deep neural networks have brought significant performance improvements to both tasks. However, when these high-performance models are deployed on resource-constrained platforms, the large number of parameters and the high computational complexity of deep neural networks become the bottleneck. To expand the application range of deep neural networks, this dissertation investigates high-efficiency deep neural networks for object recognition and detection from two directions: 1) performing neural network compression on a given network architecture; 2) designing a new architecture that better fits the requirements of the target task. The contributions are threefold.

First, for object recognition models that have redundancy caused by correlation among filters, a decorrelation regularization method for sparse neural network training is proposed; a higher sparsity is achieved when decorrelation regularization is used jointly with sparsity regularization. An investigation of the filter correlation in a sparse neural network trained with an existing sparsity regularization method shows that there is still room for further compression. Decorrelation regularization is then introduced to reduce the filter correlation during training, and a sparse mask is proposed to avoid interference when decorrelation regularization and sparsity regularization are used simultaneously. A decorrelation weight initialization that provides a better initial state for decorrelation training is also introduced. Experimental results on several object recognition datasets show that the proposed method achieves a higher compression rate than existing sparsity regularization methods.

Second, to handle the corner feature misalignment problem in corner-based object detection models, a corner-enhanced knowledge distillation method is proposed. By providing a better corner feature supervision signal, the method lets the detection model achieve higher detection performance at comparable model complexity, which expands the application scope of knowledge distillation. Since accurate corner detection is critical for corner-based object detection models, the proposed method uses global object information to enhance the corner features of the teacher network, and the enhanced feature map is then used as an extra supervision signal to train the student model. To further improve the performance of the student network, a corner deformable convolution is proposed, which integrates corner position information into deformable convolution. Compared with the existing corner pooling operation, corner deformable convolution has a stronger corner feature extraction ability. Experimental results demonstrate that the student network learns better corner features with the proposed method.

Third, to enhance the scale robustness of object detection models without substantially increasing model complexity, the Scale Decoupled Feature Pyramid Network (SDFPN) for object detection is proposed. To reduce the interference among supervision signals from different object scales in feature pyramid networks, a multi-branch structure is used in the backbone model, and the high-level feature maps generated by different branches correspond to different object scales. Feature fusion is performed within each branch, and the fused feature maps are used to detect objects at different scales. Since the output feature pyramid is kept the same as in the original feature pyramid networks, the computational complexity of the detection heads and post-processing is not increased. To provide more adaptability to different scales, bilinear interpolation is used to learn the convolution dilation rate, and the Straight Through Estimator (STE) is used to learn an integer-valued dilation rate for a better trade-off between speed and performance. Experiments on object detection datasets validate the effectiveness of the proposed method across different object scales.
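The abstract mentions learning an integer-valued dilation rate with the Straight Through Estimator. The following is a minimal toy sketch of the general STE idea only, not the dissertation's implementation: a real-valued latent rate is rounded in the forward pass, and the gradient is passed through the rounding step unchanged in the backward pass. The function names, the scalar quadratic loss, and the `target` rate are all hypothetical and chosen purely for illustration.

```python
import math


def ste_round_forward(d: float) -> int:
    """Forward pass: quantize the latent real-valued rate to the nearest integer."""
    return math.floor(d + 0.5)


def ste_round_backward(grad_output: float) -> float:
    """Backward pass: treat rounding as the identity, so the gradient is unchanged."""
    return grad_output


def train_dilation(target: int, d_init: float = 1.0,
                   lr: float = 0.1, steps: int = 200) -> float:
    """Fit a latent rate d so that round(d) matches a hypothetical target rate.

    The toy loss is (round(d) - target)^2. Rounding has zero gradient almost
    everywhere, so plain gradient descent would never move d; the STE makes
    the update possible.
    """
    d = d_init
    for _ in range(steps):
        r = ste_round_forward(d)              # integer rate used in the forward pass
        grad_r = 2.0 * (r - target)           # d(loss)/d(r)
        grad_d = ste_round_backward(grad_r)   # STE: d(loss)/d(d) := d(loss)/d(r)
        d -= lr * grad_d                      # update the real-valued latent rate
    return d


if __name__ == "__main__":
    latent = train_dilation(target=3)
    print(latent, ste_round_forward(latent))
```

In this sketch the latent rate stays real-valued during training, while only its rounded value would be used as the dilation of a convolution; an integer rate allows a standard dilated convolution at inference time, which is consistent with the speed motivation stated above.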
Keywords/Search Tags:Object recognition, Object detection, Deep neural networks, Decorrelation regularization, Knowledge distilling, Feature pyramid networks