| With the rapid development of information technology, general object detection and recognition has become an active research topic in digital image processing in recent years, and general-purpose algorithms have made great progress. However, in specific application scenarios, these algorithms may not be effective because of changes in the environment, the scale of the target and the viewing angle. For example, traffic signs photographed on the road are usually small and belong to many categories. Targets in aerial images have different orientations and vary greatly in scale. Text in natural scenes takes irregular and highly varied shapes. Compared with the popular datasets currently used for object detection, these three kinds of targets include many small instances (traffic signs smaller than 32 × 32 pixels or text whose short edge is under 24 pixels), a wide range of target sizes and a wide variety of shapes. As a result, the performance of existing general object detection algorithms is not ideal on such targets, and it is necessary to study detection and recognition algorithms for these special small targets. This thesis conducts an in-depth study of object detection in three complex scenes: traffic signs, aerial targets and scene text. A small-target detection and recognition algorithm is designed for each scene. The research content and main contributions are as follows. First, a divide-and-conquer convolutional neural network is proposed for traffic sign detection and recognition, in which detection and recognition are handled separately. This thesis analyzes the shortcomings of two-stage detection and the characteristics of traffic signs, and separates the detection task and the recognition task into two sub-tasks. To address the difficulty of detection, Faster R-CNN is improved to serve as the detection sub-network. It uses a lean backbone network
and carefully designed anchors to speed up detection without lowering the detection rate. A large-scale recognition sub-network is also proposed. By sharing some convolution parameters with the detection sub-network and pooling large-scale regions of interest, it retains the rich shallow features of the target and can handle fine-grained classification over more than 200 categories. Building on this separated structure, this thesis also designs a multi-batch training framework, which greatly improves recognition performance. On TT100K and CTSD, the algorithm reaches an accuracy of 93% and 98.3% and a recall of 94% and 98.7%, respectively, at a speed of 10.25 FPS. Second, a cascaded multi-oriented object detection algorithm for aerial images is proposed. Inspired by the cascaded localization of target boundaries in Cascade R-CNN, a multi-information cascaded output head is designed. Because the difficulty of the task changes during localization and the target annotations carry rich information, it localizes the multi-oriented bounding box of each target step by step. A cascaded classifier is proposed that extracts target shape features from the location information and combines them with texture and color features, so that targets are classified effectively through a coarse-to-fine recognition step. In addition, an expansion scheme for scarce samples is proposed to address the imbalance of training samples. The algorithm reaches 72.81% and 89.68% mAP on DOTA and HRSC2016, respectively. Third, a fast scene text detection algorithm based on pruning and knowledge distillation is proposed. Starting from DBNet, a fast text detection network is obtained by pruning the backbone of the base method according to the scaling factors in the BN layer parameters. In addition, inspired by knowledge distillation, a high-performing teacher network is used to train the pruned network, so that the network achieves both speed and detection performance. The method was evaluated on
Total-Text, MSRA-TD500 and ICDAR2015, where the F-score reached 83.3%, 83.4% and 82.8%, respectively. The fastest speed was up to 95 FPS. Experiments show that the detection performance and computation speed of this algorithm are better than those of most other methods. |
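
The pruning step of the third contribution selects backbone channels according to the scaling factors of the BN layers. The following is a minimal sketch of that selection idea, not the thesis implementation. It assumes a single global threshold over the absolute scaling factors of all layers (a common choice in BN-based channel pruning; the abstract does not say whether the threshold is global or per layer). The function name `select_channels`, the `prune_ratio` parameter and the layer names are hypothetical.

```python
from typing import Dict, List


def select_channels(bn_scales: Dict[str, List[float]],
                    prune_ratio: float) -> Dict[str, List[int]]:
    """Return, for each BN layer, the indices of the channels to keep.

    bn_scales maps a (hypothetical) layer name to its BN scaling factors.
    prune_ratio is the approximate fraction of channels to drop overall.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")

    # Assumption: one global threshold computed over |gamma| of all layers.
    all_scales = sorted(abs(g) for scales in bn_scales.values() for g in scales)
    cut = int(len(all_scales) * prune_ratio)
    threshold = all_scales[cut] if all_scales else 0.0

    keep: Dict[str, List[int]] = {}
    for name, scales in bn_scales.items():
        kept = [i for i, g in enumerate(scales) if abs(g) >= threshold]
        if not kept and scales:
            # Safeguard: never remove every channel of a layer.
            kept = [max(range(len(scales)), key=lambda i: abs(scales[i]))]
        keep[name] = kept
    return keep


if __name__ == "__main__":
    scales = {
        "conv1": [0.9, 0.01, 0.5, 0.02],
        "conv2": [0.03, 0.04, 0.8, 0.7],
    }
    print(select_channels(scales, prune_ratio=0.5))
```

Channels whose scaling factor is close to zero contribute little to the layer output, which is why they are the first candidates for removal. When several factors tie at the threshold, this sketch keeps all of them, so slightly fewer channels than `prune_ratio` requests may be removed. In a full pipeline the pruned network would then be trained under the guidance of the teacher network, as described above.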