Object detection is an important research topic in computer vision. Its main task is to find the coordinates and categories of the instances in an image. Object detection models can be divided into anchor-based and anchor-free models, according to whether anchors are used to extract candidates during the detection stage. Because anchor-free models do not use anchors, they reduce the influence of manual design, and they have attracted sustained attention from researchers in recent years. An anchor-free model usually uses the pixels of the feature map for detection. However, the image undergoes multi-layer down-sampling during training, so the features become progressively more abstract and the number of pixels progressively decreases. This not only affects detection performance but also causes task misalignment, in which high-quality boxes fail to match high classification scores. This thesis therefore proposes a new anchor-free model based on FCOS. The model introduces an adaptive balanced feature pyramid network and uses a spatial attention module to select pixels, which improves detection accuracy. The main work of this thesis covers the following three aspects:

(1) A feature fusion algorithm based on an adaptive balanced feature pyramid network is proposed. To obtain feature maps rich in object information, we resize the feature maps of the different layers to the same scale, use convolutional layers to obtain a weight for each layer, multiply each layer's feature map by its weight, and fuse the results. Finally, the fused feature map is restored to the scales used before fusion to form multi-scale feature maps. The proposed pyramid network can be transplanted effectively to different object detection models. Compared with FCOS and CenterNet models that use only FPN, using the proposed feature pyramid network improves mAP by 1.2% and 1.5%, respectively, which demonstrates the validity of the proposed feature pyramid network.

(2) A sampling method based on the attention mechanism is proposed on top of FCOS. Because an instance occupies only a small proportion of the image, and FCOS detects instances by sampling all pixels in the image, a large number of negative samples are introduced. We propose a spatial attention module to enhance the feature maps, and we set a threshold so that the model samples for detection only those pixels whose value in the spatial attention weight map exceeds the threshold. Relative to the FCOS model, the mAP of the model using only the proposed attention module increases by 0.9%, and the mAP of the model using the proposed sampling algorithm increases by 0.5%, which shows that the sampling algorithm can effectively improve accuracy.

(3) Combining the feature fusion algorithm proposed above with the attention-based model, an attention-based FCOS model is proposed and deployed. Building on FCOS, the adaptive balanced feature pyramid network is used for feature enhancement, and spatial attention sampling and deformable convolution are used to obtain coordinates and classification scores in the detection stage. The model is trained using the E Focal Loss. The experimental results show that the prediction boxes of the proposed model cover the entire instance and obtain higher classification scores, which effectively improves the quality of the model's prediction boxes.
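
The fusion step in (1) and the thresholded sampling in (2) can be illustrated with a minimal sketch in plain Python. It is not the thesis implementation, and several details are assumptions made for illustration only: feature maps are nested lists indexed as [channel][row][column], resizing is nearest-neighbour, the common scale is that of the middle pyramid level, the "convolutional layers" that produce the per-layer weights are reduced to one 1x1 kernel per level (the hypothetical `kernels` argument), and the weights are normalised across levels with a softmax. The function names are likewise hypothetical.

```python
import math


def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a [C][H][W] nested-list feature map."""
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [
        [[ch[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
         for y in range(out_h)]
        for ch in fmap
    ]


def level_logits(fmap, kernel):
    """1x1 convolution: collapse C channels into one scalar per pixel."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [
        [sum(k * ch[y][x] for k, ch in zip(kernel, fmap)) for x in range(w)]
        for y in range(h)
    ]


def softmax(values):
    """Numerically stable softmax over a short list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_balanced_fusion(pyramid, kernels):
    """Fuse pyramid levels with per-pixel learned weights, then rescale.

    pyramid: list of [C][H][W] feature maps, one per level.
    kernels: one length-C 1x1 kernel per level (assumed, illustrative).
    Returns one fused map per level, at that level's original size.
    """
    sizes = [(len(f[0]), len(f[0][0])) for f in pyramid]
    h, w = sizes[len(pyramid) // 2]  # assumption: middle level sets the scale
    resized = [resize_nearest(f, h, w) for f in pyramid]
    logits = [level_logits(f, k) for f, k in zip(resized, kernels)]

    channels = len(pyramid[0])
    fused = [[[0.0] * w for _ in range(h)] for _ in range(channels)]
    for y in range(h):
        for x in range(w):
            # Assumption: weights are normalised across levels per pixel.
            alphas = softmax([lg[y][x] for lg in logits])
            for alpha, fmap in zip(alphas, resized):
                for c in range(channels):
                    fused[c][y][x] += alpha * fmap[c][y][x]

    # Restore the fused map to every original scale.
    return [resize_nearest(fused, sh, sw) for sh, sw in sizes]


def sample_locations(attention, threshold):
    """Keep only the (row, col) positions whose attention weight exceeds
    the threshold; only these pixels would be used for detection."""
    return [
        (y, x)
        for y, row in enumerate(attention)
        for x, weight in enumerate(row)
        if weight > threshold
    ]
```

With all-zero kernels the softmax assigns equal weight to every level, so the fused map is the plain average of the resized levels. This is a convenient sanity check before learned kernels are substituted.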