Research On Remote Sensing Image Object Detection And Optimization Technology Based On YOLOv5

Posted on: 2022-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: J H Wang
GTID: 2532307169478574
Subject: Engineering
Abstract/Summary:
With the development of intelligent technology, the intelligent analysis and processing of remote sensing images has become a hotspot of research and application, and object detection in remote sensing images is one of its typical problems. Remote sensing objects have variable orientations, imbalanced sample categories, and blurred edges, all of which pose major challenges for object detection. Intelligent detection based on deep learning has become an effective way to improve detection quality. For image applications, object detection frameworks and algorithms based on deep learning models have achieved breakthroughs in detection performance, detection speed, resource cost, and scene adaptability. One-stage object detection frameworks have become popular because of their multi-scale learning capability and the regularity of their models. Based on the characteristics of remote sensing images, this thesis selects the latest one-stage object detection framework, YOLOv5, to study model construction and optimization techniques for remote sensing image object detection.

First, because remote sensing objects are densely packed and variable in orientation, a rotated bounding box is used for detection. YOLOv5 is a detection algorithm proposed for natural scenes and uses horizontal bounding boxes. To adapt it to rotated objects, we add angle information to YOLOv5, change the output dimension of the model, and add an angle classification loss to the loss function, yielding a rotation detection model named YOLOv5-FRD. A model built on YOLOv5m and trained and tested on the DOTA dataset reaches an mAP (mean Average Precision) of 71.78% and, compared with recent one-stage object detection algorithms, achieves the best AP (Average Precision) in most categories.

Second, the loss function is optimized to address the defect of the angle classification loss and the class imbalance of remote sensing image samples. (1) The angle is predicted by classification. A conventional classification label cannot reflect the size of the prediction error: any wrong prediction produces the same loss, however far it is from the true angle. We therefore encode the angle with a circular smooth label and compute the angle classification loss with the binary cross-entropy loss function, which improves angle prediction. (2) To address class imbalance, we add the Dice Loss function to the classification loss and combine it with the binary cross-entropy loss function to form a hybrid weighted classification loss. This hybrid loss alleviates the impact of sample imbalance on model performance while maintaining training stability. After loss optimization, the model improves detection accuracy on the five categories with the fewest instances in the DOTA dataset, and the mAP reaches 73.19%, an overall gain of 1.41 percentage points.

Finally, remote sensing images contain many small objects, and object edges are often blurred. A channel attention mechanism can strengthen the model's ability to detect small objects. However, most channel attention mechanisms emphasize salient features and tend to lose edge features or other information, which leads to inaccurate localization. To improve the model's localization capability, we adopt the multi-spectral channel attention mechanism FcaNet. After in-depth analysis, we find that FcaNet does not fully use the multi-frequency components of each channel to extract features. We therefore propose a new attention network, TFcaNet, based on the two-dimensional discrete cosine transform, which uses multi-frequency components to extract features for every channel. Incorporating TFcaNet into the YOLOv5-FRD model raises the mAP on the DOTA dataset to 73.53%, an improvement of 0.34 percentage points.
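
The two loss modifications described in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the thesis implementation. The abstract does not state the number of angle bins, the window shape, or the loss weights, so the 180 bins, the Gaussian window with radius 6, the Dice smoothing term, and the equal weights `w_bce` and `w_dice` are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math


def one_hot(index, num_bins=180):
    """Conventional hard label: 1 at the true bin, 0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(num_bins)]


def circular_smooth_label(angle_bin, num_bins=180, radius=6.0):
    """Soft angle label that decays with circular distance from the true bin.

    Assumption: a Gaussian window; num_bins and radius are illustrative.
    """
    label = []
    for i in range(num_bins):
        d = abs(i - angle_bin)
        d = min(d, num_bins - d)  # wrap around: bin 179 is adjacent to bin 0
        label.append(math.exp(-(d * d) / (2.0 * radius * radius)))
    return label


def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over per-bin (or per-class) probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Dice loss: 1 minus the smoothed overlap ratio of prediction and target."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def hybrid_cls_loss(probs, targets, w_bce=0.5, w_dice=0.5):
    """Weighted sum of BCE and Dice loss (weights are hypothetical)."""
    return w_bce * bce(probs, targets) + w_dice * dice_loss(probs, targets)
```

With a one-hot target at bin 0, hard predictions at bin 2 and at bin 90 receive the same `bce` value, which is the defect the abstract describes. With `circular_smooth_label(0)` as the target, the prediction at bin 2 receives the smaller loss, so the size of the angle error is reflected in the loss. `hybrid_cls_loss` shows only the general form of combining the two classification terms.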
Keywords/Search Tags:Remote Sensing Images, Object Detection, YOLOv5, YOLOv5-FRD, Dice Loss, Attention Mechanism, FcaNet, TFcaNet