With the rapid development of communication and computer hardware technology in recent years, unmanned aerial vehicles (UAVs), thanks to their small size, high flexibility, and strong adaptability, have found a wide range of applications in military reconnaissance, attack, and early warning, as well as in civil transportation, meteorology, agriculture, disaster relief, entertainment, and other fields. When a UAV flies autonomously, it must complete navigation, obstacle avoidance, and positioning on its own; when it is used for reconnaissance at a disaster relief site, the disaster situation must be assessed and decisions made quickly; when it lands automatically, it must locate the landing site precisely. All of these tasks rely on UAV vision to achieve automatic environment perception. Therefore, how to perceive the environment efficiently and accurately in complex scenarios is a hot topic in the field of UAV vision. The rapid development of deep learning offers a good solution for UAV environment perception: pixel-level image semantic segmentation makes autonomous environment understanding by UAVs possible. However, existing image semantic segmentation algorithms aim at improving accuracy, perform poorly in real time, and cannot be applied directly to UAV environment perception.

This dissertation addresses these problems and studies efficient, real-time image semantic segmentation algorithms based on deep learning for UAV vision in complex scenarios. It designs the overall algorithm flow and constructs the network structure of each module. In the backbone network, to address low accuracy, a residual structure is introduced to deepen the network and dilated (atrous) convolution is introduced to enlarge the receptive field. At the same time, depthwise separable convolutions replace ordinary convolutions, separating the channels and reducing the amount of computation to meet the real-time requirement. In the feature enhancement module, in view of the complex feature pyramid structures of existing models, a small number of attention modules are introduced to enhance the backbone feature maps at different scales, and a simple convolutional structure introduces global information to further enhance the feature maps. In the prediction module, the feature maps of all scales are concatenated and fused to produce a preliminary prediction, which is then refined by an attention module. Experiments on multiple datasets verify the rationality, efficiency, and robustness of the proposed network.
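
To make the two backbone ideas above more concrete, the following Python sketch counts the multiply-accumulate operations (MACs) of an ordinary convolution against a depthwise separable one, and computes the spatial extent covered by a dilated kernel. It is only an illustration under stated assumptions: the function names and the layer sizes in the example are hypothetical and are not taken from the dissertation's network.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs of an ordinary k x k convolution producing an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs of a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def dilated_kernel_extent(k, dilation):
    """Side length of the input window covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer: 64 x 64 output, 128 -> 128 channels, 3 x 3 kernel.
    h, w, c_in, c_out, k = 64, 64, 128, 128, 3
    std = standard_conv_macs(h, w, c_in, c_out, k)
    sep = depthwise_separable_macs(h, w, c_in, c_out, k)
    print(f"standard:  {std} MACs")
    print(f"separable: {sep} MACs ({sep / std:.3f} of standard)")
    for d in (1, 2, 4):
        print(f"dilation {d}: 3 x 3 kernel spans {dilated_kernel_extent(3, d)} pixels per side")
```

The cost ratio of the separable form to the ordinary form works out to 1/c_out + 1/k², so for 3 × 3 kernels with many output channels the saving approaches a factor of nine, while dilation widens the window a kernel sees without adding any weights.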