| In recent years, natural and man-made disasters have occurred frequently, seriously threatening human life and property. Deploying deep learning object detection algorithms on mobile robots can assist humans in rescue missions. However, hardware devices have limited computing resources, and people trapped at disaster rescue sites are often buried and occluded, appear at different scales, and overlap with one another, so current mainstream object detection algorithms struggle to achieve high accuracy while maintaining real-time detection. To address these problems, this paper selects the YOLOv4 (You Only Look Once) object detection algorithm as the baseline network and optimizes it through lightweight improvement, attention-based accuracy compensation, and global context modelling to enhance feature information. The specific work is as follows. Firstly, to address the slow detection speed of the baseline network, the backbone feature extraction network of YOLOv4 is replaced with the lightweight neural network MobileNetV2, which reduces the number of network layers and improves computational efficiency. Comparative experiments show that the proposed lightweight model significantly improves detection speed while substantially reducing memory consumption. Secondly, to address the accuracy loss caused by lightweighting, Coordinate Attention, a channel attention mechanism that carries location information, is embedded into MobileNetV2, the backbone of the modified YOLOv4. This allows the final feature map extracted by the backbone to retain the location information of the shallow feature layers, reducing the loss of location features. Compared with the original YOLOv4, the resulting network improves detection speed and also improves detection accuracy. Thirdly, to meet the higher time-efficiency requirements of post-disaster relief, the regular convolution in the path aggregation module of YOLOv4 is replaced by a depthwise separable convolution. The channels of the feature map are first split to extract regional information, and the individual channels are then stacked to fuse the feature information across channels. The improved network further increases detection speed without increasing the amount of computation. Fourthly, to address the feature loss that occurs as image feature information flows between the layers of the neural network, a global context modelling network is added between the backbone network and the path aggregation network, allowing the network to obtain globally relevant features even at a local position and improving its ability to detect occluded persons. Finally, to address the slow convergence of the original YOLOv4 during training, the K-means++ algorithm is used to cluster the training data, and the anchor box dimensions obtained from the clustering are fed into the improved network. At the same time, the network is trained at multiple scales to improve its ability to detect people of different sizes. This paper also proposes a data augmentation method that changes the colour differences among the RGB channels of an image, improving the network’s ability to detect targets in images captured under different weather and lighting conditions. |
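
The anchor clustering step mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the function names `iou_wh` and `kmeanspp_anchors` are hypothetical, the distance 1 - IoU between width-height pairs is an assumption borrowed from common YOLO practice, and the boxes in the usage note are synthetic.

```python
import numpy as np


def iou_wh(boxes, centers):
    """IoU between (N, 2) boxes and (K, 2) centres given as (width, height).

    Boxes are compared as if they shared the same top-left corner, so only
    their shapes matter. Returns an (N, K) array.
    """
    inter = (np.minimum(boxes[:, None, 0], centers[None, :, 0])
             * np.minimum(boxes[:, None, 1], centers[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centers[:, 0] * centers[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)


def kmeanspp_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors (illustrative sketch).

    Assumption: the distance is 1 - IoU; the source does not state the metric.
    """
    rng = np.random.default_rng(seed)
    boxes = np.asarray(boxes, dtype=float)

    # K-means++ seeding: the first centre is drawn uniformly; each later
    # centre is drawn with probability proportional to the squared distance
    # to the nearest centre chosen so far.
    centers = boxes[rng.integers(len(boxes))][None, :]
    while len(centers) < k:
        d = (1.0 - iou_wh(boxes, centers)).min(axis=1)
        weights = d ** 2
        total = weights.sum()
        probs = weights / total if total > 0 else None  # None means uniform
        idx = rng.choice(len(boxes), p=probs)
        centers = np.vstack([centers, boxes[idx]])

    # Lloyd iterations: assign each box to the centre with the highest IoU,
    # then move each centre to the mean of its assigned boxes.
    for _ in range(iters):
        assign = iou_wh(boxes, centers).argmax(axis=1)
        new_centers = np.array([
            boxes[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Sort anchors by area, smallest first.
    return centers[np.argsort(centers[:, 0] * centers[:, 1])]
```

As a usage example, `kmeanspp_anchors(wh, 9)` with `wh` an array of ground-truth box widths and heights would return nine anchor shapes ordered from smallest to largest area. The choice of nine anchors follows the usual YOLOv4 configuration and is likewise an assumption here.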