| Laparoscopic surgery is a very active research area in clinical medicine. Deep learning-based laparoscopic surgical instrument detection can help doctors identify instruments in surgical videos more accurately. However, factors such as large scale variation among surgical instruments, mutual occlusion, and unstable lighting degrade detection, leading to high false detection and missed detection rates. The Feature Pyramid Network (FPN) is effective for multi-scale object detection, but it has shortcomings in how it fuses features of different scales. This paper improves the accuracy of laparoscopic surgical instrument detection by fusing multi-scale features in different ways to obtain feature fusion networks with better performance. The main work is as follows: (1) To address the uneven category distribution and relatively small number of images in the m2cai16-tool-locations laparoscopic surgical instrument detection dataset, this paper oversamples the under-represented categories and applies traditional data augmentation to further expand the samples. These methods effectively avoid the overfitting caused by insufficient data. (2) A multiple attention augmented feature pyramid network (MAFPN) is proposed and applied to laparoscopic surgical instrument detection to make full use of multi-scale features. First, a feature selection module (FSM) is designed to replace the convolutional block; it combines channel attention and global attention to selectively retain important information. Second, a self-attention augmented fusion module (AAFM) is used to capture the global information of high-level features in the FPN. Finally, Dynamic Convolutional Decomposition (DCD) is used to mitigate the effects of upsampling while enhancing feature representation. Experimental results 
on the m2cai16-tool-locations laparoscopic surgical instrument detection dataset show that MAFPN reaches an average precision (AP) of 96.5% at an Intersection over Union (IoU) threshold of 0.5, which is 1.8% higher than the baseline RetinaNet and more than 1.6% higher than the comparison networks. The MAFPN proposed in this paper also outperforms state-of-the-art methods. (3) An asymmetric deep residual U-Net network, A-Res UNet (Asymmetric Deep Residual U-Net), is proposed and applied to laparoscopic surgical instrument detection. The network adopts an encoder-decoder structure. The encoder uses ResNet50 to extract features, followed by an improved downsampling module and a residual module to obtain features with richer semantic information. Next, an improved Spatial Pyramid Pooling (SPP) module extracts deep contextual feature information, and a residual unit is introduced in the decoder to enhance feature representation. After that, the FSM is used to unify the number of feature channels. Finally, the RetinaNet detection head localizes and classifies the objects. Experiments on the m2cai16-tool-locations laparoscopic surgical instrument detection dataset show that the AP of A-Res UNet reaches 97.1% at an IoU threshold of 0.5. Compared with classical detection models such as RetinaNet, FCOS, ATSS, and Faster R-CNN, A-Res UNet significantly improves detection performance. (4) A Swin Transformer-based laparoscopic surgical instrument detection method is proposed. The method uses the Swin Transformer as the backbone network to extract image features, then uses an FPN to fuse multi-scale features, and finally uses the RetinaNet detection head to classify and localize laparoscopic surgical instruments. This method is an exploratory attempt to apply Transformers to the m2cai16-tool-locations laparoscopic surgical instrument detection dataset. It has the advantage of using fewer parameters and 
lower computational complexity, but its accuracy is relatively low and needs further improvement. |
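
The results above are reported at an IoU threshold of 0.5. As a minimal sketch of what that criterion means, the Python snippet below computes the IoU of two axis-aligned boxes and checks whether a predicted box would count as a match for a ground-truth box. The function names `box_iou` and `is_match`, the `(x_min, y_min, x_max, y_max)` box format, and the example coordinates are assumptions made for illustration; they are not taken from the paper or from the dataset's tooling.

```python
def box_iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True if the predicted box overlaps the ground-truth box at or above the threshold."""
    return box_iou(pred_box, gt_box) >= threshold


# Hypothetical instrument boxes in pixel coordinates.
gt = (10, 10, 110, 110)
close_pred = (20, 20, 120, 120)   # large overlap with gt
loose_pred = (30, 30, 130, 130)   # smaller overlap with gt

print(is_match(close_pred, gt))   # overlap is above the 0.5 threshold
print(is_match(loose_pred, gt))   # overlap is below the 0.5 threshold
```

This covers only the matching criterion. Computing AP additionally requires ranking detections by confidence score and summarizing the resulting precision-recall curve, which the sketch does not attempt.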