| Image instance segmentation is among the image processing techniques in computer vision that come closest to human visual perception. It works by segmenting the regions and individual objects in an image at the pixel level. Its purpose is to distinguish objects in different regions of the image, to separate individual objects within the same region, and to classify, locate, and segment those objects, so that the content of the image scene can be understood in depth. With the introduction of the SOLO V1 and SOLO V2 algorithms in recent years, the new concept of "instance category" entered the field, and the instance segmentation problem was reformulated as two classification problems: the prediction of semantic categories and the prediction of mask categories. The result is a simple, direct single-stage segmentation framework with high segmentation accuracy. However, the ResNet backbone that SOLO uses to extract image features is computationally inefficient at capturing long-distance dependencies through convolution. In addition, the feature pyramid network (FPN) used in the feature fusion stage is deficient in extracting and fusing the localization and edge information in the image. To address these problems, this thesis analyzes the SOLO algorithm in depth and introduces non-local operations and a feature fusion enhancement path into its network framework to improve segmentation performance. First, this thesis improves the efficiency of capturing long-range dependencies by adding Non-local operations to the ResNet backbone. Considering that Non-local operations can improve the efficiency of convolution operations and alleviate the loss of segmentation accuracy that occurs when stacking many convolution modules makes the network too deep, this thesis inserts Non-local blocks into some of the residual blocks of ResNet. These blocks enable long-distance multi-hop communication and better preserve the feature information of the image during feature extraction. Second, this thesis adds a bottom-up enhancement path to the FPN to speed up the propagation of localization information from the low-level feature maps. Considering that the extraction of low-level feature information can effectively improve the recognition efficiency of large-scale targets, this thesis adds a feature fusion operation after the FPN. This operation shortens the information propagation path between low-level and high-level features, makes efficient use of the more accurate location information contained in the low-level feature maps, and improves the hierarchical structure of the whole feature fusion stage. Experimental results on the public COCO2017 and Cityscapes datasets show that the improved SOLO algorithm achieves clearly higher image classification accuracy and image segmentation accuracy than the original SOLO algorithm, which demonstrates the effectiveness of the work in this thesis. |
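To make the idea of a Non-local operation concrete, the following is a minimal, dependency-free sketch. It is not the thesis implementation: it assumes the commonly used embedded-Gaussian form, in which every position attends to every other position through a softmax over dot products, followed by a residual connection. The names `non_local`, `W_theta`, `W_phi`, `W_g`, and `W_out` are hypothetical, and a real network would use 1x1 convolutions on tensors rather than Python lists.

```python
import math


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def non_local(features, W_theta, W_phi, W_g, W_out):
    """Apply a non-local operation with a residual connection.

    features: list of N position vectors, each of length C (a flattened
    feature map). All weight matrices are illustrative assumptions.
    """
    theta = [matvec(W_theta, x) for x in features]  # query embedding
    phi = [matvec(W_phi, x) for x in features]      # key embedding
    g = [matvec(W_g, x) for x in features]          # value embedding
    out = []
    for i, x_i in enumerate(features):
        # Similarity of position i to every position j, normalized over j.
        scores = [sum(a * b for a, b in zip(theta[i], p)) for p in phi]
        weights = softmax(scores)
        # Weighted sum of the value embeddings over all positions.
        y_i = [sum(w * g_j[k] for w, g_j in zip(weights, g))
               for k in range(len(g[0]))]
        # Project back to C channels and add the input (residual).
        z_i = [r + x for r, x in zip(matvec(W_out, y_i), x_i)]
        out.append(z_i)
    return out
```

Because the output is the input plus a projected term, setting `W_out` to zero makes the block an identity mapping, which is one reason such a block can be inserted into an existing residual backbone without disturbing its initial behaviour.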
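The bottom-up enhancement path can be illustrated in the same spirit. The abstract does not specify the exact layers, so the sketch below is an assumption: each enhanced level is formed by downsampling the previous enhanced level and adding it to the next FPN output, so that location information from the high-resolution map reaches the coarse levels in a few steps. The functions `downsample2x` and `bottom_up_path` are hypothetical, feature maps are single-channel 2D lists, and 2x2 averaging stands in for the learned stride-2 convolution a real network would likely use.

```python
def downsample2x(fmap):
    """Halve the resolution by averaging 2x2 windows.

    Stand-in (assumption) for a learned stride-2 convolution.
    """
    h, w = len(fmap), len(fmap[0])
    return [[(fmap[r][c] + fmap[r][c + 1]
              + fmap[r + 1][c] + fmap[r + 1][c + 1]) / 4.0
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]


def bottom_up_path(pyramid):
    """Fuse FPN outputs along an added bottom-up path.

    pyramid: FPN outputs ordered from highest to lowest resolution,
    each level half the height and width of the previous one.
    Returns the enhanced levels in the same order.
    """
    enhanced = [pyramid[0]]  # the finest level is passed through unchanged
    for p_next in pyramid[1:]:
        down = downsample2x(enhanced[-1])
        # Element-wise fusion of the downsampled map with the FPN level.
        fused = [[a + b for a, b in zip(row_d, row_p)]
                 for row_d, row_p in zip(down, p_next)]
        enhanced.append(fused)
    return enhanced
```

For example, with a 4x4 map of ones as the finest level and all-zero coarser levels, every enhanced level carries the value 1.0, which shows the low-level signal reaching the top of the pyramid through the short added path.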