With the development of deep learning and the growth of computing power, computer vision research is moving toward finer-grained and more accurate results. Traditional segmentation algorithms can only segment an image at the level of whole classes, whereas deep-learning-based algorithms can perform instance-level segmentation and even accurate part-level segmentation of objects of interest. Although instance segmentation algorithms achieve high accuracy in natural scenes, existing algorithms often fall short when occlusion occurs, and this is especially evident for the human category. This paper therefore focuses on instance segmentation in occlusion scenes. For the special category of the human body, it studies human instance segmentation and amodal instance segmentation in occlusion scenes by constructing occlusion-scene human instance segmentation datasets and optimizing the architecture of instance segmentation algorithms.

First, this paper constructs a human instance segmentation dataset for occlusion scenes. General instance segmentation datasets cannot reflect how well human instances are segmented under occlusion. To address this, the paper starts from the Baidu portrait segmentation dataset, artificially constructs occlusions, and synthesizes the training set OHIS-train and the validation set OHIS-val for human instance segmentation in heavily occluded scenes. In addition, based on the public dataset OCHuman, it manually filters images to construct a test set, OHIS-test, for human instance segmentation in real occlusion scenes.

Second, this paper proposes EM-BUF, a single-stage instance segmentation algorithm that fuses bottom-up features. For instance segmentation in occlusion scenes, the paper compares the performance of existing instance segmentation algorithms on multiple occlusion-scene instance segmentation datasets and selects the single-stage algorithm EmbedMask as its baseline model. It then improves the baseline by increasing the resolution of the pixel-encoding feature map and fusing bottom-up features. Experimental results on the public dataset COCO show that EM-BUF improves average precision (AP) by 1.1% over the baseline model, and results on OHIS-test show that EM-BUF achieves the highest average precision.

Third, this paper proposes EM-NLA, an amodal instance segmentation algorithm based on non-local attention. Because amodal instance segmentation requires a larger receptive field, the paper builds on EmbedMask by incorporating a non-local attention mechanism and adding an amodal pixel feature encoding module. Results on the amodal instance segmentation dataset KINS-test show that EM-NLA improves average precision by 2% over the baseline model. Experimental results on OHIS-val show that EM-NLA achieves the best performance on human amodal instance segmentation in occlusion scenes.
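
The synthetic occlusion step used to build OHIS-train and OHIS-val can be illustrated with a minimal sketch. It is not the paper's actual pipeline: the function name `composite_occluder`, its parameters, and the occlusion-ratio statistic are assumptions made for illustration. It pastes a masked occluder onto a portrait image and derives two labels: the person mask that is still visible, and the full-extent person mask that ignores the occluder.

```python
import numpy as np


def composite_occluder(image, person_mask, occluder, occluder_mask, top, left):
    """Paste an occluder onto a portrait and derive visible / full-extent masks.

    Hypothetical helper, not taken from the paper.
    image:         (H, W, 3) uint8 portrait image
    person_mask:   (H, W) bool mask of the unoccluded person
    occluder:      (h, w, 3) uint8 occluder patch
    occluder_mask: (h, w) bool mask of the occluder inside its patch
    top, left:     paste position; the patch must fit inside the image
    """
    h, w = occluder_mask.shape
    H, W = person_mask.shape
    if top < 0 or left < 0 or top + h > H or left + w > W:
        raise ValueError("occluder must lie fully inside the image")

    out = image.copy()
    # Basic slicing returns a view, so writing into `region` updates `out`.
    region = out[top:top + h, left:left + w]
    region[occluder_mask] = occluder[occluder_mask]

    # Occluder mask in full-image coordinates.
    full_occ = np.zeros((H, W), dtype=bool)
    full_occ[top:top + h, left:left + w] = occluder_mask

    full_extent_mask = person_mask.copy()      # person as if nothing were in front
    visible_mask = person_mask & ~full_occ     # person pixels still visible
    person_area = max(int(person_mask.sum()), 1)
    occlusion_ratio = 1.0 - visible_mask.sum() / person_area
    return out, visible_mask, full_extent_mask, occlusion_ratio
```

Under these assumptions, the visible mask would serve as the target for ordinary instance segmentation, the full-extent mask as the target when the occluded part must also be predicted, and the occlusion ratio as a filter for keeping only heavily occluded samples.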