| Image semantic segmentation is a research hotspot in the field of computer vision. In recent years, convolutional neural networks have been widely used in semantic segmentation research and have achieved remarkable results. Region-based semantic segmentation methods, however, tend to lose detailed information, so their segmentation results are coarse and their accuracy is low. To address this problem, this paper proposes a semantic segmentation method that combines context features with multi-layer feature fusion in convolutional neural networks. Firstly, selective search is used to generate candidate regions of different scales from the image. Selective search [1] uses a graph-based image segmentation method [2] to generate many sub-regions, then iteratively merges them according to the similarity between sub-regions, eventually outputting all possible regions of the target. In addition, when classifying regions, free-form foreground features and context features are combined to better capture the actual foreground information of each region. Secondly, the first five layers of the VGG16 network are used as the base network to extract image feature maps, and the feature maps extracted by different layers are fused by RefineNet: the pre-trained VGG16 network is divided into five modules according to the resolution of the feature map, and the five modules are then fused as five paths through RefineNet modules to obtain a refined feature map. Finally, the candidate region masks and the fused feature map are taken as input and the segmented image is output. The mask of each candidate region and the feature maps of the different layers are fed into the free-form region-of-interest pooling layer to obtain the features of the candidate region, which is then classified by fully connected layers with dropout followed by a softmax layer. Experiments show that the proposed algorithm makes full use of the foreground and context information of each region, as well as the image features extracted by different layers, achieving accurate, fast and effective segmentation with strong robustness. |
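
The region classification step described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function names (`freeform_roi_pool`, `classify_region`), the use of max pooling, the choice of the mask's bounding box as the "context" area, and the single linear layer standing in for the fully connected layers are all assumptions made for illustration. Dropout is omitted because it is only active during training.

```python
import math


def freeform_roi_pool(feature_map, mask):
    """Pool one candidate region into a fixed-length descriptor.

    feature_map: list of C channels, each an H x W nested list of floats.
    mask: H x W nested list of 0/1 values marking the region's foreground.
    Returns foreground features (max over masked pixels) concatenated with
    context features (max over the mask's bounding box): length 2 * C.
    Both pooling choices are assumptions, not taken from the paper.
    """
    coords = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask contains no foreground pixels")
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    r0, r1, c0, c1 = min(rows), max(rows), min(cols), max(cols)

    foreground = [max(ch[r][c] for r, c in coords) for ch in feature_map]
    context = [max(ch[r][c]
                   for r in range(r0, r1 + 1)
                   for c in range(c0, c1 + 1))
               for ch in feature_map]
    return foreground + context


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_region(descriptor, weights, biases):
    """One linear layer (hypothetical weights) followed by softmax."""
    logits = [sum(w * x for w, x in zip(row, descriptor)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Toy example: one channel, a 3 x 3 feature map, a two-pixel region.
feature_map = [[[1.0, 7.0, 0.0],
                [0.0, 5.0, 3.0],
                [0.0, 0.0, 9.0]]]
mask = [[1, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]

descriptor = freeform_roi_pool(feature_map, mask)  # [5.0, 7.0]
probs = classify_region(descriptor, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In the toy example the foreground feature (5.0) ignores the strong response at the unmasked pixel inside the bounding box, while the context feature (7.0) picks it up; concatenating the two gives the classifier both views of the region.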