
Instance Segmentation Algorithm Research Based On Multi-Scale Detector And Mask Evaluation Network

Posted on: 2021-02-15  Degree: Master  Type: Thesis
Country: China  Candidate: B Zhang  Full Text: PDF
GTID: 2428330605454244  Subject: Control theory and control engineering
Abstract/Summary:
Image instance segmentation is an important research direction in artificial intelligence and image recognition. Its pixel-level object segmentation is widely used in industrial production, medicine and health care, public security and other fields. Traditional image segmentation algorithms have low accuracy and are easily affected by deformation, overlap, illumination and similar factors. Although methods based on convolutional neural networks have achieved high accuracy in object detection, semantic segmentation and other tasks, they still have limitations. On the one hand, the large differences in object scale within an image reduce detection accuracy and make the segmentation result inaccurate. On the other hand, pooling reduces the size of the feature map in the semantic segmentation stage, and pixel classification and feature mapping by the fully connected layer are inaccurate. In addition, traditional instance segmentation models can hardly evaluate the completeness of the object mask, which lowers the accuracy of the generated instance masks. Building on existing instance segmentation algorithms, this thesis addresses the problems of large scale differences and incomplete mask generation with deep learning methods such as MTCNN and DNN. The main work of this thesis is as follows:

(1) This thesis proposes a multi-scale detector (MSD) that extracts object features at different scales from the feature map. It reduces the influence of large size differences and improves the accuracy of instance segmentation. Traditional deep-learning-based segmentation algorithms usually use a single convolution kernel size in the feature extraction stage, which makes it difficult to extract features of objects of different sizes. During pooling, most small-scale objects tend to cause the gradient to decline, which leads to inaccurate or false detections and creates difficulties for the mask generation network. For this reason, the proposed MSD combines a residual structure with convolution layers of different kernel sizes. The MSD provides complete feature reconstruction for each pooling output on the top-down pathway of the backbone, and these reconstructed features form the input of the next stage. By fusing features at different scales, the MSD enlarges the receptive field of the convolution kernels and extracts features at different scales. To reduce the influence of input images of different scales and improve detection accuracy, this thesis also presents an improved spatial pyramid pooling method. Because the input image size of a traditional convolutional neural network is usually fixed, instances of the same class in the training samples are deformed and feature extraction is incomplete, which leads to low recognition accuracy. In this thesis, local features are mapped to different scale spaces and fused by deconvolution, so that the convolutional neural network can adapt to image inputs of different scales.

(2) In the mask generation network, the semantic features extracted by the mask generation network and the category features from the feature extraction network are fused at multiple levels. In addition, this thesis proposes a mask generation network based on mask evaluation. The mask generation network is the last stage of instance segmentation and uses a semantic segmentation algorithm to classify the image pixel by pixel. The input of this semantic segmentation network is usually the output branch of the feature extraction stage. Because the feature map at this stage is small and high-dimensional, the back propagation driven by the loss function is weak, which leads to misclassified pixels when instances are partially occluded or overlap. This thesis therefore takes the feature information of different stages as the input of the semantic segmentation network and proposes a mask evaluation network that generates a mask score. The proposed model combines the mask score with the loss value for back propagation, so that the weights of the mask generation network can be updated.
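
As an illustration of the spatial pyramid pooling idea referred to in (1), the following is a minimal sketch of standard spatial pyramid pooling in plain Python. It is not the improved method of the thesis (the deconvolution-based fusion is not reproduced here); the function name `spatial_pyramid_pool`, the pyramid levels `(1, 2, 4)` and the single-channel list-of-lists feature map are assumptions made only for this example. The point it shows is that feature maps of different sizes are reduced to a vector of the same length.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map into a fixed-length vector.

    For each level n the map is split into an n x n grid of bins and the
    maximum of each bin is kept, so the output length is sum(n * n over
    levels) regardless of the input height and width.
    Hypothetical helper written for illustration only.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the bin start and ceiling for the bin end, so the
            # bins cover the whole map and no bin is ever empty.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(
                    max(feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1))
                )
    return pooled


# Two feature maps of different sizes (4 x 5 and 7 x 9).
small = [[r * 5 + c for c in range(5)] for r in range(4)]
large = [[r * 9 + c for c in range(9)] for r in range(7)]

print(len(spatial_pyramid_pool(small)))  # 21
print(len(spatial_pyramid_pool(large)))  # 21
```

With levels 1, 2 and 4 the output always has 1 + 4 + 16 = 21 entries per channel, which is what allows a fixed-size layer to follow inputs of varying size.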
Keywords/Search Tags: Instance segmentation, MTCNN, Multi-scale detection, Spatial pyramid pooling, Mask evaluation