Font Size: a A A

Research On Image Semantic Segmentation Based On Deep Learning

Posted on:2019-03-25Degree:MasterType:Thesis
Country:ChinaCandidate:H D WenFull Text:PDF
GTID:2348330569987834Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
For being capable of approximating complex non-linear function,deep learning has become one of the most effective methods in large numbers of challenging tasks.However,for one hand,its achievements rely on a great scale of data.For the other hand,it requires to take advantage of the property of specific data when implementing.As one of the fundamental tasks in computer vision,image semantic segmentation can help others greatly for the object of classifying all pixels.Considering the above reasons,the topic of this thesis is image semantic segmentation based on deep learning,and mainly about weakly-supervised learning and prior-based ways.The labelling of semantic segmentation data demands much effort,so supervised learning is hard to adapt to the daily increasing data.This thesis is going to study weaklysupervised learning method that use image category label to finish pixel-wise segmentation.First,the convolutional neural network(CNN)that unify classification and segmentation with global pooling usually captures the more discriminative regions,which makes the result has problems of small object missing,details struggling and semantic relationship disordering.In order to resolve them,we propose to replace global pooling with spatial pyramid pooling(SPP).SPP has the ability to ensemble multi-scale context and contact or compare local with global information naturally,which has also been widely used.Additionally,our novelty is to introduce masked mechanism into SPP that encourage the secondary discriminative regions to be used to train and recognize.Besides,our competitive masked spatial pyramid pooling loss function dynamically chooses the pyramid to optimize,which increase the efficiency of masking and training.Our model reaches62.8% mIoU in PASCAL VOC 2012 benchmark,relatively gains about 1% over state-ofthe-art that via weakly-supervised learning,and surpasses simple supervised method.At the same time,deep learning hasn't break away from the constrain of extreme specificity that leads it difficult to transfer.In medical image processing,the impact of prior of data may larger than model.This thesis take melanoma segmentation as an example to explain the importance of data priors.Knowing that melanoma has the prior of central integral property,we design our image augmentation policy.Further,our CNN upsamples feature map resolution to emphasize the spatial connection of prediction,which outputs simply connected regions.In that way,our model performs much better than ResNet-38 that the model size is much bigger,and 1.5% IoU higher than the competition winner in 2017.Moreover,for the lack of structural information,melanoma segmentation results aren't able to be promoted through conditional random field(CRF).All in all,weakly-supervised learning and the combination of prior and model are hot topics of general artificial intelligence in the future,they have long-term meanings.Towards the two aspects,this thesis investigates that masked spatial pyramid pooling and deep up-sample CNN are effective by plentiful and reliable experiments.
Keywords/Search Tags:deep learning, image semantic segmentation, weakly-supervised learning, spatial pyramid pooling, medical image processing
PDF Full Text Request
Related items