Image Semantic Segmentation Based On Adversarial Learning And Attention Mechanism

Posted on: 2020-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: B Wang
Full Text: PDF
GTID: 2428330578960939
Subject: Control Engineering
Abstract/Summary:
Image semantic segmentation is one of the core tasks of computer vision and is widely used in autonomous driving, medical image processing, geographic information systems, and intelligent robots. Deep learning is an emerging technology with great potential in artificial intelligence, and it has produced major breakthroughs in computer vision, machine translation, natural language processing, and other important fields. Compared with traditional image segmentation techniques, deep-learning-based semantic segmentation performs feature engineering automatically, without manual feature design. Research in this area concentrates on training deep networks, fusing multi-scale context information, and developing better segmentation algorithms, which makes it better suited to segmenting complex scenes. For these reasons, this thesis takes deep-learning-based image semantic segmentation as its research topic and focuses on adversarial learning methods, multi-scale feature fusion, and attention mechanisms.

In practice, generators do not reproduce the fine textures of natural images, and semantic segmentation networks often give inconsistent predictions when the network structure changes slightly. Given these problems and the task of understanding complex scenes, the thesis argues that existing generators are poorly suited to supplying "real" training samples and should instead supply valuable label semantic information. It therefore proposes TwinsAdvNet, a two-branch semantic segmentation network that performs adversarial learning between two kinds of predicted probability maps. This differs from many existing adversarial learning methods, which set up a two-player minimax game between the predicted probability map produced by the generator and the ground truth. The novelty is that the predictions of one branch of the segmentation network are treated as weak labels for the other branch. Experiments on the SceneParse150 dataset show that this learning method helps the original network learn a better parameter distribution, alleviates the inconsistency of predictions, and improves segmentation accuracy. However, adversarial learning makes the network structure more complex, increases the number of parameters, makes training harder, and leaves training prone to collapse.

Multi-scale feature fusion is another important way to improve the accuracy of image semantic segmentation. Compared with adversarial learning, it is simpler to implement and effectively improves segmentation accuracy. The DeepLab series of networks captures semantic information at different scales with several parallel atrous convolutions that use different atrous rates. Because feature information at different scales may contribute differently to semantic segmentation, the number of channels needed by the convolution or pooling operation at each scale may also differ. Building on DeepLabv3, this thesis proposes allocating different numbers of channels to the different convolution and pooling operations in the ASPP module, without increasing the module's total channel count, to improve multi-scale feature fusion and the segmentation results. The method is easy to implement and adds few parameters and little computation.

Attention models in deep learning imitate the human visual mechanism and can effectively extract key information relevant to the task. The commonly used global average pooling yields channel attention information that is coarse and weakly interpretable. To address this, a new method based on sub-region mean pooling is proposed to obtain the initial channel information. On this basis, and drawing together the preceding work, a new semantic segmentation network called LANet is proposed. Unlike most previous attention mechanisms, which use higher-level features to guide the recalibration of lower-level features, LANet transfers the channel attention information of lower-level features to the attention distribution of higher-level features.
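
To make the contrast between global average pooling and sub-region mean pooling concrete, the following is a minimal sketch in plain Python. The abstract does not give the thesis's actual formulation, so the function names, the `grid` parameter, and the requirement that the feature map divide evenly into sub-regions are all assumptions made for illustration. It is not the LANet implementation.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, H rows by W columns


def global_average_pool(channel: FeatureMap) -> float:
    """Collapse a whole channel to a single mean value."""
    total = sum(sum(row) for row in channel)
    count = sum(len(row) for row in channel)
    return total / count


def sub_region_mean_pool(channel: FeatureMap, grid: int) -> List[List[float]]:
    """Split a channel into grid x grid sub-regions and average each one.

    Assumption of this sketch: H and W are both divisible by `grid`.
    """
    height, width = len(channel), len(channel[0])
    if height % grid or width % grid:
        raise ValueError("height and width must be divisible by grid")
    cell_h, cell_w = height // grid, width // grid
    pooled = []
    for gy in range(grid):
        row_means = []
        for gx in range(grid):
            # Collect every value that falls inside sub-region (gy, gx).
            values = [
                channel[y][x]
                for y in range(gy * cell_h, (gy + 1) * cell_h)
                for x in range(gx * cell_w, (gx + 1) * cell_w)
            ]
            row_means.append(sum(values) / len(values))
        pooled.append(row_means)
    return pooled


# A 4x4 channel whose activation is concentrated in the top-left corner.
channel = [
    [8.0, 8.0, 0.0, 0.0],
    [8.0, 8.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
global_descriptor = global_average_pool(channel)      # one scalar per channel
region_descriptor = sub_region_mean_pool(channel, 2)  # 2x2 grid of means
```

In this toy example the global descriptor is a single number that hides where the activation sits, while the sub-region descriptor keeps a coarse spatial layout for the channel. In a deep learning framework, a comparable operation is usually available as adaptive average pooling with an output size larger than 1x1.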
Keywords/Search Tags: Semantic segmentation, Deep learning, Adversarial learning, Multi-scale feature fusion, Channel attention