Application Study Of Semantic Segmentation Based On Attention Mechanism

Posted on: 2022-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: L M Cao
Full Text: PDF
GTID: 2518306323467054
Subject: Data Science
Abstract/Summary:
Semantic segmentation is a fundamental task in computer vision that aims to assign a class label to every pixel of an image. As a basic visual perception task, it has a wide range of applications. For example, autonomous vehicles rely on semantic segmentation and point cloud segmentation for visual perception; precision agriculture can use it to automatically identify irrigation and water conservancy facilities from camera images; doctors can use it to quickly analyze MRI images, X-ray images and other examination results; and in e-commerce, semantic segmentation of images of clothing and accessories can assist intelligent recommendation. With the rise of convolutional neural networks (CNNs), the performance of semantic segmentation models has improved greatly, and most mainstream semantic segmentation networks are now based on CNNs. Through continued study, researchers have identified several techniques that are very important in semantic segmentation networks: fully convolutional networks (FCN), atrous convolution, multi-scale operations, feature reuse and so on. With the rapid development of natural language processing (NLP), the importance of the attention mechanism has gradually been recognized, and the mechanism has been transferred to other tasks, including semantic segmentation. Many excellent semantic segmentation networks adopt attention mechanisms, which can be roughly divided into channel attention and spatial attention. Many effective networks also reduce or compress the dimensions of the attention mechanism to lower the computational burden and adapt it to image tasks. The core of the spatial attention mechanism is the computation of the affinity matrix, which is used to re-weight the feature vectors. In image processing and NLP, the affinity matrix is mostly embedded in the network model and used as a weight.

In this dissertation, the affinity matrix is extracted directly from the network and used as an independent module that enhances the performance of a semantic segmentation network. First, a new multi-scale label affinity matrix is proposed, and a new square root dot-product kernel operation is defined to calculate the multi-scale affinity matrix of the score map. Combining the two yields a new penalty function, the affinity regression loss (AR loss). This penalty can be regarded as a binary form of structural supervision that assists training with the cross-entropy loss. The affinity regression loss is analyzed and justified mathematically, and some mathematical properties of the affinity matrix in neural networks are shown. Extensive experiments on public datasets such as NYUv2 and Cityscapes show the effectiveness and efficiency of the affinity regression loss in improving the performance of semantic segmentation networks. In addition, experiments show that the affinity regression loss can also be used in multi-scale supervision strategies. Finally, compared with similar binary-form penalty models, experiments show that the proposed model is competitive in computational efficiency.
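
The abstract does not give the formulas behind the affinity regression loss, so the following is only a rough single-scale sketch of the general idea, not the dissertation's actual definition. It assumes that the label affinity is 1 when two pixels share a class and 0 otherwise, that the "square root dot-product kernel" is the sum over classes of the square roots of products of predicted probabilities, and that the two matrices are compared with a mean squared error. All function names are hypothetical, and the multi-scale part is omitted.

```python
import numpy as np


def label_affinity(labels, num_classes):
    """Binary label affinity: entry (i, j) is 1 if pixels i and j share a class."""
    onehot = np.eye(num_classes)[labels.reshape(-1)]  # shape (N, C)
    return onehot @ onehot.T                          # shape (N, N)


def sqrt_dot_affinity(probs):
    """Assumed kernel: K[i, j] = sum_c sqrt(p[i, c] * p[j, c]).

    `probs` has shape (H, W, C) and each pixel's probabilities sum to 1,
    so every entry lies in [0, 1] and the diagonal equals 1.
    """
    root = np.sqrt(probs.reshape(-1, probs.shape[-1]))  # shape (N, C)
    return root @ root.T                                # shape (N, N)


def affinity_regression_loss(probs, labels, num_classes):
    """Assumed regression form: mean squared error between the two matrices."""
    target = label_affinity(labels, num_classes)
    predicted = sqrt_dot_affinity(probs)
    return float(np.mean((predicted - target) ** 2))


# Toy 2x2 image with two classes (made-up data for illustration).
labels = np.array([[0, 0], [1, 1]])
perfect = np.eye(2)[labels]          # one-hot prediction matching the labels
uniform = np.full((2, 2, 2), 0.5)    # uninformative prediction

loss_perfect = affinity_regression_loss(perfect, labels, num_classes=2)
loss_uniform = affinity_regression_loss(uniform, labels, num_classes=2)
print(loss_perfect, loss_uniform)
```

Under these assumptions, a one-hot prediction that matches the labels reproduces the label affinity exactly and gives zero loss, while an uninformative prediction is penalized on every pair of pixels from different classes. In training, such a term would be added to the cross-entropy loss with a weighting factor, which the abstract does not specify.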
Keywords/Search Tags: Deep Learning, Convolutional Neural Network, Semantic Segmentation, Attention, Affinity