
The Research On Image Semantic Segmentation Algorithm Based On Self-attention Mechanism

Posted on: 2021-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Liu
Full Text: PDF
GTID: 2428330620476716
Subject: Computer technology
Abstract/Summary:
Image semantic segmentation is one of the key technologies in computer vision. It predicts a label for every pixel in an image, so that pixels of different classes are labeled and segmented. Building on deep learning and fully convolutional networks, image semantic segmentation algorithms have developed rapidly. Compared with traditional image segmentation algorithms, their accuracy and speed have greatly improved, and they can meet the practical needs of some complex scenes. They are now widely used in fields such as autonomous driving, medical diagnosis, and machine navigation. This thesis introduces in detail the development and basic principles of convolutional neural networks and image semantic segmentation algorithms. Among these algorithms, Deeplabv3+ is one of those with high segmentation accuracy. The thesis studies its model structure, identifies its shortcomings, and proposes improvements, which are verified by experiments.

First, to address the problem that Deeplabv3+ tends to lose local detail when capturing global image information, we propose a new image semantic segmentation algorithm, Dv3P-SA, which combines a self-attention mechanism with Deeplabv3+. By computing the relationship between every pair of pixels, the self-attention mechanism obtains dense pixel-level context information without losing local image detail. This compensates for the original Deeplabv3+ algorithm's weak ability to recognize two related pixels that are far apart in the image, and strengthens the algorithm's representation ability.

Second, to address the problem that Deeplabv3+ does not learn enough detail for individual classes, which lowers segmentation accuracy for those classes, we propose a method that combines a multi-class classification model with multiple binary classification models. The binary classification models are obtained by decomposing the multi-class task and are trained separately. The fused model combines the advantages of both: it obtains global information through the multi-class model and refines the details of each class with the binary models, which ultimately improves segmentation accuracy.

The experiments mainly use the Pascal VOC 2012 dataset. The results show that the segmentation accuracy of both Dv3P-SA and the fusion model is significantly higher than that of the original Deeplabv3+ algorithm. Comparative experiments on the Cityscapes dataset further demonstrate the effectiveness and generality of the improved algorithms.
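
As a rough illustration of the pairwise pixel relationship that the self-attention mechanism computes, the following minimal Python sketch applies scaled dot-product self-attention to a tiny feature map. It is not the Dv3P-SA implementation: the function name `pixel_self_attention`, the identity query/key/value projections, the scaling factor, and the residual addition are all assumptions made for illustration, and the abstract does not specify how the thesis configures the mechanism.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def pixel_self_attention(feature_map):
    """Hypothetical helper: self-attention over an H x W grid of C-dim vectors.

    Returns (output_map, attention). attention[i][j] is the weight pixel i
    assigns to pixel j, with pixels flattened in row-major order.
    """
    height, width = len(feature_map), len(feature_map[0])
    pixels = [vec for row in feature_map for vec in row]  # N = H * W vectors
    channels = len(pixels[0])
    scale = math.sqrt(channels)

    # Assumption: queries, keys and values are the raw features (identity
    # projections). A trained model would use learned projections instead.
    attention = [softmax([dot(q, k) / scale for k in pixels]) for q in pixels]

    # Each pixel's context is a weighted sum over ALL pixels, so distant but
    # similar pixels can contribute as much as neighbouring ones.
    context = [
        [sum(w * p[c] for w, p in zip(weights, pixels)) for c in range(channels)]
        for weights in attention
    ]

    # Assumption: a residual addition keeps the original local features.
    fused = [[x + y for x, y in zip(p, ctx)] for p, ctx in zip(pixels, context)]
    output_map = [fused[r * width:(r + 1) * width] for r in range(height)]
    return output_map, attention


# Toy 2 x 2 feature map with two channels per pixel.
fmap = [[[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [0.0, 1.0]]]
out, attn = pixel_self_attention(fmap)
```

Because every pixel attends to every other pixel, the attention matrix has (H·W)² entries, so a module of this kind is usually applied to a downsampled feature map rather than to the full-resolution image.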
Keywords/Search Tags: Convolutional Neural Network, Semantic Segmentation, Self-Attention Mechanism, Binary Classification