
Research On Semantic Segmentation Networks Based On Multi-feature Attention

Posted on: 2022-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhao
Full Text: PDF
GTID: 2518306323951239
Subject: Software engineering
Abstract/Summary:
In image semantic segmentation, one of the central difficulties is how to achieve an efficient balance between the computational complexity and the segmentation accuracy of a convolutional neural network. To pursue high segmentation accuracy, a deep network such as ResNet-50 or Xception65 is usually used as the backbone, combined with small-stride downsampling to preserve detail and avoid losing too much fine-grained information; the disadvantage is that parameters and computation increase dramatically, so the computational complexity of the network is high. A representative algorithm of this kind is DeepLabV3+ proposed by Google. To pursue low computational complexity and keep the network as close to real time as possible, lightweight networks such as ShuffleNetV2 and MobileNetV2 are usually used as backbones; the disadvantage is that segmentation accuracy is limited.

To reduce the cost of network training and greatly reduce computational complexity while maintaining high accuracy, this thesis proposes a multi-feature attention effective aggregation module (MAEA). The MAEA module selects low-level features of different resolutions extracted at different stages of the backbone network, each of which retains the detail information captured at its stage. MAEA processes these multiple features with a spatial attention mechanism to generate semantic-level attention, so that the model focuses on the spatial regions of interest in the feature maps, and it learns the semantic flow between adjacent features from low resolution to high resolution. The semantic-level attention of the multiple features is aggregated to highlight important feature regions, yielding high-resolution features with strong semantic representation and providing the decoder with enough detail information to achieve high-precision segmentation.
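The following is a minimal PyTorch sketch of the multi-feature attention aggregation idea described above. The channel widths, the number of stages, and the exact way attention and upsampling are combined are illustrative assumptions, not the configuration used in the thesis.

```python
# A minimal sketch of multi-feature attention aggregation in the spirit of MAEA.
# Stage channel widths and layer choices are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialAttention(nn.Module):
    """Produces a per-pixel attention map from a feature map."""

    def __init__(self, in_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, x):
        return torch.sigmoid(self.conv(x))  # (N, 1, H, W) attention weights


class MAEASketch(nn.Module):
    """Aggregates low-level features of different resolutions.

    Each stage feature is projected to a common width, weighted by spatial
    attention, upsampled to the finest resolution, and summed.
    """

    def __init__(self, stage_channels=(256, 728, 1024), out_channels=256):
        super().__init__()
        self.projs = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in stage_channels
        )
        self.attns = nn.ModuleList(
            SpatialAttention(out_channels) for _ in stage_channels
        )

    def forward(self, features):
        # `features` is a list ordered from high resolution to low resolution.
        target_size = features[0].shape[-2:]
        fused = 0
        for feat, proj, attn in zip(features, self.projs, self.attns):
            f = proj(feat)
            f = f * attn(f)  # highlight important spatial regions
            f = F.interpolate(f, size=target_size, mode="bilinear",
                              align_corners=False)
            fused = fused + f
        return fused
```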
Aiming at the balance between the segmentation accuracy and the computational complexity of a semantic segmentation network, this thesis proposes MAEA-DeepLab, a deep network built on the effective multi-feature attention aggregation module. MAEA-DeepLab slightly modifies the Xception65 network, using a variant with an output stride of 16 as the encoder backbone. Extracting small-resolution features with large-stride downsampling would normally lose too much detail and hurt segmentation accuracy; however, once the MAEA module is embedded in the decoder, sufficient detail can be recovered to achieve high-precision segmentation. At the same time, extracting low-resolution high-level features greatly reduces the network's parameters and computation, realizing the balance between high segmentation accuracy and low computational complexity.

To verify the effectiveness of the multi-feature attention semantic segmentation network, this thesis runs semantic segmentation benchmarks on the PASCAL VOC 2012 and Cityscapes datasets. MAEA-DeepLab is not pre-trained on the COCO dataset and uses only two RTX 2080 Ti GPUs. On the PASCAL VOC 2012 and Cityscapes test sets it reaches mIoU scores of 87.5% and 79.9% respectively. MAEA-DeepLab has 78.9M parameters and 395.8G FLOPs, only 30.9% of the computation of the DeepLabV3+ architecture, so the computational complexity is greatly reduced while the segmentation accuracy is almost equal to that of DeepLabV3+. Because only two RTX 2080 Ti GPUs were available, training on the high-resolution Cityscapes images was insufficient, which limits accuracy on that dataset. Moreover, the computational complexity of MAEA-DeepLab is still larger than that of lightweight networks.

Aiming at the problem that MAEA-DeepLab is not lightweight enough, this thesis further proposes MAEA-MobileNet, a lightweight semantic segmentation network based on the effective multi-feature attention aggregation module. MAEA-MobileNet has 5.0M parameters and 44G FLOPs. On the PASCAL VOC 2012 and Cityscapes test sets it reaches mIoU scores of 78.1% and 74.3% respectively, so the network achieves high segmentation accuracy while remaining lightweight. MAEA-MobileNet outperforms most mainstream lightweight semantic segmentation networks such as LiteSeg and can be deployed on mobile terminals with real-time application requirements, achieving a balance between segmentation accuracy and computational complexity.
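Below is a hypothetical assembly of a lightweight encoder-decoder along the lines of MAEA-MobileNet: a MobileNetV2 backbone whose intermediate features feed the MAEASketch module from the sketch above, followed by a simple classification head. The stage split points, channel counts, and head are assumptions made for illustration; the thesis configuration may differ.

```python
# Hypothetical lightweight encoder-decoder sketch; reuses MAEASketch from the
# previous snippet. Stage indices and channel counts are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import mobilenet_v2


class MAEAMobileNetSketch(nn.Module):
    def __init__(self, num_classes=21):
        super().__init__()
        backbone = mobilenet_v2(weights=None).features
        # Split the backbone so intermediate features can be tapped
        # (split points chosen for illustration, not from the thesis).
        self.stage1 = backbone[:4]    # ~1/4 resolution, 24 channels
        self.stage2 = backbone[4:7]   # ~1/8 resolution, 32 channels
        self.stage3 = backbone[7:14]  # ~1/16 resolution, 96 channels
        self.maea = MAEASketch(stage_channels=(24, 32, 96), out_channels=128)
        self.classifier = nn.Conv2d(128, num_classes, kernel_size=1)

    def forward(self, x):
        size = x.shape[-2:]
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        fused = self.maea([f1, f2, f3])          # aggregate multi-stage features
        logits = self.classifier(fused)
        return F.interpolate(logits, size=size, mode="bilinear",
                             align_corners=False)


if __name__ == "__main__":
    model = MAEAMobileNetSketch(num_classes=21)
    out = model(torch.randn(1, 3, 224, 224))
    print(out.shape)  # torch.Size([1, 21, 224, 224])
```

The forward pass at the end is only a shape check with a dummy input; actual training would add a loss, data loading, and the optimization schedule used in the thesis.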
Keywords/Search Tags: Semantic segmentation, Encoder-Decoder, MAEA, Spatial attention