Image Semantic Segmentation Method Based On Attention Mechanism

Posted on:2022-02-19

Degree:Master

Type:Thesis

Country:China

Candidate:Y J Huang

GTID:2518306737956379

Subject:Control Science and Engineering

Abstract/Summary:

In the information age, computer vision addresses the important problem of enabling computers to understand the real world. As a link between computers and the real world, image semantic segmentation plays a crucial role. Semantic segmentation currently has important applications in fields such as visual autonomous navigation, intelligent security monitoring, medical imaging diagnosis, and unmanned aerial vehicles. Although existing methods meet industrial requirements to some extent, problems remain, such as fixed receptive fields and inconsistent predictions for target objects. To improve the overall performance and semantic consistency of semantic segmentation, this thesis builds on the ResNet backbone. The main research work is as follows:

1. The limitations of multi-scale fusion algorithms are discussed and multi-scale information acquisition is analyzed. A method that combines a position attention mechanism with bilinear-interpolation downsampling is proposed to acquire multi-scale information from a given feature map. Using the joint position attention module as a switch for further feature extraction, the corresponding feature information from feature maps of different scales is collected, both to improve the accuracy of image semantic segmentation and to design a more robust semantic segmentation network.

2. To address the problem that some feature information in the skip layer cannot be fully used, the thesis proposes using a channel attention mechanism to fuse shallow and deep information. The shallow and deep features are concatenated along the channel dimension; global average pooling, the nonlinear Sigmoid activation function, and other operations then let the deep feature information guide the shallow feature information, and finally the two are integrated. Placing a channel attention extraction step in the skip layer helps integrate the deep and shallow layers better while learning the relevance between pixels.

3. To address the excessive computation of attention learning in computer vision tasks, the thesis proposes replacing the similarity-matrix computation in the attention mechanism with pooling, to reduce computational complexity as much as possible. Average, maximum, and stochastic pooling are used to obtain different features in the channel and position dimensions, so that discriminative features are extracted effectively at lower computational complexity.

The proposed methods and modules are applied to the ResNet backbone, and experimental evaluation and analysis are carried out on the PASCAL VOC 2012 and Cityscapes datasets. The semantic segmentation networks using the above methods achieve higher prediction accuracy.
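The channel attention fusion described in item 2 of the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the abstract names only concatenation, global average pooling, and Sigmoid, so the linear projection (`weight`, `bias`), the assumption that the deep features have already been resized to the shallow features' shape, and the final integration by element-wise addition are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention_fuse(shallow, deep, weight, bias):
    """Fuse shallow and deep feature maps with a channel gate.

    shallow, deep: arrays of shape (C, H, W). The deep map is assumed to
        have been upsampled and projected to match the shallow map.
    weight: (C, 2C) hypothetical learned projection.
    bias:   (C,)    hypothetical learned bias.
    Returns the fused map (C, H, W) and the per-channel gate (C,).
    """
    if shallow.shape != deep.shape:
        raise ValueError("shallow and deep must have the same shape")

    # Concatenate along the channel dimension: (2C, H, W).
    stacked = np.concatenate([shallow, deep], axis=0)

    # Global average pooling: one descriptor per channel, shape (2C,).
    pooled = stacked.mean(axis=(1, 2))

    # Project to C channels and squash to (0, 1) with Sigmoid.
    gate = sigmoid(weight @ pooled + bias)

    # The gate re-weights the shallow channels (deep guides shallow).
    guided = shallow * gate[:, None, None]

    # Integrate the two branches; addition is an assumption here.
    return guided + deep, gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    channels, height, width = 4, 8, 8
    shallow = rng.standard_normal((channels, height, width))
    deep = rng.standard_normal((channels, height, width))
    weight = rng.standard_normal((channels, 2 * channels)) * 0.1
    bias = np.zeros(channels)

    fused, gate = channel_attention_fuse(shallow, deep, weight, bias)
    print(fused.shape, gate.shape)
```

Because the gate is computed from pooled statistics, its cost depends on the channel count only and not on the spatial resolution, which is one reason a gate of this kind is inexpensive to place in a skip connection.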

Keywords/Search Tags:

Semantic Segmentation, Multi-Scale Feature Fusion, Attention Mechanism, Skip-Layer, Semantic Consistency

Related items

1	Research On Semantic Segmentation-Oriented Attention Mechanism And Multi-Scale Feature Cross-Layer Fusion
2	Research On Semantic Segmentation Technology Based On Scene Analysis
3	Research On Semantic Segmentation Techniques Based On Enhanced Attention Mechanism
4	Multi-task Semantic Segmentation Method Based On Attention And Feature Fusion
5	Research On Image Semantic Segmentation Algorithm Based On Encoder-decoder Structure
6	Research On Real-Time Semantic Segmentation Method Based On Feature Fusion
7	Research On Low Illumination Semantic Segmentation Method Based On Attention Feature Alignment And Domain Adaptation
8	Image Semantic Segmentation Based On Multi-level Feature Fusion And Attention Mechanism
9	Research On Image Semantic Segmentation Algorithm Based On Feature Enhancement
10	Research Of Efficient Semantic Segmentation Methods For Scene Perception