Font Size: a A A

Research On Image Semantic Segmentation Method Based On Codec Structur

Posted on:2024-03-14Degree:MasterType:Thesis
Country:ChinaCandidate:S S LiFull Text:PDF
GTID:2568307130958509Subject:Software engineering
Abstract/Summary:PDF Full Text Request
With the development of artificial intelligence technology,semantic segmentation has become a research hotspot in the field of vision.Since there are many interfering factors in real scenes,such as objects being occluded,segmentation targets of different sizes,and highly similar objects,there is room for further improvement in the accuracy of image segmentation.In the image segmentation task,since expanding the perceptual field can better understand the differences between different objects,this paper conducts an in-depth study of image semantic segmentation methods around how to effectively expand the perceptual field,and the research points of this paper are as follows:(1)In the field of semantic segmentation,the segmentation of similar objects can easily cause the category confusion problem,which is due to the association of semantic information contained in deep-level features.In this paper,we propose two methods to capture the semantic information of deep-level features.In the first method,a double-branch spatial pyramid module(DSPM)is designed to pool the deep-level features into multiple regions of different sizes by cascaded atrous convolution operations,and the pooling results of the two branches are connected,which can retain more spatial information.In the second approach,in order to compensate for the "grid effect" of atrous convolution,the dense pyramid attention module(DPAM)is proposed to obtain multi-scale long-range contextual information,firstly,the DPAM captures semantic information based on cascaded cavity convolution,then captures inter-pixel long-range dependencies by strip convolution,and finally,the features of different semantics are fused by densely connected feature map operations to capture richer and deeper semantic information.(2)Features at different levels contain different semantic information,and effective fusion of these features can improve the performance of the model.Features at different levels contain different feature representations,therefore,when fusing features at different levels,some weighting or filtering of the features is required to avoid the influence of useless features on the network.First,this paper designs a Focused Selective Fusion Module(FSFM),which generates spatial weight information for each feature map using spatial masks and converts the input features from spatial domain to frequency domain to generate frequency weights and refuses the features at different levels by adaptive weighting of the above weights.Then,this paper proposes the enhanced fusion module(EFM),which captures global information from the deep feature channel dimension and judges the similarity with the upsampled features to expand the global perceptual field of deep features,and EFM also filters the noise of low-level features to obtain a high response region in the spatial dimension.(3)Based on the first and second research points,two Encoder-Decoder structure-based image semantic segmentation networks are proposed: multi-scale feature-enhanced adaptive fusion network(MFEAFN)and attention-guided filtering with optimized feature network(FRFN).To apply the segmentation model proposed in this paper to realistic street scene segmentation,a codec structure-based road scene segmentation visualization tool is also designed to predict the accurate pixel values for different objects in the street scene and thus segment the corresponding class objects.The front-end of the road scene segmentation tool is developed in HTML5 language and uses Vue front-end technology,while the back end is designed based on Flask framework,loading the model weights trained in Chapter 3 and Chapter 4 by load_state_dict method,and using Flask RESTful API to realize the front and back data interaction.
Keywords/Search Tags:Artificial Intelligence, Semantic Segmentation, Spatial Pyramid Module, Scene Segmentation
PDF Full Text Request
Related items