
Research on Efficient Semantic Segmentation Methods for Scene Perception

Posted on: 2021-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: W X Tu
Full Text: PDF
GTID: 2518306122474644
Subject: Computer Science and Technology
Abstract/Summary:
Scene perception is an essential task in computer science, intelligent science, and robotics. It plays an important role in autonomous driving, human-computer interaction, satellite remote sensing, and other fields. Semantic segmentation partitions a scene at the pixel level by densely assigning each pixel to one of a set of predefined categories. Owing to its rich semantics, precise localization, and intuitive results, semantic segmentation has become one of the main solutions for scene perception tasks. However, current semantic segmentation methods for complex scenes still suffer from inefficient computation, large numbers of parameters, and low inference speed in big data environments. Given both the direction of research and the practical demands of users, how to achieve accurate and efficient object identification, localization, and scene parsing with semantic segmentation in resource-constrained environments has gradually become a research focus in scene perception, with important scientific value and practical significance.

This thesis therefore focuses on improving the comprehensive performance of semantic segmentation methods. We design several semantic segmentation models for complex scene perception based on relevant theories and technologies. The proposed methods can achieve better overall performance than the main existing schemes in terms of segmentation accuracy, inference speed, network parameters, and computational complexity. The main contributions are as follows:

1. We propose a semantic segmentation method called Dense Connection and Attention-based Network (DCANet). First, the Hybrid Dilation-based Dense Block (HDDB) is designed to extract dense semantic information. Second, the Attention-based Multi-scale Module (AMM) is presented to jointly encode global context and multi-scale local contexts. The multiple kinds of context information are then transformed into a set of attention weight vectors, which are applied back to the original input for feature refinement to generate the final output.

2. We propose a semantic segmentation method called Context-aware and Feature Fusion Network (CFFNet), which mainly consists of an Enhanced Atrous Spatial Pyramid Pooling (EASPP) module and a Lightweight Decomposed Residual Block (LDRB). First, we add a group of factorized convolutions with different kernel sizes into the original spatial pyramid pooling module to improve the effectiveness of context encoding. Second, we design the LDRB to enhance the interaction between multi-level features with small overhead. Finally, multi-scale context information and spatial details are fused to obtain the final output.

3. We propose a semantic segmentation method called Context-Integrated and Feature-Refined Network (CIFReNet), which mainly consists of a Long-skip Refinement Module (LRM) and a Multi-scale Context Integration Module (MCIM). First, we establish a long-skip connection with a channel attention mechanism to provide a highway and proper guidance for low-level information learning. Then we design the Dense Semantic Pyramid (DSP) block to capture multi-perspective dense context information near the target. Three DSP blocks are stacked in a cascade manner to decrease the computation cost and enlarge the receptive field.

We demonstrate the efficiency and effectiveness of the proposed methods on the Cityscapes, CamVid, and Helen benchmark datasets. Specifically, DCANet outperforms FCN and SegNet in both accuracy and efficiency. CFFNet obtains a better trade-off between accuracy and efficiency than FCN and DeepLabV2. Finally, the proposed CIFReNet achieves an overall performance improvement over the main existing semantic segmentation methods in terms of segmentation accuracy, inference speed, network parameters, and computational complexity.
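The abstract does not give layer configurations, so the following is only a rough sketch of why factorized convolutions, such as those added to the EASPP module in CFFNet, can keep the parameter count small. It assumes the common factorization of a k x k convolution into a k x 1 convolution followed by a 1 x k convolution; the channel width of 64 and the helper names `conv_params` and `factorized_conv_params` are hypothetical and are not taken from the thesis.

```python
def conv_params(c_in, c_out, k_h, k_w, bias=False):
    """Number of learnable weights in a standard 2D convolution layer."""
    return c_in * c_out * k_h * k_w + (c_out if bias else 0)


def factorized_conv_params(c_in, c_out, k, bias=False):
    """Weights in a k x 1 convolution followed by a 1 x k convolution.

    Assumption: the first layer maps c_in -> c_out channels and the
    second layer keeps c_out channels.
    """
    first = conv_params(c_in, c_out, k, 1, bias)
    second = conv_params(c_out, c_out, 1, k, bias)
    return first + second


if __name__ == "__main__":
    channels = 64  # hypothetical channel width, chosen only for illustration
    for k in (3, 5, 7):
        full = conv_params(channels, channels, k, k)
        fact = factorized_conv_params(channels, channels, k)
        print(f"k={k}: full={full}, factorized={fact}, ratio={fact / full:.2f}")
```

Under these assumptions the ratio of factorized to full parameters is 2/k, so the saving grows with kernel size. This is one plausible reason for combining several kernel sizes in factorized form when the parameter budget is tight.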
Keywords/Search Tags: scene perception, semantic segmentation, multi-level feature fusion, multi-scale context, attention mechanism