
Deep Learning Based Image Semantic Segmentation And Its Application

Posted on: 2021-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: F Chen
GTID: 2428330614966008
Subject: Computer technology
Abstract/Summary:
Image semantic segmentation assigns a predefined label to each pixel in an image and is widely used in autonomous driving, medical image analysis, and scene understanding. Recently, deep learning has achieved outstanding performance in computer vision tasks. However, in the 2D case, semantic segmentation networks are often limited by their fixed geometric structure; in the 3D case, point-cloud-based object detectors mainly rely on point clusters and ignore semantic information. For the challenges in 2D image processing, this thesis develops more flexible network components; for 3D LiDAR processing, it exploits 2D semantic segmentation results to enhance 3D object detection.

First, to improve the ability to model geometric transformations, this thesis proposes the Dynamic Attention Network for semantic segmentation (DAN). The method uses deformable convolution to build a more powerful feature aggregation module that accurately captures object-relevant content. In addition, it organizes a densely connected encoder-decoder network in which the gradients at each level are propagated through the whole model in order. Together, these two components form the dynamic attention mechanism of the model and strengthen its geometric modeling ability. The proposed method achieves state-of-the-art performance on two semantic segmentation benchmarks, outperforming existing segmentation methods.

Second, to further improve geometric transformation modeling at the level of basic components such as the convolution layer, this thesis proposes the Adaptive Deformable Convolutional Network (A-DCN). Specifically, the method reformulates deformable convolution by introducing an adaptive dilation factor that models the relative distance between the sampling locations given by the learned offsets; this distance information is then fed into channel attention. In this way, spatial attention and channel attention, which are originally separate, interact with each other. To verify its effectiveness, the regular convolutions in state-of-the-art networks for various computer vision tasks are replaced with the adaptive deformable convolution, and the experiments indicate that it further improves their original performance.

Third, to use the developed 2D expertise to compensate for the limited semantic detail available to 3D object detectors, this thesis introduces a 3D detection method termed Semantic Frustum Based Sparsely Embedded Convolutional Detection (SFB-SECOND). Here, 2D semantic segmentation and object detection are employed to refine the shape and location of objects of interest in the point cloud: all potential targets are first detected and segmented into two confidence-related regions, and this accurate, discriminative object information is then passed to the 3D point cloud detector. The method employs a confidence-score-based loss function and outperforms existing state-of-the-art detectors on the KITTI benchmark.
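To make the A-DCN idea concrete, the following is a minimal PyTorch sketch of an adaptive deformable convolution block, assuming a design in which a regular convolution predicts the sampling offsets, the mean offset magnitude serves as the adaptive dilation factor, and that factor conditions a channel-attention branch. The module and variable names are illustrative, not the thesis's actual code.

```python
# Hedged sketch of an adaptive deformable convolution block in the spirit of A-DCN.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class AdaptiveDeformBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # Regular conv predicts 2 offsets (dx, dy) per kernel sampling location.
        self.offset_conv = nn.Conv2d(in_ch, 2 * k * k, kernel_size=k, padding=k // 2)
        self.deform_conv = DeformConv2d(in_ch, out_ch, kernel_size=k, padding=k // 2)
        # Channel attention conditioned on the offset statistics (assumed design).
        self.channel_attn = nn.Sequential(
            nn.Linear(out_ch + 1, out_ch // 4),
            nn.ReLU(inplace=True),
            nn.Linear(out_ch // 4, out_ch),
            nn.Sigmoid(),
        )

    def forward(self, x):
        offset = self.offset_conv(x)                       # (N, 2*k*k, H, W)
        feat = self.deform_conv(x, offset)                 # deformable sampling
        # "Adaptive dilation factor": mean displacement of the sampling locations.
        dilation = offset.abs().mean(dim=(1, 2, 3)).unsqueeze(1)   # (N, 1)
        pooled = feat.mean(dim=(2, 3))                     # global average pooling, (N, out_ch)
        attn = self.channel_attn(torch.cat([pooled, dilation], dim=1))
        return feat * attn.unsqueeze(-1).unsqueeze(-1)     # channel-wise re-weighting
```

In this sketch the spatial branch (offsets) and the channel branch interact only through the scalar dilation statistic; richer couplings are possible, but this is the smallest version of the "offset distance feeds channel attention" idea described above.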
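The frustum step of SFB-SECOND can likewise be illustrated with a short numpy sketch, assuming a KITTI-style setup: LiDAR points (already transformed to the camera frame) are projected with a 3x4 projection matrix, and points whose projections fall inside a 2D detection box are kept as candidate frustum points for the 3D detector. The function name, matrix P, and box format are placeholders, not the thesis's interface.

```python
# Hedged sketch of extracting a semantic frustum from a 2D detection box.
import numpy as np

def frustum_points(points, P, box2d):
    """points: (N, 3) points in the camera frame; P: (3, 4) projection matrix;
    box2d: (x1, y1, x2, y2) from a 2D detector or segmentation mask."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])   # (N, 4) homogeneous coords
    proj = homo @ P.T                                           # (N, 3) image-plane coords
    u = proj[:, 0] / proj[:, 2]                                 # pixel column
    v = proj[:, 1] / proj[:, 2]                                 # pixel row
    x1, y1, x2, y2 = box2d
    in_box = (u >= x1) & (u <= x2) & (v >= y1) & (v <= y2) & (proj[:, 2] > 0)
    return points[in_box]                                       # candidate frustum points
```

In the full method these frustum points would additionally be split by segmentation confidence into the two confidence-related regions before being passed to the sparse 3D detector.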
Keywords/Search Tags:semantic segmentation, geometric transformation, LIDAR, 3D object detection