Font Size: a A A

Research On The Algorithms Of Semantic Segmentation And Object Detection Based On Contextual Aggregation

Posted on:2021-01-17Degree:MasterType:Thesis
Country:ChinaCandidate:D C CongFull Text:PDF
GTID:2428330614465673Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
Semantic Segmentation is one of the classic tasks in the field of computer vision.It belongs to high-level visual tasks and plays an extremely important role in image understanding.Semantic Segmentation belongs to the problem of dense pixel classification,which aims to classify every pixel in the image accurately.Object detection is another classic task in the field of computer vision.Compared with semantic segmentation,object detection belongs to the middle level task of computer vision.Its purpose is to classify the objects in the image and find the corresponding bounding box.Semantic segmentation and Object detection are all composed of two sub tasks:classification and location.In recent years,deep learning has developed rapidly in the field of computer vision.As the most important part of deep learning,more and more researchers focus on convolutional neural network(CNN).Compared with the traditional image processing algorithm,CNN can extract image features efficiently.Based on this characteristic,CNN also provides a new research direction for semantic segmentation and object detection tasks.At present,most of the CNNs are designed for object classification,which can not be directly used to solve the task of semantic segmentation or object detection.The deep layer of CNN can extract semantic information very well.Although these semantic information is beneficial to object classification,it lacks a lot of location information;on the contrary,the features extracted from the shallow layer have rich location information,but lack of semantic information.Based on these rearch findings,this thesis mainly carries out the following research:(1)This thesis presents a general semantic segmentation framework-Context Aggregation Network(CAN),which is composed of backbone network and context integration network.It can use the context information of CNN to solve the semantic segmentation task(using the shallow location information to solve the location task,using the deep semantic information to solve the classification task).CAN proposes the context convolutional unit(CCU)to extract and refine the information in the layers of backbone network,and then fuses the high-level semantic information and the low-level location information through the multi-resolution fusion block,and finally produces more accurate semantic segmentation output through the output convolution.In addition,we also use the end-to-end method to train CAN,which is helpful to improve the performance of semantic segmentation algorithm.(2)In order to achieve a better balance between the classification and location subtasks insemantic segmentation and solve the problem of insufficient feature mapping in most semantic segmentation architectures,this thesis proposes Bi-directional Context Aggregation Network(Bi CANet).It is composed of backbone network,Contextual Condensed Projection Block(CCPB),Bi-directional Contextual Interaction Block(BCIB),Channel Attention Block(CAB)and Multi-scale Context Fusion Block(MCFB).In the architecture of Bi CANet,the pooling layer in the backbone network,which is harmful to semantic segmentation,is removed.And we design CCPB to refine and extract the features of backbone network preferably.In order to integrate and utilize the context information in the backbone network,Bi CANet proposes BCIB to better fuse the shallow position information and deep semantic information after refining and extracting,and then the fused features are filtered through the CAB in the dimension of channel.At last,Bi CANet proposes MCFB to generate better semantic score map,so as to obtain better accuracy of semantic segmentation.(3)The task of object detection can not solve the sub tasks of classification and location simultaneously.In order to solve this problem of object detection task and verify that the optimization method for semantic segmentation task proposed in this thesis is also effective in object detection task,we propose Single Shot Contextual Aggregation Network for Object Detection(CADet).Qualitative and quantitative experimental results both certificate that the performance of proposed semantic segmentation algorithm on CITYSCAPES,PASCAL VOC2012 and ADE20 k datasets has reached State Of The Art(SOTA).And the object detection algorithm proposed in this thesis uses the idea of semantic segmentation algorithm proposed in this thesis for reference.It proves that the object detection algorithm which proposed in this thesis has certain advantages on PASCAL VOC2012 dataset,and the optimization idea for semantic segmentation task proposed in this thesis has a certain generality for solving similar problems in object detection task.
Keywords/Search Tags:Deep Learning, Convolutional Neural Network, Semantic Segmentation, Object Detection, Context Aggregation Network, Bi-directional Context Aggregation Network, Single Shot Contextual Aggregation Network for Object Detection
PDF Full Text Request
Related items