
Individual-level Instance Segmentation

Posted on: 2020-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Xu
Full Text: PDF
GTID: 2428330623463651
Subject: Computer Science and Technology
Abstract/Summary:
Instance segmentation is a core problem in computer vision. Because it has a strong impact on real-world applications, such as distinguishing pedestrians from vehicles in autonomous driving systems or providing recognition ability in scene understanding systems, the task has been studied thoroughly. The subject of study has also shifted from pure RGB information in 2D images to RGB-D information. In this work, we start from classic RGB images and then go a step further to the RGB-D instance segmentation problem. In addition, we propose an individual-level instance segmentation task based on this research.

Mainstream instance segmentation focuses on RGB images, since RGB is the most basic and widely used data format. However, many objects cannot be distinguished from RGB alone, such as a 500 ml drink bottle in a supermarket and its 600 ml counterpart. The two generally resemble each other, and even humans have trouble telling them apart. The individual-level segmentation proposed in this work aims to effectively recognize such objects, which are similar in appearance and differ only in size. It is closely related to fine-grained recognition, and can be regarded as the finest-grained recognition task. Typical fine-grained recognition seeks a discriminative region and uses it as the basis for classification. Although our objective is also to recognize fine-grained objects, we do not seek the most discriminative region; instead, we incorporate depth information, which can be obtained from a depth camera or Lidar. Depth records not only the distance from the camera but also the 3D shape of objects, which also helps to recognize fine-grained objects. With the spread of depth cameras and Lidar, incorporating depth to facilitate instance segmentation will be at the cutting edge.

As existing datasets seldom support such problems, we propose a novel data pipeline: we scan objects into 3D models, place them into a preset 3D scene to synthesize reasonable layouts, render the scene to 2D images, and then transfer the style from CG-like to real-like with a novel GAN-based method. On top of the 3D-model-based approach, we also explore a cut-and-paste synthesis approach using multi-view object segment patches.

Current instance segmentation frameworks can be roughly divided into top-down and bottom-up approaches. Top-down approaches usually outperform bottom-up ones, but because they first detect object proposals and then segment the instance mask from each proposal, their runtime depends linearly on the number of objects in an image. When the number of objects is large, this can introduce latency. To address this, we propose a novel top-down instance segmentation framework that encodes object shapes into vectors and reconstructs the shapes through a tensor-operation decoding step. The whole process is very efficient, and it is also the first top-down instance segmentation solver whose runtime is independent of the number of objects.
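
The idea of encoding object shapes into vectors and decoding them with a tensor operation can be illustrated with a small sketch. The abstract does not specify the encoding used in the thesis, so the code below assumes a plain linear (PCA-style) shape basis purely for illustration; the function names `fit_shape_basis`, `encode`, and `decode`, the toy rectangular masks, and the 0.5 threshold are all hypothetical. What the sketch shows is that every instance mask in a batch is reconstructed by one matrix product instead of a per-instance loop.

```python
import numpy as np

def fit_shape_basis(masks, k):
    """Fit a k-dimensional linear shape basis (PCA via SVD) on masks of shape (n, h, w)."""
    flat = masks.reshape(masks.shape[0], -1).astype(np.float64)
    mean = flat.mean(axis=0)
    # Rows of vt are orthonormal directions in flattened-mask space,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:k]

def encode(masks, mean, basis):
    """Project each mask onto the basis, giving one k-dimensional vector per instance."""
    flat = masks.reshape(masks.shape[0], -1).astype(np.float64)
    return (flat - mean) @ basis.T  # shape (n, k)

def decode(codes, mean, basis, shape, threshold=0.5):
    """Reconstruct all masks at once with a single matrix product, then binarize."""
    recon = codes @ basis + mean  # shape (n, h * w)
    return (recon > threshold).reshape(-1, *shape)

# Toy data (assumption): six rectangular "instance masks" of different sizes.
masks = np.zeros((6, 16, 16), dtype=bool)
for i in range(6):
    masks[i, 2:4 + 2 * i, 3:5 + i] = True

mean, basis = fit_shape_basis(masks, k=5)
codes = encode(masks, mean, basis)
decoded = decode(codes, mean, basis, (16, 16))
print(codes.shape, np.array_equal(decoded, masks))
```

In this toy setting the basis is fitted on the same six masks it reconstructs, so five components are enough to recover them; a practical system would presumably learn the shape code from a large training set and predict the vectors with a network.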
Keywords/Search Tags: computer vision, instance segmentation, semantic segmentation, 3D object modeling, RGB-D segmentation, generative adversarial network