
Research And Application Of Real-Time Instance Segmentation Network Based On Deep Learning

Posted on: 2021-03-07 | Degree: Master | Type: Thesis
Country: China | Candidate: C Q Huang | Full Text: PDF
GTID: 2428330623468521 | Subject: Engineering
Abstract/Summary:
Instance segmentation is a comprehensive computer vision task that requires a deep understanding of the image scene: it identifies each instance in an image at the pixel level, combining classification, localization, and segmentation. The accuracy of deep-learning-based instance segmentation has improved greatly, and the technology has moved beyond the laboratory into practical use in fields such as autonomous driving systems, mobile camera modules, and online video platforms. However, current instance segmentation models based on deep convolutional neural networks usually have complex structures, deep networks, and slow running speeds. Even on high-performance computing hardware, they can predict only about 15 images per second. In most application scenarios the input is a video signal, which delivers more than 30 images per second, so previous models fall far short of the speed requirement. To solve this problem, we study efficient real-time instance segmentation from two aspects: the instance segmentation method and efficient model architecture.

In the deep learning era, instance segmentation research has two main branches. The first is based on object detection: it detects first and then segments. The second is based on semantic segmentation: it distinguishes different instances by post-processing the semantic segmentation results. The latter is more complex and hard to accelerate with hardware, so we choose the first approach and build an efficient instance segmentation algorithm on top of an object detection method.

This thesis proposes a new instance segmentation method called EISNet (Efficient Instance Segmentation Network). EISNet is based on SSD (Single Shot MultiBox Detector), a single-stage object detection algorithm. A parallel prediction branch is added to its detection module to predict combination coefficients for each anchor box. At the same time, a fully convolutional prototype generation branch is attached to the largest feature map to generate a series of full-image segmentation prototypes. Finally, the segmentation prototypes are linearly combined using the combination coefficients, and a Sigmoid function is applied to produce the segmentation mask. The final instance segmentation mask is cropped from this mask with the ground-truth box or the predicted box. The coefficient prediction branch runs in parallel with the original classification and localization branches and shares the same feature processing layers with them, and the prototype generation branch runs in parallel with the original detection branch. Although the network becomes more complex, its inference speed is barely affected compared with the original detection model, mainly because the parallel architecture is easily accelerated by hardware. EISNet retains the high efficiency of SSD and can predict more than 35 images (540x540) per second on a GTX 1080 Ti graphics card.

To improve the robustness of the features, this thesis proposes a modified bi-directional feature pyramid network. An extra bottom-up pathway and a residual connection are added to the original feature pyramid network to help multi-scale feature fusion, so that the fused features carry richer information. In addition, the more effective DIoU localization loss is used, and a semantic segmentation branch is attached to the feature map of the prototype branch during training and removed afterwards. These techniques are designed to help the model converge better.

Finally, to verify the practicality of the algorithm, the proposed instance segmentation model is applied to a Command System of Fire Fighting. We use the model to analyze the video signal in the Command System of Fire Fighting to assist on-the-spot investigation and searching for people.
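The mask assembly step described in the abstract (linear combination of prototypes, Sigmoid, then cropping with a box) can be illustrated with a minimal pure-Python sketch. This is not the thesis implementation: the function names `sigmoid` and `assemble_mask`, the nested-list data layout, the exclusive pixel box convention, and the 0.5 binarization threshold are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def assemble_mask(prototypes, coefficients, box, threshold=0.5):
    """Combine k prototypes into one binary instance mask.

    prototypes:   list of k grids, each H rows by W columns (hypothetical layout)
    coefficients: list of k floats predicted for one anchor box
    box:          (x1, y1, x2, y2) in pixels, x2 and y2 exclusive (assumed convention)
    threshold:    assumed binarization threshold, not taken from the thesis
    """
    height = len(prototypes[0])
    width = len(prototypes[0][0])
    x1, y1, x2, y2 = box
    mask = [[0] * width for _ in range(height)]
    # Only pixels inside the box are evaluated; everything else stays 0 (the crop).
    for y in range(max(0, y1), min(height, y2)):
        for x in range(max(0, x1), min(width, x2)):
            # Linear combination of the prototype values at this pixel.
            logit = sum(c * p[y][x] for c, p in zip(coefficients, prototypes))
            mask[y][x] = 1 if sigmoid(logit) > threshold else 0
    return mask


# Toy example: two 3x3 prototypes; the second one is subtracted from the first.
prototypes = [
    [[1, 1, 0], [1, 1, 0], [0, 0, 0]],
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
]
coefficients = [2.0, -3.0]
mask = assemble_mask(prototypes, coefficients, box=(0, 0, 3, 2))
```

In this toy case the negative coefficient suppresses the region where the two prototypes overlap, which shows how a single set of shared prototypes can yield different masks for different anchors. A practical implementation would do the same computation as one batched matrix multiplication on the GPU.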
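The abstract names DIoU as the localization loss without giving its form. As a hedged illustration, the sketch below follows the commonly published Distance-IoU definition, loss = 1 - IoU + d²/c², where d is the distance between box centers and c is the diagonal of the smallest enclosing box. The function name `diou_loss` and the `(x1, y1, x2, y2)` corner format are assumptions, and the thesis may use a different variant.

```python
def diou_loss(box_a, box_b):
    """Distance-IoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and IoU.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2

    penalty = d2 / c2 if c2 > 0 else 0.0
    return 1.0 - iou + penalty


# Overlapping boxes: IoU = 1/7, center penalty = 2/18.
loss_overlap = diou_loss((0, 0, 2, 2), (1, 1, 3, 3))
# Disjoint boxes: IoU = 0, but the penalty still gives a useful signal.
loss_disjoint = diou_loss((0, 0, 1, 1), (2, 0, 3, 1))
```

Unlike a plain IoU loss, the center-distance term stays informative when the predicted and target boxes do not overlap at all, which is consistent with the abstract's remark that the loss is meant to help the model converge.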
Keywords/Search Tags:deep convolutional neural network, object detection, instance segmentation, bi-directional feature pyramid network, object tracking