
Light-weight Convolution Network For Real-time Semantic Segmentation

Posted on: 2021-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: S J Chen
GTID: 2518306470462894
Subject: Control Science and Engineering
Abstract:
Semantic segmentation is the task of classifying every pixel of an image. Compared with image classification or object detection, semantic segmentation requires stronger classification and localization ability. The prediction of a segmentation model directly affects the performance of downstream tasks such as image processing or scene perception, so semantic segmentation is one of the most important computer vision tasks. At present, semantic segmentation is applied to autonomous driving and to portrait segmentation in some entertainment applications. In recent years, with the development of deep learning and of hardware, the performance of computer vision tasks has improved rapidly. In semantic segmentation, a segmentation model can now make predictions end to end, and its accuracy is much higher than that of traditional vision methods.

This thesis uses deep learning based on convolutional neural networks to study semantic segmentation further; its main task is to build a lightweight convolutional neural network for semantic segmentation. At present, most performance-oriented semantic segmentation models rely on large convolutional neural networks for feature extraction. This approach requires many parameters and computations, so the resulting models have high computational cost and are poorly suited to low-cost computing devices such as mobile terminals and embedded devices. Reducing parameter storage and computational cost has therefore become a necessary issue for deploying semantic segmentation models in various industries. This thesis explores this topic.

First, in the research on lightweight convolutional neural networks, we survey current mainstream networks and propose an efficient feature extraction network. By analyzing the structure of existing classic networks, we redesign some of their modules to improve the efficiency of the network and remove a large amount of unnecessary computation. In particular, for the classic network DenseNet, we replace the shallow part of the network and reuse features through element-wise addition instead of feature concatenation. This change reduces the parameters and computation caused by the expansion of the feature map. In addition, we propose a new two-path down-sampling method to reduce the number of parameters. Finally, we modify the pre-activation structure used by DenseNet so that, after training is complete, the model can be further accelerated by post-processing optimization for inference.

Second, in the experiments on feature fusion post-processing for semantic segmentation, we analyze the currently popular Feature Pyramid Network structure and propose a more efficient feature recovery pyramid model, the Feature Pyramid Refine Network, to enhance the discriminative power of the features. In this model, we use an efficient feature reuse method that adjusts the receptive field of the features in each layer by using dilated convolutions with different dilation rates. Used in feature fusion, this method lets the feature map detect small objects while recovering the spatial information of large objects. At the same time, we use a cascade segmentation model to reuse the features of each layer in the feature fusion pyramid, ensuring that the features of each layer contribute to the final prediction. In the end, we achieve 69.7% mIoU on the test set of the Cityscapes dataset, and the entire model runs at 96 frames per second. Compared with the ENet model, our model improves accuracy by 10%, while the speed is greater than 24 frames per second.
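The saving from reusing features by element-wise addition instead of concatenation can be illustrated with a simple weight count. The sketch below is not the thesis architecture: the function names, the 3x3 kernel, the layer width of 32 and the depth of 4 are all hypothetical, and biases and normalization layers are ignored. It only shows why a concatenating block grows more expensive with depth while an additive block does not.

```python
def conv_weights(in_ch, out_ch, kernel=3):
    """Weight count of one kernel x kernel convolution (bias ignored)."""
    return in_ch * out_ch * kernel * kernel


def concat_block_weights(width, layers):
    """DenseNet-style reuse (hypothetical sizes): every layer reads the
    concatenation of all earlier outputs and emits `width` new channels."""
    total = 0
    channels = width
    for _ in range(layers):
        total += conv_weights(channels, width)
        channels += width  # concatenation widens the next layer's input
    return total, channels


def add_block_weights(width, layers):
    """Additive reuse (hypothetical sizes): each output is summed into the
    running feature map, so every layer reads and emits `width` channels."""
    total = 0
    for _ in range(layers):
        total += conv_weights(width, width)
    return total, width


if __name__ == "__main__":
    print("concat:", concat_block_weights(32, 4))
    print("add:   ", add_block_weights(32, 4))
```

Under these assumed sizes, the concatenating block needs 92160 weights and ends with 160 channels, while the additive block needs 36864 weights and stays at 32 channels. The gap widens with depth, because the concatenated input grows linearly with the number of layers.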
Keywords/Search Tags:semantic segmentation, feature re-use, light-weight neural network, attention model