Research On Improved Convolutional Neural Network Architecture And Its Applications In Image Semantic Segmentations
Posted on: 2018-03-30 | Degree: Doctor | Type: Dissertation | Institution: University | Candidate: Robail Yasrab | GTID: 1318330512485620 | Subject: Computer software and theory

Abstract/Summary:

Over the past few years, Convolutional Neural Networks (CNNs) have shown outstanding performance and reliable results on object detection, which benefits real-world applications. Tremendous progress has been made in applying CNNs to computer vision, and CNNs have reached state-of-the-art performance in several technology domains, especially image classification and object recognition. This success has led to their adoption in many real-life applications: from biometric systems to real-time applications, nearly every area is heavily influenced by Deep Neural Networks (DNNs). CNNs are one of the key tools that make classification and learning easier and more feasible, and with the support of Graphics Processing Units (GPUs) they have become a preferred instrument for vision-based applications.

However, CNNs are computationally expensive. A CNN-based system demands extensive memory and computational resources, and training on traditional CPUs takes so long that it is impractical. It is therefore very difficult to deploy efficient CNNs in real-time environments, which have limited storage and processing capacity. This creates a need for CNN solutions that offer a simpler structure, improved performance, and better accuracy.

This dissertation focuses on two core topics: proposing a novel CNN architecture with state-of-the-art results, and reducing the computational requirements of the traditional CNN architecture. The main work and contributions are as follows:

1) Scalable architecture has become a key requirement for current vision-based applications. This dissertation proposes the theory and design of scalable architectures for real-time applications of neural networks, using CNNs to design resource-efficient vision-based systems.

2) Road scene understanding for Advanced Driver Assistance System (ADAS) technology is an emerging research area, but it lacks a sufficient amount of training data. To address this issue, this dissertation proposes a Decoupled Convolution Neural Network (DCNN). This architecture is designed to train a CNN with scarce or semi-annotated data. The proposed network exploits heterogeneous annotations, combining a small amount of strong annotations with a large amount of weak annotations.

3) This dissertation presents a simplified, novel fully convolutional architecture for semantic pixel-wise segmentation. Unlike traditional CNN pipelines, it uses only convolution layers and no pooling layers. The key objective is to offer a simpler CNN model that matches benchmark performance and results.

4) CNNs for real-time environments often suffer from two bottlenecks: first, they are often over-parametrized; second, there is a large amount of redundancy in network models. This dissertation proposes a novel resource-efficient semantic segmentation model for probabilistic pixel-wise segmentation, which predicts pixel-wise class labels for a given input image. The proposed network is an encoder-decoder model whose convolutional encoder layers are adopted from the VGG-16 net and whose decoder is inspired by SegNet. The model is intended for road scene understanding and could be a suitable component for video-based ADAS.

5) Compressing a CNN could be a suitable way to implement such systems with lower storage and processing requirements. To this end, this dissertation presents experiments with a variety of network architectures, with the key idea of reducing overall storage and computational requirements. A binarized segmentation network can considerably cut down processing and storage requirements. The proposed network excludes the key multiplication operations in CNN training and replaces them with computationally cheaper operations (addition and subtraction). This binarization procedure is expected to lead to improved performance and results.

The key objective of this research is to propose an efficient CNN architecture for real-time road scene understanding. The proposed network architectures are trained on the well-known CamVid and PASCAL VOC-12 datasets, and several architectures were tested in the search for an efficient design. The proposed networks offer a significant improvement in segmentation results while reducing the number of trainable parameters, and a considerable improvement over the benchmark results on PASCAL VOC-12 and CamVid. The proposed network architectures are also available at www.github.com/robail/.

Keywords/Search Tags: Convolution Neural Networks, Deconvolution, Pooling, Upsampling, Dropout, Binarization
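
The replacement of multiplications by additions and subtractions described in contribution 5 can be illustrated with a small sketch. The abstract does not state which binarization scheme the dissertation uses, so the sign-based rule below is an assumption, and the function names `binarize` and `binary_dot` are hypothetical. The sketch only shows why a dot product with weights restricted to +1 and -1 needs no multiplications.

```python
def binarize(weights):
    """Map each real-valued weight to +1 or -1 by its sign.

    Assumption: deterministic sign binarization, with zero mapped to +1.
    """
    return [1 if w >= 0 else -1 for w in weights]


def binary_dot(inputs, binary_weights):
    """Dot product with +/-1 weights using only additions and subtractions."""
    total = 0.0
    for x, b in zip(inputs, binary_weights):
        if b == 1:
            total += x   # weight +1: add the input
        else:
            total -= x   # weight -1: subtract the input
    return total


# Illustrative values only, not taken from the dissertation.
weights = [0.7, -0.2, 0.05, -1.3]
inputs = [2.0, 3.0, 4.0, 1.0]

bw = binarize(weights)
result = binary_dot(inputs, bw)
print(bw, result)
```

Each binarized weight can also be stored in a single bit instead of a 32-bit float, which is where the reduction in storage would come from under this assumption.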