Hyperparameter Optimization of Deep Convolutional Neural Networks Architectures for Object Recognition

Posted on:2019-01-20

Degree:Ph.D

Type:Dissertation

University:University of Bridgeport

Candidate:Albelwi, Saleh

GTID:1478390017487471

Subject:Computer Science

Abstract/Summary:

Recent advances in Convolutional Neural Networks (CNNs) have produced promising results on difficult deep learning tasks. However, the success of a CNN depends on finding an architecture that fits the given problem. Hand-crafting an architecture is a challenging, time-consuming process that requires expert knowledge and effort because of the large number of architectural design choices. In this dissertation, we present an efficient framework that automatically designs a high-performing CNN architecture for a given problem. In this framework, we introduce a new objective function that combines the error rate with the information learnt by a set of feature maps, obtained using deconvolutional networks (deconvnet). The new objective function allows the hyperparameters of the CNN architecture to be optimized in a way that enhances performance by guiding the CNN through better visualization of the learnt features via deconvnet. The optimization of the objective function itself is carried out with the Nelder-Mead Method (NMM). Further, our new objective function results in much faster convergence towards a better architecture. The proposed framework can explore a CNN architecture's numerous design choices efficiently, and it also allows effective, distributed execution and synchronization via web services. Empirically, we demonstrate that the CNN architecture designed with our approach outperforms several existing approaches in terms of error rate. Our results are also competitive with state-of-the-art results on the MNIST dataset and perform reasonably against state-of-the-art results on the CIFAR-10 and CIFAR-100 datasets. Our approach plays a significant role in increasing the depth, reducing the stride sizes, and constraining some convolutional layers not to be followed by pooling layers, in order to find a CNN architecture that achieves high recognition performance.

Moreover, we evaluate the effect of reducing the size of the training set on CNNs, using a variety of instance selection methods to speed up training. We then study how these methods impact classification accuracy. Many instance selection methods require a long run time to obtain a representative subset of the dataset, especially if the training set is large and high-dimensional. One example of these algorithms is Random Mutation Hill Climbing (RMHC). We improve RMHC so that it runs faster than the original algorithm with the same accuracy.
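To make the instance selection idea concrete, the following is a minimal sketch of the baseline RMHC procedure as it is commonly described for prototype selection. It is not the improved variant developed in the dissertation, whose details the abstract does not give. The sketch assumes that a candidate subset is scored by 1-nearest-neighbour accuracy on the full training set and that one mutation swaps a single selected instance for an unselected one. All names (`rmhc_select`, `accuracy`, `make_toy_data`) and the toy two-cluster dataset are hypothetical and serve only as illustration.

```python
import random


def nearest_label(x, prototypes):
    """Return the label of the prototype closest to x (squared Euclidean distance)."""
    best = min(prototypes, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p[0])))
    return best[1]


def accuracy(data, selected):
    """1-NN accuracy on the full training set, using only the selected instances."""
    prototypes = [data[i] for i in selected]
    hits = sum(nearest_label(x, prototypes) == y for x, y in data)
    return hits / len(data)


def rmhc_select(data, subset_size, iterations, seed=0):
    """Baseline RMHC (assumed form): keep a mutated subset if it is not worse."""
    if not 0 < subset_size < len(data):
        raise ValueError("subset_size must be between 1 and len(data) - 1")
    rng = random.Random(seed)
    selected = rng.sample(range(len(data)), subset_size)
    best_acc = accuracy(data, selected)
    for _ in range(iterations):
        candidate = list(selected)
        # Mutation: swap one selected instance for a currently unselected one.
        pos = rng.randrange(subset_size)
        unselected = [i for i in range(len(data)) if i not in selected]
        candidate[pos] = rng.choice(unselected)
        acc = accuracy(data, candidate)
        if acc >= best_acc:  # accept ties so the search can move across plateaus
            selected, best_acc = candidate, acc
    return selected, best_acc


def make_toy_data(n_per_class=20, seed=1):
    """Hypothetical toy data: two Gaussian clusters in 2-D, one per class."""
    rng = random.Random(seed)
    data = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (5.0, 5.0)]):
        for _ in range(n_per_class):
            data.append(((cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)), label))
    return data


if __name__ == "__main__":
    data = make_toy_data()
    subset, acc = rmhc_select(data, subset_size=4, iterations=50)
    print("selected indices:", sorted(subset))
    print("1-NN accuracy with the subset:", acc)
```

Each iteration re-evaluates the whole training set against the candidate subset, which illustrates why such methods become slow when the training set is large and high-dimensional.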

Keywords/Search Tags:

CNN, Architecture, Convolutional, Networks, Results, Objective function, Optimization

Related items

1	Research On The Effect Of Selection Of Control System Objective Function On Optimization Results Quality
2	VLSI Optimizations And Implementations For Convolutional Neural Networks
3	Research On Bi-objective Optimization Models And Algorithms For Service Function Chainings Mapping In Elastic Optical Networks
4	Research On Multi-objective Convolutional Neural Architecture Search Algorithm Based On Population Coevolution
5	A PSO-based CNN Algorithm For Keywords Selection On Google Ads
6	Research On Evolutionary Algorithms Based On Decomposition For Multi-objective And Many-objective Optimization Problems
7	Research On Multi-Objective Artificial Physics Optimization Algorithm And Its Application
8	Research On Channel Pruning Algorithm Of Convolutional Neural Networks
9	Optimal Design Of Key Operator Circuits For Convolutional Neural Networks In Image Recognition
10	PLS Algorithm And Auxiliary Function Method For Multi-objective Optimization