Font Size: a A A

Visual Object Recognition Algorithm Based On Deep Convolutional Neural Networks

Posted on:2018-02-19Degree:MasterType:Thesis
Country:ChinaCandidate:M L SunFull Text:PDF
GTID:2348330542481060Subject:Electronic and communication engineering
Abstract/Summary:PDF Full Text Request
Along with the growing popularity of mobile smart devices,there is an urgent need to apply artificial intelligence technology to the real products.In addition,visual object recognition is one of the core basic researches of artificial intelligence.In fact,learning and matching visual objects is difficult for a number of reasons,such as variable viewpoint,scale,illumination or complicated background that may clutter the object.Therefore designing a robust,accurate and efficient object recognition algorithm remains a challenging task for academia research and industry applications.In recent years,deep learning algorithm is rising gradually;especially the very deep convolutional neural network has demonstrated breakthrough performance in kinds of visual tasks.The breakthrough of the object recognition drives more and more researchers devote themselves to the study of deep learning.Visual object recognition is a classification problem which is aimed at judging whether there is a certain object in the image.Three main methods are adopted,including data preprocessing,feature extraction and classifier prediction.The key point of object recognition system lies in the feature extractor,and the very deep convolutional neural network is one of the most popular feature extraction algorithms in recent years.Further studying the above algorithm details,we found that the deeper,the wider.The wider convolutional layer can't effectively remain large difference between the feature maps,thereby influencing the discrimination ability of the output feature.In view of this,the proposed method is implemented by applying unshared convolution across the channel dimension and applying shared convolution across the spatial dimension in some computational layers.Typically,hand-crafted pooling operations are used to aggregate information within a region,but they are not guaranteed to minimize the training error.To overcome this drawback,we proposed a learning pooling operation in which one shared linear combination of the neurons in the region is learned for each feature channel.In addition,inspired by the learning pooling operation,we propose one simplified convolution operation to replace the traditional convolution which consumes many extra parameters.The simplified convolution greatly reduces the number of parameters while maintaining comparable performance.The main contributions of this paper are as follows:(1)we proposed to apply unshared convolution across the channel dimension and apply shared convolution across the spatial dimension in the traditional convolutional layer.(2)To overcome the drawback of the hand-crafted pooling,a self-learning pooling operation was proposed.Additionally,inspired by it,we proposed one simplified convolution operation.(3)It has been demonstrated that our approaches outperform the state-of-the-art of deep convolutional neural networks in cifar10 and cifar100 dataset.
Keywords/Search Tags:Convolutional Neural Networks, Deep learning, Visual Object Recognition
PDF Full Text Request
Related items