Scene Understanding With Convolutional Neural Networks

Posted on:2016-12-19

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Wu

Full Text:PDF

GTID:2308330503956367

Subject:Computer Science and Technology

Abstract/Summary:

Scene understanding is one of the most important problems in computer vision, with applications in robot navigation, driver assistance systems, environmental monitoring, and content-based image retrieval. It is also a fundamental theoretical problem: it aims to teach machines to understand images as humans do, and it remains unsolved. One of its difficulties is generating semantic features that are invariant to translation, rotation, and scale. Inspired by visual perception systems, Convolutional Neural Networks (CNNs) learn discriminative features from large numbers of samples through supervised learning and have achieved great success in computer vision, speech recognition, and other areas. As feature extractors, however, CNNs have drawbacks. Choosing appropriate models and parameters is time-consuming. Although CNNs are translation-invariant within local regions, they are not robust to changes in scale. And the features they learn, especially in higher layers, are hard to interpret.

We used CNNs as feature extractors to address scene understanding, focusing on two tasks: detection of specific traffic signs and scene labeling. We studied CNNs that incorporate multi-scale information and, for each task, the choice of models and training parameters.

Real-time object detection requires bounding boxes to be produced almost instantly, and CNNs demand bounding boxes with precise locations and scales. For traffic sign detection, we transformed RGB images to grayscale images with an SVM, then applied CNNs with fixed filters to the images at different scales to obtain bounding boxes. We used multi-stage CNNs to recognize traffic sign classes. On the German Traffic Sign Detection Benchmark, we ranked 2nd in the “Mandatory” class and 3rd in the “Danger” class.

For scene labeling, we used multi-scale CNNs to label each pixel, exploiting context information at different scales. After the coarse labeling by the CNNs, a fully connected Conditional Random Field corrected some labeling mistakes. We obtained an average pixel accuracy of 79% at a speed of 2 seconds per image on the Stanford Background dataset.
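The abstract does not say how the different image scales are produced. As a minimal sketch of the general idea only, the example below builds an image pyramid by repeated 2x2 average pooling, so that a network or filter with a fixed input size sees more context at each coarser level. The pooling scheme, the number of scales, and the function names `downsample_2x` and `build_pyramid` are assumptions made for illustration, not details taken from the thesis.

```python
def downsample_2x(image):
    """Halve a grayscale image (a list of rows) by averaging 2x2 blocks."""
    # Drop a trailing odd row or column so every block is complete.
    height = len(image) // 2 * 2
    width = len(image[0]) // 2 * 2
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def build_pyramid(image, num_scales=3):
    """Return the image at full, half, quarter, ... resolution."""
    pyramid = [image]
    for _ in range(num_scales - 1):
        current = pyramid[-1]
        if len(current) < 2 or len(current[0]) < 2:
            break  # too small to halve again
        pyramid.append(downsample_2x(current))
    return pyramid


# Toy 8x8 "image" whose pixel value is row index + column index.
image = [[float(r + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, num_scales=3)
sizes = [(len(level), len(level[0])) for level in pyramid]
print(sizes)  # [(8, 8), (4, 4), (2, 2)]
```

In a multi-scale setup of this kind, the same detector or labeling network would be run on every level of `pyramid`, and the per-level outputs would then be mapped back to the original resolution and combined.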

Keywords/Search Tags:

convolutional neural networks, scene understanding, traffic sign detection, scene labeling

Related items

1	Research On Outdoor Scene Understanding Using Deep Convolutional Neural Networks
2	Convolutional Neural Network Based Research On Image Understanding
3	Scene Labeling Based On Convolutional Neural Networks
4	Indoor Scene Understanding Based On Convolutional Neural Network And 3D Geometric Context Information
5	Pixel-wise Scene Understanding Based On Fully Convolutional Networks
6	RGB-D Scene Understanding And Its Optimization
7	Research On Scene Understanding Technology Of Indoor Service Robot Based On Deep Convolution Neural Networks
8	Research On Scene Understanding Algorithms Based On Graph Neural Networks
9	The Research Of Scene Understanding Neural Network Model
10	Research And Implementation On Traffic Signs Detection Based On Intersection Scene