Research On Object Detection In Complex Background

Posted on: 2017-01-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Xiang
GTID: 1108330485488405
Subject: Computer application technology
Abstract/Summary:
Object detection is a fundamental problem in computer vision. Its main task is to find objects in a test image and determine which categories they belong to, according to a predefined recognition model and search strategy. Although much progress has been made, the performance of many detection methods degrades when the appearance of an object changes due to factors such as illumination, viewpoint, pose, and partial occlusion. Meanwhile, the demand for object detection in complex scenes is growing along with applications such as intelligent monitoring, intelligent traffic systems, and face image search.

The work in this thesis is built on a novel Random Forest improved by the Boosting algorithm, termed GBRF (Gradient Boosting Random Forests). Considering the characteristics of objects in various complex scenes, the proposed detection methods combine GBRF with specific image representations. With the learned object model, objects in a test image can be detected using a predefined search strategy. The research focuses on the object model, image representation, fast object detection, and multi-view object detection. The main contributions are summarized as follows:

(1) With the improved GBRF, a general framework for modeling objects with intra-class variations is proposed and applied to face detection. First, the tree-structured classifier in GBRF splits samples layer by layer using selected image features, so that different decision paths represent objects with different appearances. Second, to improve generalization ability and classification accuracy, the Bagging algorithm is adopted to assemble multiple trees, and the Boosting algorithm is adopted to construct the forest in a layer-wise manner. Based on these ideas, a novel face detection method, GBRF + Haar features, is proposed. Experimental results demonstrate competitive performance for face detection.
However, the results also show that a powerful image representation and object model are needed to detect objects with large intra-class variations in complex backgrounds.

(2) To obtain high-level image representations, features computed by deep Convolutional Neural Networks (CNN) are used. Based on CNN features and GBRF, a novel object model is proposed that assembles multiple weak classifiers built on CNN-based local patches. First, according to the mapping between CNN feature maps and the input image, a CNN-based local patch descriptor is proposed, and each training image is represented by a set of local patch descriptors. Then, a multiple-feature split function is proposed to split samples at a node using the selected local patch. Finally, the GBRF is constructed by selecting and assembling multiple discriminative local patches in a layer-wise manner. Experimental results show that the CNN-based local patch descriptor is robust to illumination changes and local deformations, and that the multiple-feature split function has stronger classification ability. Compared with other detection methods, the proposed method demonstrates better performance.

(3) Motivated by the fast computation of the Dominant Orientation Template (DOT), a novel fast pedestrian detection method based on GBRF and adaptive local DOT templates is proposed. First, the adaptive local DOT template is defined by a binary representation, and template matching is computed with bitwise operations. To further accelerate the computation, SSE operations are adopted. Second, local templates of different sizes and at different locations are generated adaptively. Given such local templates, a novel split function defined by template matching is proposed to divide samples at a node in the forest. Finally, the GBRF is constructed by selecting and assembling multiple local DOT templates in a layer-wise manner. During detection, a cascade architecture is proposed to reject the large majority of background windows as early as possible.
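As a rough illustration of the bitwise template matching mentioned in contribution (3), the Python sketch below packs the dominant gradient orientations of each image region into one byte and scores a match by counting regions where template and window share at least one orientation bit. It follows the general DOT idea only. The byte-per-region layout, the eight orientation bins, the threshold-based split, and all function names are assumptions made for illustration, and NumPy vectorization stands in for the SSE instructions named in the abstract.

```python
import numpy as np

def encode_orientations(bins_per_region, n_bins=8):
    """Pack each region's dominant orientation bins into one byte (one bit per bin).

    bins_per_region: list of lists of integer bin indices (assumed layout).
    """
    codes = np.zeros(len(bins_per_region), dtype=np.uint8)
    for i, bins in enumerate(bins_per_region):
        code = 0
        for b in bins:
            code |= 1 << (b % n_bins)
        codes[i] = code
    return codes

def match_score(template, window):
    """Count regions where template and window share an orientation bit."""
    return int(np.count_nonzero(np.bitwise_and(template, window)))

def split_left(template, window, threshold):
    """Hypothetical node split: go left when the match score reaches the threshold."""
    return match_score(template, window) >= threshold

# Usage: a 4-region template compared against a 4-region image window.
template = encode_orientations([[0], [1, 2], [3], [7]])
window = encode_orientations([[0], [2], [4], [7]])
print(match_score(template, window))
```

Because every region is a single byte, many regions can be compared in one wide AND operation, which is the property that makes this kind of template attractive for fast sliding-window detection.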
Experimental results show that the proposed method detects pedestrians quickly and accurately. Furthermore, the proposed local DOT templates are robust to partial occlusion.

(4) To solve the multi-view car detection problem, an extension of the traditional Hough voting detection method is proposed. It allows visual words to be shared among multiple view-related subcategories and accumulates votes with discriminative combination weights for objects in different views. First, GBRF is used to cluster image patches according to appearance and offset vectors. From the clustering results, a compact definition of a visual word is proposed. With such visual words, many scattered votes are discarded and the Hough voting process can be defined in a more intuitive manner. Then, a novel Hough voting method is defined by combining discriminative weights and shared visual words. The score of a hypothesis is given by the maximum vote over all subcategories. Finally, the discriminative combination weight vectors of visual words for the different subcategories are learned by unsupervised sub-categorization and a multi-class linear SVM. Experimental results demonstrate better performance for multi-view car detection than other methods, and the votes accumulated for an object center are more consistent in location.
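The scoring rule in contribution (4), where shared visual words cast votes that are weighted per subcategory and a hypothesis takes the maximum over subcategories, can be sketched in Python as follows. This is a minimal illustration under assumed data structures, not the thesis implementation: the vote tuple format, the weight matrix shape, and the function name `hough_vote` are hypothetical, and the weights are given by hand here rather than learned by an SVM.

```python
import numpy as np

def hough_vote(votes, weights, shape):
    """Accumulate weighted votes per subcategory and take the maximum.

    votes:   iterable of (word_id, x, y, strength), where (x, y) is the
             object center predicted by a matched visual word (assumed format).
    weights: array of shape (n_subcategories, n_words) holding one
             combination weight per visual word and subcategory.
    shape:   (height, width) of the Hough accumulator.
    """
    n_sub = weights.shape[0]
    acc = np.zeros((n_sub, shape[0], shape[1]), dtype=np.float64)
    for word, x, y, strength in votes:
        if 0 <= y < shape[0] and 0 <= x < shape[1]:
            # The same visual word contributes to every subcategory,
            # scaled by that subcategory's weight for the word.
            acc[:, y, x] += weights[:, word] * strength
    score = acc.max(axis=0)    # hypothesis score: maximum over subcategories
    view = acc.argmax(axis=0)  # subcategory (view) that produced the maximum
    return score, view

# Usage: two subcategories, two visual words, a 3x4 accumulator.
weights = np.array([[1.0, 0.0],
                    [0.2, 2.0]])
votes = [(0, 2, 1, 1.0), (1, 2, 1, 1.0), (0, 0, 0, 1.0)]
score, view = hough_vote(votes, weights, (3, 4))
best = np.unravel_index(score.argmax(), score.shape)
print(best, view[best])
```

Keeping one accumulator per subcategory lets the same set of visual words serve all views, while the per-subcategory weights decide how much each word counts for a given view.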
Keywords/Search Tags:Random Forests, Object Detection, Boosting Algorithm, Convolutional Neural Network, Hough Voting