
Spatial-temporal Analysis And Multi-granularity Representation Based Human Detection

Posted on: 2010-02-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Z Liu
GTID: 1118360332957764
Subject: Computer application technology
Abstract/Summary:
Human detection is one of the most challenging and active research topics in computer vision and pattern recognition. The topic is attractive for two main reasons: 1) although human detection is a special case of general object recognition, the problems it faces are generic, so its solutions can provide a valuable reference for other topics; 2) demand is increasing in practical applications such as smart surveillance systems, on-board driving assistance systems, and content-based image/video management systems.

The main challenges for human detection come from two aspects: low SNR (signal-to-noise ratio) and weak alignment. Low SNR means that, compared with the noise, the discriminative information in human data is very limited. Weak alignment means that the shape of a human cannot be easily normalized because of articulation and pose variation. Together, these two challenges lead to a huge intra-class variation in human data. From a methodological point of view, we use spatial-temporal analysis and multi-granularity representation to deal with these two challenges. Based on these methodologies, a series of models and feature extraction methods have been developed.

The main contributions of this work can be summarized as follows:

(1) A contour-motion feature for robust pedestrian detection is proposed. Space-time contours are used as the low-level representation of the pedestrian. We then apply a 3D distance transform to extend the 1-dimensional contour into 3-dimensional space. In this way, the relations between the local contours are maintained implicitly. Further, by encapsulating the static and dynamic information with 3D Haar filters, we generate the mid-level pedestrian representation: contour-motion features. A boosting method is then used to select the most representative features. Our experiments demonstrate that the proposed approach outperforms Viola's well-known pedestrian detector in both detection accuracy and generalization ability.
In addition, although our approach is presented in a pedestrian detection scenario, it has been extended to human activity recognition, where remarkable performance has been achieved.

(2) A multi-granularity representation method for human detection is proposed, which we refer to as granularity-tunable gradients partition (GGP). The concept of granularity is used to define the spatial and angular uncertainty of line segments in the Hough space. This uncertainty is then back-projected into the image space by orientation-space partitioning to achieve an efficient implementation. By changing the granularity parameters, the level of uncertainty can be controlled quantitatively, so a family of descriptors with versatile representation properties can be generated. Specifically, finely granular GGP descriptors can represent the specific geometric information of the object (as Edgelet does), while coarsely granular GGP descriptors can provide a statistical representation of the object (as histograms of oriented gradients, HOG, do). Moreover, the position, orientation, strength and distribution of the gradients are embedded into a unified descriptor to further improve the representation power of GGP. A cascade-structured classifier is built by boosting linear regression functions. Experimental results on the INRIA dataset show that the proposed method achieves results comparable to state-of-the-art methods.

(3) A spatial-temporal granularity-tunable gradients partition (STGGP) descriptor for human detection is proposed. This method extends the GGP feature into the spatial-temporal domain, so it has the merits of both spatial-temporal analysis and granularity space representation. In addition, we present three methods to incorporate motion information with the appearance information.
Specifically, in the first method, we represent the human body by two channels, the spatial gradient field and the optical flow field; by calculating the GGP features on these two channels, we extract the appearance and motion information of the human body. In the second method, we extract the GGP features on three orthogonal planes (the X-Y, Y-T and X-T planes) to explore the correlation between the spatial and temporal axes. In the third method, we consider human motions as 3D entities in the spatial-temporal domain and use generalized planes to parse these entities. The generalized plane is defined in the 3D Hough space with explicit angular and spatial uncertainties (the granularity parameters). By varying the granularity parameters, we can generate the granularity space representation of human motion in the spatial-temporal domain. We evaluate these methods for both human detection and activity recognition on public datasets. Experimental results show that the proposed methods yield results comparable to state-of-the-art methods.

(4) A nonparametric background generation method is proposed and used as a preprocessing step for human detection. By this means, the detection speed can be increased and false alarms can be reduced effectively. We introduce a new model, named effect components description (ECD), to model the variation of the background, by which we can relate the best estimate of the background to the modes of the underlying distribution. Based on ECD, an effective background generation method, most reliable background model (MEBM), is developed. The basic computational module of the method is an old pattern recognition procedure, the mean shift, which can be used recursively to find the nearest stationary point of the underlying density function.
The advantages of this method are: first, backgrounds can be generated from image sequences with cluttered moving objects; second, the generated backgrounds are clear and free of blur; third, the method is robust to noise and small vibrations. Extensive experimental results illustrate its good performance.
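
As a rough illustration of the mean shift step described in contribution (4), the following minimal sketch estimates the background intensity of a single pixel as the densest mode of its intensity history. It is not the ECD/MEBM formulation of the dissertation, whose details the abstract does not give. The Gaussian kernel, the bandwidth value, the function names (mean_shift_mode, background_intensity) and the sample data are all assumptions made for illustration.

```python
import math


def mean_shift_mode(samples, start, bandwidth=10.0, tol=1e-3, max_iter=100):
    """Climb from `start` to the nearest stationary point of a Gaussian
    kernel density estimate built over 1-D `samples` (assumed kernel)."""
    x = float(start)
    for _ in range(max_iter):
        # Weight every sample by its kernel response at the current position.
        weights = [math.exp(-0.5 * ((s - x) / bandwidth) ** 2) for s in samples]
        total = sum(weights)
        if total == 0.0:
            break  # no sample is close enough to pull the estimate
        new_x = sum(w * s for w, s in zip(weights, samples)) / total
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


def background_intensity(samples, bandwidth=10.0):
    """Run mean shift from every sample and keep the mode with the
    highest (unnormalized) kernel density as the background estimate."""
    best_mode, best_density = None, -1.0
    for s in samples:
        mode = mean_shift_mode(samples, s, bandwidth)
        density = sum(
            math.exp(-0.5 * ((t - mode) / bandwidth) ** 2) for t in samples
        )
        if density > best_density:
            best_mode, best_density = mode, density
    return best_mode


# Hypothetical intensity history of one pixel: mostly background near 100,
# occasionally covered by brighter or darker moving objects.
pixel_history = [100, 101, 99, 100, 200, 205, 100, 98, 102, 30, 100, 101]
estimate = background_intensity(pixel_history)
foreground_mode = mean_shift_mode(pixel_history, 205)
```

Because the estimate is a mode rather than a mean, the occasional foreground values (200, 205, 30) do not pull it away from the dominant background cluster, which is in line with the abstract's claim that the generated backgrounds are free of blur.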
Keywords/Search Tags:human detection, activity recognition, contour-motion feature (CMF), spatial-temporal analysis, multi-granularity representation, granularity-tunable gradients partition (GGP), space-time granularity-tunable gradients partition (STGGP)