The Research On Spatio-temporal Information In Crowds Behavior Understanding

Posted on: 2018-10-11  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y K Li  Full Text: PDF
GTID: 1368330542465724  Subject: Communication and Information System
Abstract/Summary:
Computer vision has paved a wide avenue for addressing general video analysis. Eventually, this progress will enable the automation of vision tasks such as surveillance. In particular, grasping the motion dynamics of people in crowds, especially in dense public spaces, is an essential step toward meeting many vision challenges, such as crowd traffic management or carrying out preventive security measures. Thanks to the advent of powerful processing facilities, modeling human behavior now appears to be within reach. In this respect, several topics have been tackled recently. One emerging direction is crowd motion pattern understanding in indoor and outdoor spaces. It is deemed a rather complicated task due to the variability of the motion flows undertaken by different individuals, especially in dense settings where the intricacy multiplies. Precisely, the motion trends of an individual are consistent with, and often dependent upon, his/her spatial context in a given dense crowd setting. This suggests that the behavior of a person in videos (e.g., walking direction) can be modeled by studying his/her spatio-temporal context. To deal with these challenges, we propose the following:

(1) Inter-person occlusion handling with social interaction for online multi-pedestrian tracking
Inter-person occlusion handling is a critical issue in the field of tracking, and it has been extensively researched. Several state-of-the-art methods have been proposed, such as focusing on the appearance of the targets or utilizing knowledge of the scene. In contrast with the approaches proposed in the literature, we address this issue using a social interaction model, which allows us to explore spatio-temporal information pertaining to the targets involved in the occlusion situation. Our experimental results on the PETS and TUD-Crossing datasets are promising compared with those obtained using other methods.

(2) Encoding motion features for pedestrian path prediction in crowds
Pedestrian path forecasting is one of the recently emerging applications in visual crowd analysis and modeling. Among the attempts put forth so far, only a few have considered the ongoing interaction between agents as a key factor in determining their walking trends in a given scene. To this end, we propose an effective and efficient framework for pedestrian path prediction in crowded scenes. First, motion features related to the target pedestrian and his/her nearby neighbors are extracted. Second, an autoencoder feature learning model is adopted to further enhance the representativeness of the extracted features. Finally, a Gaussian Process Regression model is used to infer the potential future trajectories of the target pedestrians given their walking history in the scene. Evaluated on a challenging dataset, our method yielded promising results and outperformed traditional methods in the literature.

(3) Deep joint modeling of spatio-temporal cues for crowd behavior understanding
Pedestrian behavior understanding in crowds is regarded as a very challenging task due to the complexity of the motion dynamics co-occurring across a given scene. Although notable research efforts have been made recently, temporal and spatial dependencies in crowds are usually treated separately. In this context, we present a deep end-to-end approach that considers the spatio-temporal information jointly, leading to a richer video encoding and thus a better understanding of crowd behavior. Precisely, the displacement information describing crowd motion patterns, which is extracted from tracklets/trajectories, is fed into a convolutional layer in order to learn the underlying motion patterns and produce high-level representations. These representations are fed into a Long Short-Term Memory based architecture in order to encode the underlying spatio-temporal cues in one shot for the whole crowd in a given scene. We evaluate our approach on widely used large-scale benchmark datasets under three critical applications, namely pedestrian path forecasting, destination estimation, and crowd scene classification. The results show a substantial gain over recent related work.
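
As an illustration of the final step of the path prediction framework in contribution (2), the following is a minimal sketch of Gaussian Process Regression applied to a pedestrian's walking history. It is not the dissertation's implementation: the abstract does not state the kernel, the hyperparameters, or the input representation, so everything here is an assumption. In particular, the sketch regresses raw x and y coordinates on the frame index, skips the neighbor features and the autoencoder stage, and uses a hypothetical bias + linear + RBF kernel with hand-picked values. The names `kernel`, `gp_predict`, and `predict_path` are illustrative only.

```python
import math


def kernel(a, b, bias_var=100.0, slope_var=100.0, rbf_var=1.0, length=1.5):
    """Bias + linear + RBF covariance between two time stamps.

    All hyperparameter values are assumptions chosen for illustration.
    """
    rbf = rbf_var * math.exp(-((a - b) ** 2) / (2.0 * length ** 2))
    return bias_var + slope_var * a * b + rbf


def cholesky(matrix):
    """Return the lower-triangular Cholesky factor of an SPD matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def solve_spd(matrix, rhs):
    """Solve matrix @ alpha = rhs for a symmetric positive-definite matrix."""
    lower = cholesky(matrix)
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):  # forward substitution: L z = rhs
        z[i] = (rhs[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i]
    alpha = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T alpha = z
        tail = sum(lower[k][i] * alpha[k] for k in range(i + 1, n))
        alpha[i] = (z[i] - tail) / lower[i][i]
    return alpha


def gp_predict(times, values, query_times, noise_var=0.01):
    """GP posterior mean at query_times given noisy observations (times, values)."""
    n = len(times)
    gram = [
        [kernel(times[i], times[j]) + (noise_var if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    alpha = solve_spd(gram, values)
    return [sum(kernel(q, times[i]) * alpha[i] for i in range(n)) for q in query_times]


def predict_path(history, horizon):
    """Forecast `horizon` future (x, y) positions from an observed track.

    Coordinates are expressed relative to the first observed position so the
    zero-mean GP prior does not depend on where the track starts.
    """
    x0, y0 = history[0]
    times = [float(t) for t in range(len(history))]
    future = [float(len(history) + k) for k in range(horizon)]
    xs = gp_predict(times, [x - x0 for x, _ in history], future)
    ys = gp_predict(times, [y - y0 for _, y in history], future)
    return [(x0 + dx, y0 + dy) for dx, dy in zip(xs, ys)]


# Synthetic walking history: constant velocity, six observed frames.
history = [(0.5 * t, 1.0 + 0.2 * t) for t in range(6)]
forecast = predict_path(history, horizon=3)
```

The linear term in the assumed kernel lets the forecast continue the observed walking direction instead of reverting to the prior mean, which a pure RBF kernel would do shortly after the last observation. In the framework described above, the regression inputs would instead be the autoencoder-enhanced motion features of the target and its neighbors.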
Keywords/Search Tags: Image processing, computer vision, crowds behavior understanding, multi-pedestrians tracking, occlusion handling, path forecasting, crowds scene classification