Spatial And Temporal Context Analysis Of Complex Behaviors In Deep Networks

Posted on: 2020-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Wu
GTID: 2428330575496933
Subject: Electronic and communication engineering
Abstract/Summary:
Spatial-temporal context is widely used as an important cue in object detection and activity recognition. Spatial context models the relationships between objects in a cluttered scene, while temporal context models the motion and temporal information present when an activity occurs. We construct a deep network that incorporates spatial and temporal context to capture the spatial-temporal relationships between objects and scenes accurately and densely. Modeling spatial and temporal context improves the accuracy of object detection and activity recognition, and has important applications in smart cities, video understanding, human-computer interaction, and related areas.

With the rapid development of deep learning, current detection and recognition networks already perform their visual tasks well. However, they often ignore spatial and temporal contextual cues and therefore lose spatial-temporal information. This causes two problems. First, in object detection, ignoring spatial context makes it impossible to model the relationships between objects and scenes, which leads to confusion between foreground and background. Second, in activity recognition, existing models cannot model the temporal context in which activities occur, so motion and temporal information is lost and recognition becomes harder.

Building on existing work in object detection and activity recognition, and after analyzing how context can be described, we construct a deep network based on spatial-temporal context. It addresses, to a certain extent, the following questions. First, how should spatial context be described? What kind of information should be extracted to model it? And how should spatial context cues be integrated with a deep network for object detection? Second, how should temporal context be described? How can temporal information be modeled at different temporal context scales? And how should temporal context be integrated with deep networks for activity recognition?

In response to these questions, this thesis focuses on the following work:
(1) We analyze current methods for describing spatial and temporal context, with a focus on current object detection and activity recognition models. We examine in detail the theoretical basis of current advanced detection and recognition networks, and the strengths and weaknesses of their model structures.
(2) We describe spatial context using co-occurrence semantic information and relative positional information between object classes, which facilitates mining spatial information in images. On this basis, we construct a deep spatial context network that fuses spatial contextual cues to capture spatial relationships between object categories. The network uses these cues to inform the final detection decisions so that object detection is performed more efficiently and accurately.
(3) We use 3D convolution and temporal transition layers to model the temporal information of activities in videos. By inserting temporal transition layers between the residual modules, we construct a deep temporal context network with a 3D structure. The model captures temporal information from motion features as well as temporal information at different temporal context scales. It can therefore constrain the predicted category labels and achieve superior activity recognition performance.
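
The co-occurrence idea in item (2) can be illustrated with a small sketch. This is not the thesis's deep spatial context network: it is a hand-written statistic that only shows how co-occurrence between object classes can adjust detection scores. The function names (`cooccurrence_probs`, `rescore`), the blending `weight`, and the toy labels are all assumptions made for illustration, and relative positional information is left out entirely.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_probs(image_labels):
    """Estimate P(b present | a present) from per-image label lists.

    Hypothetical helper: the result maps (a, b) to the fraction of
    images containing class a that also contain class b.
    """
    single = Counter()
    pair = Counter()
    for labels in image_labels:
        present = set(labels)
        single.update(present)
        for a, b in combinations(sorted(present), 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    return {(a, b): n / single[a] for (a, b), n in pair.items()}


def rescore(detections, cond_prob, weight=0.5):
    """Blend each detector score with its co-occurrence support.

    Support for a detection is the mean of P(label | other) over the
    other detected classes in the same image. `weight` is an assumed
    blending factor, not a value taken from the thesis.
    """
    rescored = []
    for i, (label, score) in enumerate(detections):
        # Other detections in the image with a different class label.
        others = [l for j, (l, _) in enumerate(detections)
                  if j != i and l != label]
        if others:
            support = sum(cond_prob.get((o, label), 0.0)
                          for o in others) / len(others)
        else:
            support = 1.0  # no context available: keep the score as is
        rescored.append(
            (label, (1 - weight) * score + weight * score * support))
    return rescored


# Toy usage: per-image class labels stand in for a training set.
train = [["person", "horse"], ["person", "horse"],
         ["person", "dog"], ["boat"]]
probs = cooccurrence_probs(train)
print(rescore([("person", 0.9), ("horse", 0.6), ("boat", 0.6)], probs))
```

In this toy data, "horse" and "boat" start with the same detector score, but "horse" co-occurs with "person" in the training labels and "boat" does not, so "horse" ends up ranked above "boat". A learned network would replace these fixed statistics with features fused inside the detector.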
Keywords/Search Tags: Object detection, Activity recognition, Spatial-temporal context, Convolutional network, Residual network