Research On Depth Information Based Recognition Of Behaviors With Different Complexities

Posted on: 2016-06-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Z Qu
GTID: 1318330461953054
Subject: Computer system architecture
Abstract/Summary:
Humans and the objects around them in daily life are three-dimensional. Humans perceive the three-dimensional information of objects, including shape, texture, and depth. Traditional computer vision, following the principles of human vision, studies processing and recognition in 2D color images, onto which 3D objects and scenes are projected through the camera coordinate system. Although 2D computer vision offers abundant methods and well-performing results, its development has hit a bottleneck because one dimension of information is lost. Recognition and classification tasks in 2D are constrained by occlusion, illumination variation, and changes in viewing angle. The fundamental cause of these constraints is the loss of depth information, that is, the distance from the object to the camera. Combining 2D information with depth information allows the 3D information of an object or scene to be reconstructed. Therefore, the analysis of depth information can effectively advance computer vision research.
This paper analyzes depth-based recognition of behaviors with different complexities. The complexity of a behavior is defined by how the depth information is used. We divide depth information into three categories: two-dimensional static depth data, three-dimensional dynamic depth data of a single point, and three-dimensional dynamic depth data of multiple points. The first part of this paper takes fire and smoke detection as a case study of behavior detection using outlier depth value detection on 2D static depth data. The second part takes the authentication and recognition of handwriting in the air as a case study of behavior recognition based on three-dimensional dynamic depth data of a single point.
The third part of this paper takes human interaction behavior recognition as a case study of behavior recognition based on three-dimensional dynamic depth data of multiple points. Three-dimensional dynamic depth data refers to a single point or multiple points with valid depth values over a time series. We use a Kinect camera as the depth camera for data acquisition. The three depth-based behavior recognition problems are presented in order of increasing complexity. The main contents and results are as follows:
(1) Recognition of behaviors based on two-dimensional static depth information. Because of their physical properties, fire and smoke cannot be measured directly by a depth sensor and show abnormal values in the depth image. We exploit this characteristic to detect smoke and fire from a single depth frame. The basic procedure is as follows: the depth image and RGB image are first calibrated based on the correlation between depth and RGB data; smoothing and denoising are then performed by building a depth background model; candidate regions of fire and smoke are located by filtering the depth frame against the background model; finally, the corresponding regions in the RGB image are analyzed to confirm fire or smoke. The algorithm detects smoke and fire under various illumination conditions. Unlike traditional 2D image-based smoke detection methods, it can detect smoke in a completely dark environment.
(2) Recognition of behaviors based on three-dimensional dynamic depth information of a single point. We take handwriting in space as a case study of human behavior recognition based on three-dimensional dynamic depth data of a single point. Because handwriting in space is a user-friendly means of human-computer interaction, our research focuses on two aspects of 3D handwriting: handwriting authentication and handwriting recognition.
Both authentication and recognition use depth data to detect and track the 3D finger position and generate a three-dimensional trajectory, which is then preprocessed and from which features are extracted. Specifically, the authentication problem is to identify the user by analyzing the three-dimensional trajectory, that is, to determine whether a three-dimensional signature belongs to the claimed user. The recognition problem is to recognize the content of the three-dimensional trajectory. For authentication, this paper proposes five attack models in which an attacker imitates the 3D signature in five different ways to attack the authentication procedure. The template of each 3D signature is selected by Dynamic Time Warping (DTW) distance calculation. Measuring the DTW distance between a 3D handwriting sample and its template effectively distinguishes real users from attackers. For handwriting recognition, this paper proposes two algorithms for recognizing the 10 handwritten digits (0-9). The first is an online recognition algorithm based on a distance feature vector, and the second is an offline recognition algorithm based on a Deep Belief Net (DBN). Specifically, the online algorithm employs DTW distance and a Support Vector Machine (SVM), while the offline algorithm trains a deep belief net model. The online algorithm works on the time series information in the 3D trajectory. It first generates the distance feature vector by calculating the DTW distance between one handwriting sample and all training samples, then classifies the distance feature vector using the SVM. Experimental results show that the online algorithm with the distance feature vector can effectively distinguish the 10 handwritten digits (0-9) from a small number of training samples.
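The distance feature vector used by the online algorithm can be illustrated with a minimal sketch. The Python below is an illustration under stated assumptions, not the dissertation's implementation: the function names `dtw_distance` and `distance_feature_vector`, the Euclidean local cost, and the toy trajectories are all hypothetical. It computes the DTW distance from one 3D trajectory to every training trajectory; the resulting vector is what a classifier such as an SVM would receive (the SVM stage is omitted here).

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two sequences of 3D points.

    Assumption: Euclidean distance is the local cost; the source does not
    say which local cost was used.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def distance_feature_vector(sample, training_set):
    """One DTW distance per training trajectory (hypothetical helper)."""
    return [dtw_distance(sample, t) for t in training_set]


# Toy trajectories (made up for illustration, not from the dataset).
line = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
corner = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]
# Same stroke as `line`, but the writer lingers on the first point.
slow_line = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

training = [line, corner]
features = distance_feature_vector(slow_line, training)
```

In this toy case the first feature is 0 because the slowed-down stroke warps exactly onto `line`, while the second is positive. The length of the feature vector equals the number of training samples, so its size grows with the training set.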
For the online algorithm, the accuracy is 99.1% with 20 training samples per class and remains 98.1% even with only 5 training samples. On the other hand, this paper uses a Deep Belief Net model to classify handwriting offline, projecting the 3D handwriting trajectory onto a 2D image that serves as the classifier input. It performs well even when the input image is very small. Experimental results show that the deep belief network needs more training data than the DTW+SVM algorithm does but can achieve better results. To make the proposed method more robust, we collected experimental data over more than 5 months, gathering more than 6000 3D handwritten digit samples and more than 2000 3D handwritten signatures.
(3) Recognition of behaviors based on three-dimensional dynamic depth information of multiple points. Human pose is well represented by the structure of human skeleton points, so human behaviors can be characterized by skeleton points in a time series. We take human interaction recognition as a case study of behavior recognition based on three-dimensional dynamic depth information of multiple points. The 3D coordinates of 20 human skeleton joints can be calculated by matching the depth information against a human body model. This paper uses these 3D skeleton points to represent human interaction behavior and introduces an unsupervised method for interaction recognition. The method rests on two assumptions: 1. Two people engaged in an interaction perform the same action. 2. A sequence of human actions is composed of interaction actions and non-interaction actions, and is largely periodic.
Based on these two assumptions, this paper first detects the interaction action periods in the video and then matches these periods using a similarity measurement to determine whether an interactive behavior is present. The method needs no prior information about the interactive actions; it uses similarity metric learning to find the best metric for matching similar interaction actions. For the similarity measurement, this paper proposes a neighborhood DTW distance metric learning algorithm based on a weighted combination of each feature's distance. The weights are optimized iteratively over the sum of all distances. The learned metric pulls samples of the same class closer together and pushes samples of different classes further apart. It better describes the similarity of actions and improves the accuracy of interactive behavior recognition. We collected 10 interactive actions with a total of 100 videos for the experiments. In the experiments, the proposed interaction period detection algorithm achieves 80%-90% accuracy depending on the threshold setting, and the proposed neighborhood DTW distance metric learning algorithm achieves 89.6% accuracy in classifying the 10 interactive actions.
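The weighted combination of per-feature DTW distances can be sketched as follows. This is only an illustration under assumptions: the abstract does not specify the features, the neighborhood construction, or the weight-update rule, so the sketch treats each skeleton joint trajectory as one feature, uses fixed hand-picked weights, and does not reproduce the metric learning step. The names `weighted_dtw`, `"hand"`, and `"head"` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW with Euclidean local cost (an assumed choice)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def weighted_dtw(action_a, action_b, weights):
    """Weighted sum of per-feature DTW distances.

    action_a, action_b: dict mapping a feature name to a list of 3D points.
    weights: dict mapping the same feature names to non-negative weights.
    In the source the weights are learned iteratively; here they are fixed.
    """
    return sum(w * dtw_distance(action_a[name], action_b[name])
               for name, w in weights.items())


# Two toy actions that differ only in the hand trajectory (made-up data).
action_a = {
    "hand": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "head": [(0.0, 2.0, 0.0), (0.0, 2.0, 0.0)],
}
action_b = {
    "hand": [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    "head": [(0.0, 2.0, 0.0), (0.0, 2.0, 0.0)],
}

weights = {"hand": 0.75, "head": 0.25}  # hypothetical, not learned
combined = weighted_dtw(action_a, action_b, weights)
```

Here the hand trajectories are at DTW distance 2 and the head trajectories at distance 0, so the combined distance is 0.75 × 2 = 1.5. Raising the weight of a discriminative feature increases the distance between dissimilar actions, which is the effect a learned metric aims for.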
Keywords/Search Tags:depth information, outlier detection, 3D trajectory, distance feature vector, neighbourhood DTW distance, metric learning