
Study On Moving Human Body Detection In Real Time Video Surveillance System

Posted on: 2010-11-02; Degree: Master; Type: Thesis
Country: China; Candidate: D Y Luan; Full Text: PDF
GTID: 2178360272996380; Subject: Computer application technology
Abstract/Summary:
This thesis studies moving human body detection in video surveillance and presents a working video-based detection system.

Video surveillance research covers camera calibration, motion segmentation and tracking, object classification, multi-camera fusion, and high-level semantic understanding, and it has become one of the most active research directions in computer vision. It has many practical applications and considerable potential economic value, and it has attracted great interest from research institutes and researchers. Within this field, human detection is a principal research direction; it applies not only to intelligent video surveillance but also to intelligent human-machine interfaces, virtual reality, and other areas.

This thesis divides moving human detection into two subproblems: first, motion detection segments the moving regions in the video; second, each extracted motion region is classified. Motion detection is studied here under the assumption of a static camera, which is simpler than the moving-camera case. Human recognition is handled with machine learning. Both the motion detection stage and the human recognition stage pose many difficulties.

With a static camera, the most direct way to detect moving objects in real time is frame differencing, while the most frequently used method is background subtraction. Frame differencing computes the difference between two consecutive frames and thresholds it, yielding a binary image in which the motion region consists of white pixels. To handle slowly moving objects, double frame differencing can be used: three consecutive frames are taken and differenced twice.
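The frame differencing step can be illustrated with a minimal sketch. It assumes grayscale frames stored as lists of rows, a fixed threshold, and that double differencing combines the two consecutive difference masks with a logical AND; the abstract does not state how the two differences are combined, so that choice and the function names are illustrative only.

```python
def frame_difference(prev, curr, threshold):
    """Binary motion mask: 1 where |curr - prev| exceeds threshold, else 0."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def double_difference(f0, f1, f2, threshold):
    """AND of the two consecutive difference masks (assumed combination rule)."""
    d01 = frame_difference(f0, f1, threshold)
    d12 = frame_difference(f1, f2, threshold)
    return [
        [a & b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(d01, d12)
    ]


# Tiny 1x5 grayscale frames: a bright pixel moves one step right per frame.
f0 = [[10, 200, 10, 10, 10]]
f1 = [[10, 10, 200, 10, 10]]
f2 = [[10, 10, 10, 200, 10]]

mask_single = frame_difference(f1, f2, threshold=50)       # [[0, 0, 1, 1, 0]]
mask_double = double_difference(f0, f1, f2, threshold=50)  # [[0, 0, 1, 0, 0]]
```

In this toy case the single difference marks both the old and the new position of the object, while the double difference keeps only its position in the middle frame.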
Background subtraction is a frequently used method for motion segmentation. The algorithm estimates a background model that does not include moving objects, locates the motion region by computing the difference between the current frame and the background model, and dynamically updates the background model according to the detection result. Background subtraction algorithms differ mainly in the background model and the update algorithm they adopt. The most common approach describes the probability distribution of each pixel's gray value with a statistical model; in practice, the Gaussian distribution is used most often. When updating the background model, we use different update coefficients for different detection results, in order to decide whether to change the previous distribution. Background subtraction is used extensively in many applications, not only video surveillance but also virtual reality, teleconferencing, and three-dimensional modeling.

To achieve high speed, we adopted three-frame differencing combined with blob merging. Three-frame differencing is an extension of frame differencing and is simple to implement: a pixel is considered legitimately moving if its intensity has changed significantly both between the current image and the last frame and between the current image and the next-to-last frame. During detection, the threshold needs to be updated in real time to adapt to changes in the background. Frame differencing, however, is generally not effective at extracting the entire shape of a moving object: pixels interior to an object of uniform intensity are not included in the set of "moving" pixels.
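The per-pixel Gaussian background model with result-dependent update coefficients, described earlier in this passage, might look roughly like the sketch below. It assumes a single Gaussian per pixel and a running-average update; the parameter names and values (`k`, `alpha_bg`, `alpha_fg`, `min_var`) are hypothetical and are not taken from the thesis.

```python
import math


def update_background(mean, var, frame, k=2.5,
                      alpha_bg=0.05, alpha_fg=0.005, min_var=4.0):
    """One step of a per-pixel single-Gaussian background model.

    Returns the foreground mask; mean and var are updated in place.
    Pixels detected as foreground use the smaller coefficient alpha_fg,
    so moving objects are absorbed into the background only slowly.
    """
    mask = []
    for y, row in enumerate(frame):
        mask_row = []
        for x, value in enumerate(row):
            diff = value - mean[y][x]
            is_fg = abs(diff) > k * math.sqrt(var[y][x])
            alpha = alpha_fg if is_fg else alpha_bg
            mean[y][x] += alpha * diff
            var[y][x] = max(min_var,
                            (1 - alpha) * var[y][x] + alpha * diff * diff)
            mask_row.append(1 if is_fg else 0)
        mask.append(mask_row)
    return mask


# A 1x3 background near gray value 100; the middle pixel is covered by an object.
mean = [[100.0, 100.0, 100.0]]
var = [[25.0, 25.0, 25.0]]
frame = [[102, 180, 99]]
mask = update_background(mean, var, frame)  # [[0, 1, 0]]
```

Using two coefficients is one simple way to realize "different update coefficients for different detection results"; other schemes, such as freezing foreground pixels entirely, would fit the description equally well.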
To overcome this problem, we adopt an algorithm that recovers the pixels interior to the object and merges the blobs into a single whole. After obtaining the foreground image, we extract the connected regions from this binary image, which gives the size and location of each moving object. The connected regions are extracted with a two-pass labeling algorithm.

After the motion region is obtained, the next task is to classify it, that is, to judge whether or not it belongs to a human. This is a typical pattern recognition problem, and we use a machine learning method to classify the unknown region. The basic procedure of human recognition based on machine learning is to extract features of the object and then select a suitable machine learning algorithm to train a classifier on these features. Finally, the trained classifier is used to classify the object. The features can be shape descriptors (such as contour), color descriptors (such as skin color), or a combination of other features.

For feature selection, we adopt a combination of Haar-like and HOG features. A Haar-like feature, also called a rectangle feature, is the difference between the sums of the pixels inside different rectangles; we adopted five kinds of rectangle features in this thesis. The Histogram of Oriented Gradients (HOG), based on contour and gradient, is a histogram formed by projecting the gradients of all pixels in a rectangle onto a set of directions. Each detection window consists of overlapping blocks, and each block is composed of several cells. For each cell, all the gradients are projected onto several directions to form a histogram. The histograms of the cells are then concatenated into one large feature vector, which is normalized, so that each block has a corresponding HOG feature. Because the number of features is large, we introduced the integral image.
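As a rough illustration of why the integral image helps with rectangle features, the sketch below builds a summed-area table and evaluates one Haar-like feature from it. The table layout (an extra zero row and column) and the two-rectangle feature are assumptions chosen for brevity; the abstract does not specify which five rectangle features were used.

```python
def integral_image(img):
    """Summed-area table with a zero top row and left column.

    ii[y][x] holds the sum of all pixels above and to the left of (y, x).
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle from four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] + ii[top][left] - ii[top][right] - ii[bottom][left]


def two_rect_feature(ii, top, left, height, width):
    """Hypothetical two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 3, 4)            # 78, the whole image
inner = rect_sum(ii, 1, 1, 2, 2)            # 34 = 6 + 7 + 10 + 11
feature = two_rect_feature(ii, 0, 0, 3, 4)  # -12 = 33 - 45
```

Once the table is built, every rectangle sum costs the same four lookups regardless of rectangle size, which is what makes evaluating a large feature set affordable.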
The integral image can be computed in one pass over the original image, after which the sum of the pixels within any rectangle can be computed with one addition and two subtractions, so computation is dramatically faster. When extracting HOG features, every HOG feature is a 36-dimensional vector, since every block is composed of 4 cells. We therefore used the Fisher linear discriminant to reduce its dimension.

To train the human classifier, we adopted the AdaBoost algorithm. The weak classifiers are trained first, and then the strong classifier. The training procedure is in effect a feature selection procedure. First, every sample is assigned an initial weight. All samples are then classified by every weak classifier, and the error rate of each weak classifier is computed as the sum of the weights of the samples it misclassifies. A weak classifier with a smaller error rate is therefore a better one. In each round of feature selection, the weak classifier with the smallest error rate is selected. Before the next round, the sample weights are updated: the weights of correctly classified samples are lowered and the weights of misclassified samples are raised. As a result, samples that are hard to classify receive more attention in the next round.

In testing, the system achieved good detection results at a detection speed of 30 frames per second.

Building on the work of other researchers, this thesis proposes several new ideas. To address the drawback of frame differencing, we proposed a blob merging algorithm. We also extended the Haar-like feature set and scanned the detection window at multiple scales. Finally, we reduced the dimension of the HOG feature, avoiding the use of an SVM.
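The AdaBoost training loop described above can be sketched as follows. This is a minimal version assuming labels in {+1, -1}, threshold stumps on single features as the weak classifiers, and the standard exponential weight update of discrete AdaBoost; the thesis may use a different but equivalent update rule, and all function names here are illustrative.

```python
import math


def best_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] >= thr else -pol for row in X]
                # Weighted error: sum of the weights of misclassified samples.
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def adaboost(X, y, rounds):
    """Train a strong classifier as a weighted list of stumps."""
    n = len(X)
    w = [1.0 / n] * n  # every sample starts with the same weight
    ensemble = []
    for _ in range(rounds):
        err, j, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, thr, pol))
        # Lower the weights of correct samples, raise those of mistakes.
        for i, row in enumerate(X):
            pred = pol if row[j] >= thr else -pol
            w[i] *= math.exp(-alpha * y[i] * pred)
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the weighted vote of the selected weak classifiers."""
    score = sum(a * (pol if row[j] >= thr else -pol)
                for a, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# One feature, labels that no single stump can separate.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, row) for row in X]
```

On this toy set no single stump classifies every sample correctly, but the weighted vote of three stumps does, which illustrates how reweighting steers later rounds toward the samples that earlier rounds got wrong.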
Keywords/Search Tags: video surveillance, motion detection, human body detection