
Description Of Video Content Based On Key Frame

Posted on: 2014-02-06    Degree: Master    Type: Thesis
Country: China    Candidate: L K Zhang    Full Text: PDF
GTID: 2248330398461460    Subject: Communication and Information System
Abstract/Summary:
With the rapid development of multimedia technology, the description of video content has become a hot research topic. The application of the Human Visual System (HVS) to describing video content attracts growing research interest and is widely used in video retrieval, intelligent surveillance, video compression, video copy detection, and other fields. Intelligent surveillance systems in particular have an urgent need for video content description, especially for event detection in surveillance video. How to accurately describe the events in surveillance video is therefore one of the key problems in these related fields.

In this paper, we propose a key-frame-based method for describing video content, which uses a visual attention shift mechanism built on a spatial-temporal visual attention model consistent with the HVS, and we apply it to event detection in surveillance video. We also study face detection and face tracking. We first review the state of the art in video content description and introduce the theory of visual attention models. We then present the proposed spatial-temporal visual attention model. Based on this model, we extract key frames according to the human visual attention mechanism to describe the video content and apply them to event detection in surveillance video. Finally, we study face detection and face tracking so that faces can serve as a high-level feature to improve the spatial-temporal visual attention model in future research.

The main innovations and contributions of this paper are as follows:

(1) A new spatial-temporal visual attention model. The model adds the temporal information of video to our lab's earlier results.
The temporal and spatial visual attention models are fused with a weight determined by the temporal attention model, yielding a final spatial-temporal visual attention model consistent with the HVS.

(2) A visual attention shift-based event detection algorithm for intelligent surveillance. The temporal and spatial visual attention regions are detected to obtain the visual saliency map, and the visual attention rhythm is then derived from the saliency map over time. According to the visual attention rhythm, key frames are selected to mark the occurrence of events. At the same time, the objects likely to attract attention in the key frames are extracted and tracked in the preceding and following frames.

(3) A face detection and tracking algorithm based on AdaBoost and CAMSHIFT. The algorithm improves face tracking by using an accumulated histogram as the tracking evidence and continuously adapting the size and position of the target window. This resolves the problem that faces whose color is similar to the background color are easily lost during tracking.
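The weighted fusion in contribution (1) can be sketched as follows. This is a minimal illustration, not the thesis's actual model: the sigmoid weighting, the gain `k`, and the use of a global motion-energy score as the temporal attention signal are all assumptions of this sketch.

```python
import numpy as np

def fuse_saliency(spatial_map, temporal_map, motion_energy, k=8.0):
    """Fuse spatial and temporal saliency maps into one map.

    The fusion weight is driven by the temporal attention model,
    approximated here by a global motion-energy score in [0, 1]
    (a hypothetical stand-in for the thesis's temporal model).
    """
    # More motion -> more weight on the temporal map (sigmoid squashing).
    w = 1.0 / (1.0 + np.exp(-k * (motion_energy - 0.5)))
    fused = w * temporal_map + (1.0 - w) * spatial_map
    # Normalize to [0, 1] so saliency is comparable across frames.
    rng = fused.max() - fused.min()
    return fused if rng == 0 else (fused - fused.min()) / rng
```

With low motion energy the spatial map dominates; with high motion energy the temporal map dominates, which matches the idea of letting the temporal attention model set the fusion weight.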
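The key-frame selection in contribution (2) can be sketched as: reduce each saliency map to a per-frame attention level (the "attention rhythm"), then take frames at rhythm peaks. The mean-plus-one-standard-deviation threshold and the local-maximum rule are assumptions of this sketch, not details given in the abstract.

```python
import numpy as np

def attention_rhythm(saliency_maps):
    """Per-frame global attention level: mean saliency of each frame's map."""
    return np.array([m.mean() for m in saliency_maps])

def select_key_frames(rhythm, thresh=None):
    """Pick frames where the attention rhythm peaks above a threshold."""
    if thresh is None:
        # Hypothetical threshold: one standard deviation above the mean.
        thresh = rhythm.mean() + rhythm.std()
    keys = []
    for i in range(1, len(rhythm) - 1):
        # A key frame is a local maximum of the rhythm above the threshold.
        if rhythm[i] >= thresh and rhythm[i] > rhythm[i - 1] and rhythm[i] >= rhythm[i + 1]:
            keys.append(i)
    return keys
```

The selected indices mark candidate event occurrences; the abstract's object extraction and tracking around each key frame is a separate step not covered here.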
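For contribution (3), the CAMSHIFT side of the tracker can be sketched as histogram back-projection plus an adaptive window update. The bin count, the learning rate `alpha` for the accumulated histogram, and the window-size rule below are illustrative assumptions; the AdaBoost face detector that initializes the window is omitted.

```python
import numpy as np

def accumulate_histogram(hist, hue_patch, bins=16, alpha=0.5):
    """Blend the current target patch's hue histogram into the accumulated
    histogram that serves as the tracking evidence (alpha is hypothetical)."""
    h, _ = np.histogram(hue_patch, bins=bins, range=(0, 180))
    h = h / max(h.sum(), 1)
    return (1 - alpha) * hist + alpha * h

def back_project(hue_frame, hist, bins=16):
    """Probability that each pixel belongs to the target, from its hue bin."""
    idx = np.clip((hue_frame / 180.0 * bins).astype(int), 0, bins - 1)
    return hist[idx]

def camshift_step(prob, window):
    """One CAMSHIFT-style step: move the window (x, y, w, h) to the centroid
    of the back-projection inside it and rescale it by the total mass."""
    x, y, w, h = window
    roi = prob[y:y + h, x:x + w]
    m00 = roi.sum()
    if m00 <= 0:
        return window  # no evidence in the window; keep the previous one
    ys, xs = np.mgrid[0:roi.shape[0], 0:roi.shape[1]]
    cx = x + int((xs * roi).sum() / m00)
    cy = y + int((ys * roi).sum() / m00)
    # Window size adapted from the zeroth moment (CAMSHIFT's size rule).
    s = max(2, int(2 * np.sqrt(m00)))
    nx = max(0, min(cx - s // 2, prob.shape[1] - s))
    ny = max(0, min(cy - s // 2, prob.shape[0] - s))
    return (nx, ny, min(s, prob.shape[1]), min(s, prob.shape[0]))
```

Because the histogram is accumulated over time rather than taken from a single frame, brief frames where the face color resembles the background dilute the evidence less, which is the intuition behind the tracking improvement described above.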
Keywords/Search Tags: spatial-temporal visual attention model, visual attention shift, key frame, event detection, face detection and tracking