Research On Short Video Annotation Based On Shot And Scene Context

Posted on:2017-04-21

Degree:Doctor

Type:Dissertation

Country:China

Candidate:T L Peng

GTID:1108330488492577

Subject:Digital media technology and applications

Abstract/Summary:

With the rapid development of digital media, communication, and network technologies, the quantity of multimedia data, including video data, is growing by leaps and bounds. Short video is a complex kind of video data, and finding useful information in massive video collections has long been a problem for users. To address it, applications such as video indexing and video retrieval systems have been proposed. Video annotation is the key step in building these applications, and it has become a hot topic in digital media applications and computer vision.

From the perspective of semantic analysis, a video can be divided into several semantic units. Different semantic units carry different semantic meanings, and semantic annotation can be performed on each unit. Based on an in-depth analysis of video structure and video segmentation, this work derives different semantic units and performs video annotation on each of them. The research work can be summarized as follows:

(1) Using both the global and the local features of video frames, a new shot detection method that combines dynamic video texture with SIFT features is proposed. First, two adjacent frames are divided into uniform blocks, and the average gradient of each image block is calculated in the RGB color space. The average gradients of all image blocks form the dynamic texture of the video. The dynamic textures of adjacent frames are then compared, and the algorithm determines whether a shot change has occurred by combining this comparison with SIFT feature matching between the adjacent frames.

(2) A video semantic annotation model based on shot events is proposed. Based on an analysis of the video structure, the algorithm extracts the moving objects in the scene and the background color features of the key frames to represent a shot event, and extends the same approach to scene representation. The theme of a video clip is then expressed by the set of shot events, combined with the contextual shot movements and surrounding backgrounds. This model effectively represents the semantic meaning of shots and improves the accuracy of video semantics.

(3) A new video annotation method based on semi-supervised clustering is proposed. From an event-driven perspective, the shot event is treated as the semantic unit, and groups of events are used to annotate micro videos. A novel annotation algorithm based on semi-supervised K-means clustering is then proposed. The objective function of the clustering algorithm is optimized to achieve low coupling between clusters and high cohesion within clusters while reflecting the local data distribution density within each cluster. The proposed clustering method can also cluster heterogeneous data such as video, and it improves the accuracy of video annotation.

(4) A new video classification method based on contextual multi-kernel learning is proposed. Building on the traditional bag-of-words model and exploiting the spatial and semantic similarity between the key frames of adjacent shots, this work presents a new video scene classification model. First, it divides video clips into shots, extracts their key frames, and normalizes the key frames. The key frames are then arranged as image blocks into a single image in temporal order, from which SIFT features and HSV features are extracted and embedded into a Hilbert space. Through multi-kernel learning, the algorithm selects appropriate kernel functions for training on each image and finally obtains a classification model with good classification performance.

All of these techniques can be widely used in video classification, video indexing, video retrieval, video understanding, and video management. This wide range of applications makes the research significant and valuable.
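The block-gradient comparison described in point (1) can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes single-channel frames given as nested lists (the abstract works in RGB), uses a forward-difference gradient, and omits the SIFT matching step entirely. The function names, the block size, and the threshold are hypothetical choices made for illustration only.

```python
def block_mean_gradients(frame, block):
    """Return the mean gradient magnitude of each block x block tile.

    `frame` is a 2D list of pixel intensities (assumed single channel).
    """
    if block < 2:
        raise ValueError("block must be at least 2 to compute a gradient")
    rows, cols = len(frame), len(frame[0])
    texture = []
    for top in range(0, rows - block + 1, block):
        for left in range(0, cols - block + 1, block):
            total, count = 0.0, 0
            # Forward differences inside the tile (last row/column skipped).
            for r in range(top, top + block - 1):
                for c in range(left, left + block - 1):
                    dx = frame[r][c + 1] - frame[r][c]
                    dy = frame[r + 1][c] - frame[r][c]
                    total += (dx * dx + dy * dy) ** 0.5
                    count += 1
            texture.append(total / count)
    return texture


def texture_distance(tex_a, tex_b):
    """Mean absolute difference between two per-block texture vectors."""
    return sum(abs(a - b) for a, b in zip(tex_a, tex_b)) / len(tex_a)


def candidate_cuts(frames, block=4, threshold=10.0):
    """Indices i where the texture changes sharply between frame i-1 and i.

    The threshold is an arbitrary illustrative value, not one from the source.
    """
    textures = [block_mean_gradients(f, block) for f in frames]
    return [
        i + 1
        for i in range(len(textures) - 1)
        if texture_distance(textures[i], textures[i + 1]) > threshold
    ]
```

In the method summarized above, a candidate found this way would still be checked against SIFT feature matching between the two frames before a shot change is declared.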

Keywords/Search Tags:

Dynamic Texture, SIFT Feature, Shot Detection, Video Event, Video Annotation, Context, Semi-supervised Learning, Multi-kernel Learning

Related items

1	Research On Video Annotation With Machine Learning Techniques
2	Multi-level Video Annotation And Retrieval
3	Application Of Semi-Supervised Learning Algorithm Based On Kernel Density In Video Semantic Annotation
4	Video Semantic Annotation Methods And Theoretical Research
5	Research On Semi-supervised Few-shot Learning Method Based On Ensemble Learning Strategy
6	Research On Several Issues In Video Semantic Annotation
7	Research On Semi-supervised Classification Algorithm Based On Integrated Neural Network
8	On Research And Prototype Implementation Of Video Semantic Annotation
9	Event Video Annotation By Learning From Multiple Source Domains
10	Semi-supervised Generalized Zero-shot Learning Based On Modal Fusion