
Content And Semantic Based Video Shot Classification

Posted on: 2010-06-30    Degree: Master    Type: Thesis
Country: China    Candidate: H Shen    Full Text: PDF
GTID: 2178360275970292    Subject: Communication and Information System

Abstract/Summary:
With the development of multimedia and network technologies, a growing number of video clips and online video websites have emerged. Content- and semantic-based video shot retrieval and classification has therefore become a popular research area.

A video shot is a collection of consecutive frames ordered in time, so shots can be analyzed from two perspectives: space and time. Spatial analysis can reuse existing image feature extraction techniques, whereas temporal analysis requires structural analysis of the video data, extracting key frames and obtaining the dynamic features within a shot. Combining static and dynamic features summarizes the content of a video shot. At the same time, traditional shot classification systems do not take the high-level semantic information of shots into account, which leads to a "semantic gap" between low-level visual features and high-level semantic information. Adding semantic feature analysis to the classification system is therefore essential: semantic-based shot classification can be achieved by inferring high-level semantic information from low-level visual features. In view of this, this thesis conducts research from the two perspectives above and provides solutions for the shortcomings of currently used methods.

After the visual features of video shots are extracted, mutual information theory is used to study the discriminative power of each individual visual feature. The method has a solid theoretical foundation, is independent of the type of classifier, and analyzes the discriminative power of visual features, thereby expressing the intrinsic link between shot features and shot categories. In the experiments, the error rate of an SVM classifier confirms the correctness and effectiveness of using mutual information for feature analysis. The SVM classifier is then used to analyze the complementarity or redundancy between video features and to find the optimal feature combination: for real person vs. animation, RGB advanced color moments + edge dynamics; for character vs. landscape, RGB advanced color moments + Gabor texture features + edge dynamics; for sports vs. entertainment, edge direction histograms + color dynamic features.

Finally, high-level semantic feature extraction and analysis is applied to the classification of ball game video shots. The proportion of key frames in a shot and the proportion of court pixels in those key frames are calculated to divide shots into on-court and off-court shots. The proportion of court pixels is further used to classify on-court shots into long-range and close-range shots, and the proportion of edge-region pixels is used to divide off-court shots into coach shots and audience shots, forming a multi-level video shot classification system for ball games.
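To make the feature-analysis step concrete, the following is a minimal sketch (not the thesis's actual code) of how the mutual information between each low-level visual feature and the shot category labels could be estimated, and how an SVM's cross-validated error rate could then be used to check a candidate feature combination. It assumes the features have already been extracted into a NumPy matrix; `mutual_info_classif` and `SVC` from scikit-learn are one possible implementation choice, not necessarily the tools used in the thesis.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif  # MI estimator (assumed tooling)
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def rank_features_by_mutual_information(X, y, feature_names):
    """Estimate I(feature; category) for every visual feature and rank them.

    X : (n_shots, n_features) matrix of low-level visual features per shot
    y : (n_shots,) array of shot category labels (e.g. real person vs. animation)
    """
    mi = mutual_info_classif(X, y, random_state=0)
    order = np.argsort(mi)[::-1]  # most discriminative feature first
    return [(feature_names[i], float(mi[i])) for i in order]


def svm_error_rate(X_subset, y, cv=5):
    """Cross-validated error rate of an SVM trained on one feature combination.

    A lower error rate for a high-MI combination supports the claim that
    mutual information reflects the discriminative power of the features.
    """
    accuracy = cross_val_score(SVC(kernel="rbf"), X_subset, y, cv=cv).mean()
    return 1.0 - accuracy
```

A feature combination such as "RGB advanced color moments + edge dynamics" would then correspond to selecting the matching columns of `X` (e.g. `svm_error_rate(X[:, combo_columns], y)`) and comparing error rates across candidate combinations.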
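The multi-level ball game classification can likewise be sketched as a small rule hierarchy. The threshold values below, and the direction of the edge-pixel comparison for coach vs. audience shots, are illustrative assumptions rather than parameters taken from the thesis; the inputs are assumed to be ratios already computed from the key frames.

```python
def classify_ball_game_shot(court_frame_ratio, court_pixel_ratio, edge_region_ratio,
                            on_court_threshold=0.5,
                            long_range_threshold=0.6,
                            audience_edge_threshold=0.3):
    """Multi-level rule hierarchy for ball game shots (illustrative thresholds).

    court_frame_ratio : fraction of a shot's key frames in which the court is visible
    court_pixel_ratio : average fraction of court pixels in those key frames
    edge_region_ratio : fraction of edge-region pixels in the key frames
    """
    # Level 1: on-court vs. off-court, from how much of the shot shows the court.
    if court_frame_ratio >= on_court_threshold:
        # Level 2a: a large court area usually means the camera is far away.
        return "long-range" if court_pixel_ratio >= long_range_threshold else "close-range"
    # Level 2b: off-court shots; crowded audience views are assumed to contain
    # more edge-region pixels than a coach close-up.
    return "audience" if edge_region_ratio >= audience_edge_threshold else "coach"
```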
Keywords/Search Tags: low-level visual feature, feature analysis, high-level semantic feature, shot classification