Font Size: a A A

Structured Sparse Coding Approach For Video/Image Collection Summarization

Posted on:2016-03-03Degree:MasterType:Thesis
Country:ChinaCandidate:Y H LiuFull Text:PDF
GTID:2308330503456363Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Summary of video or image collection is very important to enable browsing and organization and also plays an important role in robotics. Althoughsummary technology based on key-frame selection has been extensively studied during the past two decades, many previous methods mainly focused on the structured videos. The key-frame selection for unstructured video is still a challenging problem. On the other hand, summary technology based topical objects discovery approach only obtains a very coarse discovery result. In this paper, based on sparse coding and dictionary learning, for the problem of video or image collection summary, we do research on exemplar extraction, key-frame selection and topical object discovery. Our contribution is as follows:(1) To solve the basic exemplar extraction problem, we propose a structured sparse coding model to simultaneously characterize the representativeness, diversity, and robustness. The 1, 2L norm is introduced to isolate the outlier and improve the robustness, and the inhabitation term is introduced to prevent some samples to be simultaneously selected. To solve the optimization problem, we adopt the alternating directional method of multiplier technology to design an iterative algorithm. The validations on various data sets show that the proposed model obtains promising results.(2) To solve the summary problem of unstructured video, we also propose a structured sparse coding model. In this model, a mutual inhabitation penalty term constructed by using simple temporal redundancy is imposed to prevent similar samples from being selected simultaneously. The constructured objective function is nonconvex and an iterative alg orithm is developed to solve the optimization problem. The performance is evaluated using various video clips from You Tube and a practical video captured by an indoor mobile robot. The results clearly indicate that the proposed strategy helps the optimizat ion model to achieve more diversified key frames than the other existing work method.(3) To solve the topical objects discovery problem, we adopt the most recently developed objectness method to extract candidate objects. To select the topical objects from the objectness bounding boxes, we develop a dictionary learning model which incorporates the sparsity constraint. Further, we design a globally convergent iterative algorithm to solve the dictionary learning problem. Finally, we evaluate the proposed method and some existing methods such as LDA and NMF and design a new objective evaluation metric to compare different methods.
Keywords/Search Tags:exemplar extraction, key-frame selection, topical object discovery, video, imagecollection
PDF Full Text Request
Related items