Slide Transition Detection In Lecture Videos

Posted on: 2020-09-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Liu
GTID: 2428330599464961
Subject: Signal and Information Processing

Abstract/Summary:
With the development and popularization of science and technology, ways of learning are becoming more and more diverse, and e-learning has become an important means of acquiring knowledge. Lecture videos are among the most important learning materials, but they are unstructured. Users who want to find a specific piece of knowledge usually have to browse the entire video, which is time-consuming and inefficient. It is therefore essential to extract a representative summary of a lecture video automatically. A large proportion of currently recorded lecture videos include slide presentations, and for these videos slide transition detection is essential: because the moment a slide switches marks a change in the lecture content, the video can be segmented at those moments so that each segment corresponds to a different slide. This thesis focuses on slide transition detection in lecture videos. Given a lecture video in which multiple cameras record the digital slides, the speaker, and the audience, our goal is to find the keyframes where the slide content changes. The main work of this thesis includes:

(1) Constructing a sparse time-varying graph to detect slide transitions in lecture videos. First, each lecture video is temporally down-sampled to 1 fps, and frame feature descriptors are extracted and matched to divide the video into several segments. Then, by constructing a sparse graph at each moment with short video segments as nodes, we formulate the detection problem as a graph inference problem. Finally, a set of sparse, time-varying adjacency matrices is solved through a global optimization algorithm. The changes between successive adjacency matrices then reflect the slide transitions.

(2) Detecting slide transitions in lecture videos by introducing spatio-temporal residual networks. The Convolutional Neural Network (CNN) is a powerful model for image feature extraction, but it captures only the spatial dimensions of video frames, whereas the temporal dependency among video frames is important for detecting slide changes. The 3D Convolutional Network (3D ConvNet) has been regarded as an efficient approach to learning spatio-temporal features in videos, but it takes a long time to train and needs a lot of memory. To optimize the training process, we add the Residual Network (ResNet) to the 3D ConvNet to save training time and improve detection accuracy. Consequently, we present a novel ConvNet architecture based on 3D ConvNet and ResNet for slide transition detection in lecture videos.
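As a rough illustration of the first stage of method (1), the sketch below down-samples a frame sequence to about 1 fps and splits it into segments wherever consecutive frame descriptors stop matching. It is not the thesis's implementation. The function names `downsample` and `split_segments`, the use of cosine similarity as the matching rule, the 0.9 threshold, and the toy two-dimensional descriptors are all assumptions made for illustration; the abstract does not say which descriptor or matching criterion is used.

```python
from math import sqrt


def downsample(frames, fps, target_fps=1):
    """Keep roughly target_fps frames per second by taking every step-th frame."""
    step = max(1, round(fps / target_fps))
    return frames[::step]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_segments(descriptors, threshold=0.9):
    """Group consecutive frames whose descriptors match.

    Returns a list of (start, end) index pairs, both inclusive. A new segment
    starts whenever the similarity to the previous frame drops below the
    (assumed) threshold.
    """
    if not descriptors:
        return []
    segments = []
    start = 0
    for i in range(1, len(descriptors)):
        if cosine(descriptors[i - 1], descriptors[i]) < threshold:
            segments.append((start, i - 1))
            start = i
    segments.append((start, len(descriptors) - 1))
    return segments


# Toy example: an 8-frame clip at 2 fps whose (hypothetical) descriptor
# changes halfway through, as it would at a slide change.
toy_frames = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
sampled = downsample(toy_frames, fps=2)   # 4 frames remain
print(split_segments(sampled))            # [(0, 1), (2, 3)]
```

In the approach described above, segments like these would become the nodes of the sparse graph built at each moment; the graph inference and global optimization steps are not sketched here.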
Keywords/Search Tags: Lecture video, Slide transition, Feature descriptor, Sparse time-varying graphs, Deep learning