Research And Implementation Of Video Action Search System Based On Temporal Action Detection

Posted on: 2021-03-01  Degree: Master  Type: Thesis
Country: China  Candidate: H F Shi
GTID: 2428330632462643  Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet multimedia technology and the rise of short-video applications, tens of thousands of videos are shared on the Internet every second, and people's demands for video processing are becoming more diverse. Beyond traditional video classification, people increasingly care about how to find the action segments they are interested in among hundreds of millions of videos. However, video data is large in volume, varied in format and rich in content, and traditional manual annotation costs a great deal of labor and time. How to detect action segments in videos efficiently and intelligently has long been a difficult problem in both academia and industry. Against this background, and building on research into deep neural networks, object detection algorithms and video temporal action detection algorithms, this paper proposes two video temporal action detection algorithms based on deep convolutional neural networks. Neural networks are used to recognize and detect actions in videos, and a video action search system is implemented on top of the proposed temporal action detection algorithm. The main research contents and results of the paper are as follows:

(1) A temporal action detection network (DCL) based on decoupling the classification task from the localization task is proposed. The DCL network abstracts the localization and classification tasks of temporal action detection into an action-sensitive module and a classification-sensitive module, respectively. The action-sensitive module improves the accuracy of action localization by learning action boundary offset information, and the classification-sensitive module improves the accuracy of action classification by learning video semantic information.

(2) The Multi-Scale Cascade network is proposed to accurately detect temporal action segments in videos. The multi-scale regression units designed in the network guide training by setting a corresponding learning task for video segment features at each scale, which accelerates the convergence of the network. What each level of regression unit learns serves as reliable prior knowledge for the regression unit at the next level, which greatly improves the accuracy of action boundary localization.

(3) A video action segment search system is designed and implemented based on the Multi-Scale Cascade network proposed in the paper. Given uploaded video files and action keywords, the system returns video clips of all matching action segments in the video. The system also implements a keyword search algorithm based on word semantics and a keyframe selection algorithm based on action semantics. After rigorous system testing and verification, the system can search users' videos for action segments promptly and accurately.
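The search step described in (3) can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes a detector has already produced labeled segments with start and end times and a confidence score. The names `Segment`, `temporal_iou` and `search_segments`, and both thresholds, are hypothetical. Plain case-insensitive label matching stands in for the semantic keyword search mentioned in the abstract, and greedy temporal non-maximum suppression (a common post-processing step in temporal action detection) is used to drop near-duplicate segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A detected action segment (hypothetical structure, times in seconds)."""
    label: str
    start: float
    end: float
    score: float


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - inter
    return inter / union if union > 0 else 0.0


def search_segments(detections, keyword, min_score=0.5, iou_threshold=0.5):
    """Return segments whose label matches the keyword, in time order.

    Assumption: exact case-insensitive matching replaces the semantic
    keyword matching described in the abstract.
    """
    wanted = keyword.strip().lower()
    candidates = sorted(
        (d for d in detections
         if d.label.lower() == wanted and d.score >= min_score),
        key=lambda d: d.score,
        reverse=True,
    )
    kept = []
    # Greedy temporal NMS: keep a candidate only if it does not overlap
    # too much with a higher-scoring segment that was already kept.
    for cand in candidates:
        if all(temporal_iou(cand, k) < iou_threshold for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda d: d.start)


detections = [
    Segment("high jump", 2.0, 6.0, 0.9),
    Segment("high jump", 2.5, 6.5, 0.7),   # near-duplicate of the first
    Segment("high jump", 20.0, 24.0, 0.8),
    Segment("diving", 10.0, 14.0, 0.95),   # different action
    Segment("high jump", 30.0, 33.0, 0.3), # below the score threshold
]
results = search_segments(detections, "High Jump")
for seg in results:
    print(f"{seg.label}: {seg.start:.1f}s to {seg.end:.1f}s (score {seg.score})")
```

In a full system, the returned start and end times would be used to cut clips from the uploaded video. That step is omitted here because the abstract does not describe how it is done.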
Keywords/Search Tags: deep learning, object detection, temporal action detection, action segment search