Video Cover Extraction Algorithm Based On Deep Learning

Posted on: 2020-03-22  Degree: Master  Type: Thesis
Country: China  Candidate: R Z Li  Full Text: PDF
GTID: 2428330572473672  Subject: Information and Communication Engineering
Abstract/Summary:
The continuous increase in information transmission speed and the popularity of photographic equipment have fueled people's enthusiasm for creating and sharing videos. However, only a few video creators have the time and ability to edit video covers, so a large number of videos on websites lack high-quality covers, which reduces the efficiency of video sharing and retrieval. To address these problems, this thesis models the visual aesthetics and content representativeness of video frames to build a general video cover extraction algorithm. Then, for human-themed videos, a face recognition model is built to obtain the facial semantic information in the video frames. This information is embedded into the general algorithm to ensure that the extracted cover contains the main characters of the video, which further improves the quality of video cover extraction. The research achieves the following results:

First, we propose an improved DenseNet with a feature recalibration mechanism. The mechanism models the channel correlation between the features in DenseNet, enhancing useful features and suppressing useless ones, thereby improving the model's ability to extract features from video frames and providing the foundation for aesthetic evaluation and representativeness evaluation. Comparison with similar models such as MNA-CNN and ILGNet on the same test set verifies the superiority of the model.

Second, we replace the global pooling layer at the end of the ShuffleNet network with a depthwise separable convolution layer to learn the relative position distribution of the facial key points generated by face alignment. Face alignment maps the facial key points in the image to a specified area to achieve a stable distribution of the main facial features. This thesis uses a depthwise separable convolution layer to capture this distribution, which improves the accuracy of the face recognition task compared with the global pooling layer of the original network. The 20 MB model achieves 96.37% face verification TAR (FAR = 1e-6) on the MegaFace Challenge, exceeding MobiFace, FaceNet and other similar models, and provides more accurate facial semantic information to the cover extraction algorithm.

Third, a multi-dimensional feature video cover extraction method is proposed, which combines the aesthetic features, representative features and face features of frames. The optimal video frame is selected as the cover according to these multi-dimensional features. Experiments on multiple video datasets demonstrate the effectiveness of the algorithm.
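
The frame selection step in the third result can be illustrated with a minimal sketch. The abstract does not say how the three kinds of features are combined, so the weighted sum below, the weight values, and the names `FrameScores` and `select_cover` are all assumptions made for illustration, not the thesis's method. Each score is assumed to be already normalized to [0, 1] by the upstream models.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class FrameScores:
    """Hypothetical per-frame scores, each assumed to lie in [0, 1]."""
    aesthetic: float        # visual quality of the frame
    representative: float   # how well the frame reflects the video content
    face: float             # confidence that a main character is present


def select_cover(frames: Sequence[FrameScores],
                 weights: Tuple[float, float, float] = (0.4, 0.4, 0.2)) -> int:
    """Return the index of the frame with the highest weighted score.

    The weighted sum and the default weights are illustrative assumptions;
    the abstract only states that the three feature types are combined.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    w_a, w_r, w_f = weights

    def combined(s: FrameScores) -> float:
        return w_a * s.aesthetic + w_r * s.representative + w_f * s.face

    # On ties, max() keeps the earliest frame.
    return max(range(len(frames)), key=lambda i: combined(frames[i]))


# Made-up scores for three candidate frames.
frames = [
    FrameScores(aesthetic=0.9, representative=0.3, face=0.0),
    FrameScores(aesthetic=0.7, representative=0.8, face=0.9),
    FrameScores(aesthetic=0.5, representative=0.9, face=0.2),
]
best = select_cover(frames)
```

With these made-up scores the second frame is chosen, because it is reasonably strong on all three dimensions, while the first frame wins only when aesthetics alone is weighted.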
Keywords/Search Tags: Video Cover Extraction, Neural Network, Feature Recalibration, Face Recognition