Research On Compressed Domain Video Summarization Technology Based On HEVC

Posted on: 2020-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: S M Zhu
GTID: 2428330572473515
Subject: Engineering
Abstract/Summary:
The development of the mobile Internet and intelligent multimedia technology has made digital video an important way for people to record their lives and obtain information. However, the rapid growth of video data puts tremendous pressure on video storage and transmission. A way to relieve this pressure is urgently needed, and video summarization technology meets this need. Current video summarization methods are mainly based on time domain algorithms and generally select features such as the color and texture of video frames for key frame extraction. In the compressed domain, some scholars use bitstream information from the compressed video stream to extract the key frames of the video summary. However, this type of method still relies on techniques from the time domain algorithms and does not achieve true compressed domain feature extraction for video summarization. In video summarization, the selection of video feature points and the extraction of key frames are the most critical steps in determining the quality of the final summary. The thesis therefore focuses on these two key issues in the compressed domain for video encoded with HEVC (High Efficiency Video Coding). It proposes a compressed domain key frame extraction method based on improved clustering and a compressed domain video summary generation algorithm based on intra-frame multi-mode features.

Firstly, the HEVC intra prediction mode is found to be closely related to image texture, and a key frame extraction method based on improved clustering is proposed for compressed domain video summarization. The method first obtains the counts of the intra luminance prediction modes at the HM (HEVC Model) decoder and assembles them into a mode feature vector that serves as the texture feature of the image. ISODATA (Iterative Self-Organizing Data Analysis Technique Algorithm) is then used to overcome the inability of the K-means algorithm to determine the number of clusters by itself, so that the K value is obtained adaptively. Finally, the improved K-means clustering algorithm clusters the mode feature vectors and selects the frame corresponding to the intermediate vector of each cluster as a candidate frame. The candidate frames are then re-screened by similarity, and redundant frames are eliminated to further improve the quality of the video summary. The experimental results show that the proposed algorithm extracts key frames from the Open Video Project dataset with 71.5% precision, 91.2% recall and an F-score of 80.2%. The generated video summary expresses the original video content well and conforms to users' viewing habits.

Secondly, to address the problem that incomplete selection of video feature points degrades the quality of the final video summary, the thesis proposes a compressed domain video summary generation method that combines multiple mode features. The algorithm is based on the luminance and chrominance prediction mode information of I frames in H.265/HEVC coded video, and it detects the similarity of video frames through a normalized histogram of the multiple mode information. Key frame extraction and summary generation are built on this similarity detection. First, taking the 4×4 PU block as the base size, corresponding weights are assigned to PU blocks of different sizes to eliminate the influence of block size on feature selection. The weighted luminance mode and chrominance mode statistics are then normalized. Because human vision is more sensitive to the luminance component than to the chrominance components, the normalized luminance and chrominance mode histograms are assigned corresponding coefficients to balance the contributions of luminance and chrominance. Finally, the chrominance and luminance mode features are merged to form the final mode feature vector, and the mode feature histogram model is constructed.

For video summary generation, two methods based on the established mode feature histogram model are proposed to detect the similarity of video frames and generate video summaries.
(1) A key frame extraction method based on histogram intersection. The method first uses the intersection distance between the mode feature histograms of two adjacent frames to determine their similarity, then classifies each video frame, and finally selects the first frame of each class as a key frame to generate the video summary.
(2) A key frame extraction method based on histogram difference. The method first uses the proportion of the difference in each corresponding interval of the feature histograms of two adjacent frames to determine the similarity of that interval, and counts the number of intervals with low similarity. This count is then divided by the total number of intervals per frame, and the resulting ratio is used to decide whether the two frames are similar. Finally, the video frames are classified, the first frame of each class is selected as a key frame, and the video summary is generated in order of video frame number.

Finally, the thesis determines the optimal values of the related algorithm parameters through a large number of experiments and applies the proposed algorithms to videos of different subject types. The experimental results show that, on the Open Video Project dataset, method (1) extracts key frames with an accuracy rate of 79.2% and an error rate of 0.65; its precision is 63.4%, its recall is 79.2%, and its F-score is 71.4%. On the same dataset, method (2) extracts key frames with an accuracy rate of 82.0% and an error rate of 0.64; its precision is 65.0%, its recall is 82.0%, and its F-score is 73.3%. Compared with the time domain algorithms, the proposed compressed domain algorithm does not require parameter decoding, so its running time is very short, and its key frame extraction accuracy is the highest.
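
The mode feature histogram and the histogram intersection method (1) can be illustrated with a minimal Python sketch. It is not the thesis implementation. Several details are assumptions made only for illustration: each PU is weighted by the number of 4×4 units it covers, both histograms use 35 bins (the number of HEVC intra prediction modes), the luminance coefficient is 0.7, and the similarity threshold is 0.6. The function names and the (mode, PU size) input format are hypothetical; in practice the mode data would come from the HM decoder.

```python
NUM_MODES = 35  # HEVC intra prediction: planar, DC and 33 angular modes


def weighted_mode_histogram(pus, num_modes=NUM_MODES):
    """Normalized mode histogram from (mode, pu_size) pairs.

    Assumed weighting rule: a PU counts as the number of 4x4 units it covers.
    """
    hist = [0.0] * num_modes
    for mode, size in pus:
        hist[mode] += (size // 4) ** 2
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def frame_feature(luma_pus, chroma_pus, luma_coeff=0.7):
    """Concatenate luma and chroma histograms, scaled by assumed coefficients."""
    luma = weighted_mode_histogram(luma_pus)
    chroma = weighted_mode_histogram(chroma_pus)
    return ([luma_coeff * v for v in luma]
            + [(1.0 - luma_coeff) * v for v in chroma])


def histogram_intersection(a, b):
    """Sum of bin-wise minima; close to 1.0 for identical fused features."""
    return sum(min(x, y) for x, y in zip(a, b))


def select_key_frames(features, threshold=0.6):
    """Start a new class when adjacent frames are dissimilar.

    Returns the index of the first frame of each class, in frame order.
    """
    if not features:
        return []
    key_frames = [0]
    for i in range(1, len(features)):
        if histogram_intersection(features[i - 1], features[i]) < threshold:
            key_frames.append(i)
    return key_frames


# Toy input: two I frames from one scene followed by two from another.
scene_a = frame_feature([(0, 16), (26, 4)], [(0, 8)])
scene_b = frame_feature([(10, 8), (26, 8)], [(1, 8)])
print(select_key_frames([scene_a, scene_a, scene_b, scene_b]))  # [0, 2]
```

In this toy input the only shared bin between the two scenes is luminance mode 26, so the intersection across the scene change is far below the threshold and frame 2 opens a new class. The threshold and coefficient would need to be tuned experimentally, as the abstract notes for the thesis parameters.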
Keywords/Search Tags: Video summarization, HEVC, Compressed domain, Feature selection, Key frame extraction, Clustering, Multi-mode feature, Histogram model