Research On Video Action Recognition Based On Compressed Domain

Posted on: 2022-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: K H Jiang
Full Text: PDF
GTID: 2518306512452204
Subject: Communication and Information System
Abstract/Summary:
Video action recognition is an active research topic in artificial intelligence. Its goal is to analyze human actions in video and classify them correctly, and it is widely used in security monitoring and other fields. Depending on the type of input video, action recognition algorithms can be divided into compressed-domain and pixel-domain (uncompressed-domain) algorithms. Compressed-domain algorithms generally require less computation than pixel-domain algorithms, because video coding removes temporal and spatial redundancy, which makes it easier for the network to extract information related to human motion. Compared with pixel-domain algorithms, however, compressed-domain algorithms have less information to work with and achieve lower recognition accuracy. In existing compressed-domain algorithms, replacing optical flow with motion vectors reduces the computational complexity of the model, but motion vectors represent motion less well than optical flow does. In addition, existing compressed-domain algorithms built on 2D networks have limited ability to capture the temporal information of motion, which also lowers recognition accuracy. To address these shortcomings, this thesis analyzes existing compressed-domain action recognition algorithms and proposes an improved algorithm.

1. To address the noise and low resolution of motion vectors and residuals in the compressed domain, this thesis analyzes the advantages of compressed video for the action recognition task and designs a fused representation based on compressed-domain motion vectors and residuals. The fused representation reduces motion vector noise, localizes the moving target more accurately, makes the network focus more on the moving target region, and enriches the diversity of the network input. The model uses the temporal continuity and spatial compactness of video frames to remove interference noise (such as background and isolated values) from the motion vectors and residuals. The motion vectors and residuals are then fused by channel stacking to strengthen the ability of the fused compressed-domain information to represent human action. Finally, ablation experiments are carried out and the results are analyzed. The experiments show that, compared with the compressed-domain action recognition algorithm CoViAR, the proposed algorithm achieves higher recognition accuracy at the same computational cost, which demonstrates the effectiveness of the fused information for action recognition.

2. Existing compressed-domain action recognition models have limited ability to capture the temporal information of motion, which lowers recognition accuracy. Building on the efficient convolutional network ECO, this thesis designs a dual-stream video action recognition model based on compressed-domain information. The model takes I-frames and the fused compressed-domain information as network input in place of the RGB frames and optical flow used in the pixel domain. Because optical flow does not need to be pre-computed, the overall computational cost of the model is reduced, while the multi-dimensional input improves recognition performance. Experimental results show that the recognition accuracy of the proposed algorithm is higher than that of DMC-Net and MFCD-Net, and that its computational cost is much lower than that of the I3D algorithm, which further confirms the effectiveness of the proposed algorithm.
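
The channel-stacking fusion described in point 1 can be illustrated with a minimal sketch. The abstract does not give the actual filtering rule, array layout, or thresholds, so everything below is an assumption for illustration only: the function names `suppress_isolated` and `fuse_mv_residual` are hypothetical, motion vectors are assumed to be an (H, W, 2) array and residuals an (H, W, 3) array, and a simple 8-neighbour count stands in for the "spatial compactness" check (the temporal-continuity check is omitted).

```python
import numpy as np


def suppress_isolated(mv, min_neighbors=2):
    """Zero out motion vectors that have too few moving 8-neighbours.

    Hypothetical stand-in for the spatial-compactness filter: a vector is
    kept only if at least `min_neighbors` adjacent positions also move.
    """
    moving = np.abs(mv).sum(axis=-1) > 0           # (H, W) mask of non-zero vectors
    padded = np.pad(moving.astype(np.int32), 1)    # zero border so edges have 8 neighbours
    h, w = moving.shape
    # Count moving positions in the 3x3 window, excluding the centre.
    count = sum(
        padded[dy:dy + h, dx:dx + w]
        for dy in range(3)
        for dx in range(3)
        if (dy, dx) != (1, 1)
    )
    keep = moving & (count >= min_neighbors)
    return mv * keep[..., None]


def fuse_mv_residual(mv, residual):
    """Stack cleaned motion vectors (2 channels) and residuals (3 channels)."""
    cleaned = suppress_isolated(mv).astype(np.float32)
    return np.concatenate([cleaned, residual.astype(np.float32)], axis=-1)


# Toy frame: a 2x2 moving block plus one isolated noisy vector in a corner.
mv = np.zeros((6, 6, 2), dtype=np.float32)
mv[2:4, 2:4] = (1.0, -1.0)
mv[0, 5] = (3.0, 0.0)
residual = np.ones((6, 6, 3), dtype=np.float32)

fused = fuse_mv_residual(mv, residual)
print(fused.shape)      # five channels: 2 motion-vector + 3 residual
print(fused[0, 5, :2])  # the isolated vector has been suppressed
```

In this sketch the fused array would be fed to the network in place of an optical-flow input; the thesis's real filter and channel layout may differ.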
Keywords/Search Tags: compressed domain, action recognition, convolutional neural network, dual-stream network