Tracking Algorithm Research Based On Deep Learning And HEVC Codec Info

Posted on: 2022-02-27 | Degree: Master | Type: Thesis
Country: China | Candidate: K Song | Full Text: PDF
GTID: 2518306524976469 | Subject: Signal and Information Processing
Abstract/Summary:
Visual target tracking is a technology that detects, extracts, recognizes, and tracks moving targets in an image sequence. Its main task is to obtain the motion parameters of a moving target, such as position, speed, acceleration, and trajectory, in order to understand the target's behavior and to support higher-level detection tasks. Target tracking algorithms are widely used in scenarios such as public security, factory production, and traffic control, and research on visual target tracking has long attracted attention in the field of computer vision. Pixel-domain algorithms have made great progress, especially in the past two years, and with the support of deep learning, algorithms for surveillance and security have been commercialized. However, these algorithms all operate directly on the images obtained by the camera. In other words, the motion information carried in the output bitstream of the video encoder is ignored. In view of this, this thesis builds on the prediction of target bounding-box intersection from target detection, modulates the tracking process by fusing video motion information obtained from the HEVC codec framework, and proposes the DH-Net network structure to study how this approach improves the accuracy and robustness of target tracking in intelligent monitoring. The research content is summarized as follows.

First, this thesis uses the HEVC encoding and decoding framework to extract the motion vectors and the macroblock segmentation modes from the video frame stream. The motion vectors describe the current motion trend of the target, and the macroblock segmentation mode of a video frame describes, to a certain extent, the texture characteristics of the moving target. These two types of information are used separately in verification experiments to demonstrate their effectiveness for target tracking.

Then, to further improve tracking based on motion information, this thesis proposes two schemes that correct the size and position of the target bounding box: an optimization scheme based on step weight differential peak detection and a regression network correction scheme. The former uses the distribution characteristics of macroblocks around the target in the video frame, and the latter uses the idea of bounding-box regression from the target detection framework R-CNN. Compared with the benchmark experiment, both schemes greatly improve tracking accuracy.

Subsequently, to fuse compressed-domain and pixel-domain information, this thesis uses the aforementioned motion information and the image RGB information to modulate the target tracking branch online through the two feature fusion branches of DH-Net. The final experimental results show that fusing pixel-domain and compressed-domain information helps improve the robustness and accuracy of tracking without adding much computation.

Finally, this thesis compares the proposed DH-Net algorithm with several other cutting-edge target tracking algorithms. On VOT2018, a common dataset in the field of visual target tracking, it outperforms the benchmark algorithm SiamFC by about 2.6% in accuracy, demonstrating good tracking performance.
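As a rough illustration of two ideas mentioned in the abstract, the Python sketch below shifts a bounding box by the motion vectors that fall inside it and then corrects the box with the standard R-CNN bounding-box regression parameterization. It is not the thesis's DH-Net or its peak-detection scheme. The function names, the `(cx, cy, w, h)` box layout, the `(block_cx, block_cy, dx, dy)` motion-vector layout, and the use of a median are all assumptions made for this example. The vectors are also assumed to have already been converted to the displacement of image content from the previous frame to the current one, which is not necessarily the sign convention of raw HEVC motion vectors.

```python
import math
from statistics import median


def shift_box_by_motion(box, motion_vectors):
    """Shift a (cx, cy, w, h) box by the median motion of the blocks inside it.

    motion_vectors is a list of (block_cx, block_cy, dx, dy) tuples in pixels.
    This layout is hypothetical; a real decoder exposes its own structure.
    """
    cx, cy, w, h = box
    inside = [
        (dx, dy)
        for bx, by, dx, dy in motion_vectors
        if abs(bx - cx) <= w / 2 and abs(by - cy) <= h / 2
    ]
    if not inside:
        # No block centre falls inside the box: keep the previous position.
        return box
    mdx = median(dx for dx, _ in inside)
    mdy = median(dy for _, dy in inside)
    return (cx + mdx, cy + mdy, w, h)


def encode_rcnn_deltas(box, target):
    """Regression targets (tx, ty, tw, th) that map box onto target (R-CNN style)."""
    px, py, pw, ph = box
    gx, gy, gw, gh = target
    return ((gx - px) / pw, (gy - py) / ph, math.log(gw / pw), math.log(gh / ph))


def apply_rcnn_deltas(box, deltas):
    """Apply predicted (tx, ty, tw, th) to a (cx, cy, w, h) box."""
    px, py, pw, ph = box
    tx, ty, tw, th = deltas
    return (px + tx * pw, py + ty * ph, pw * math.exp(tw), ph * math.exp(th))


if __name__ == "__main__":
    previous_box = (100.0, 100.0, 40.0, 20.0)
    vectors = [
        (95.0, 100.0, 4.0, -2.0),
        (105.0, 100.0, 6.0, -2.0),
        (110.0, 105.0, 5.0, -1.0),
        (300.0, 300.0, 50.0, 50.0),  # background block, outside the box
    ]
    coarse = shift_box_by_motion(previous_box, vectors)
    # In a tracker the deltas would come from a trained regression network;
    # here they are computed from a made-up ground-truth box for illustration.
    deltas = encode_rcnn_deltas(coarse, (107.0, 99.0, 44.0, 18.0))
    print(coarse, apply_rcnn_deltas(coarse, deltas))
```

The median is used here only because it is less sensitive than the mean to background blocks that overlap the box. The regression step changes both position and size, whereas the motion-vector shift alone leaves the box size unchanged.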
Keywords/Search Tags:Target tracking, surveillance security, pixel domain, compressed domain, motion information, deep learning