Feature Representation For Multi-object Tracking Based On Attention Mechanism And Feature Decoupling

Posted on:2022-06-11

Degree:Master

Type:Thesis

Country:China

Candidate:J Ma

GTID:2518306512452304

Subject:Signal and Information Processing

Abstract/Summary:

Multi-object tracking (MOT) is an important topic in computer vision; its task is to determine the motion trajectories of all instances in a video sequence. As a fundamental research problem, MOT is widely used in autonomous driving, intelligent surveillance, and human-computer interaction. In recent years, MOT has advanced considerably thanks to deep learning. However, the task remains challenging because of the variable number of tracked targets, mutual occlusion among targets, interference from complex backgrounds, and tracking drift caused by detectors.

Currently, tracking-by-detection is the mainstream MOT framework. It consists of three parts: global target detection, affinity model (also called association model) design, and inference of the association state. Global target detection finds all targets of interest in the video sequence frame by frame. The affinity model extracts features for each detection response (or trajectory) and measures the similarities between them. Based on these similarities, the inference module solves a global optimization problem for association and generates the motion trajectories of all targets of interest. Under the tracking-by-detection framework, this thesis uses deep learning to study feature representation learning in the affinity model in depth. The main contents are as follows:

(1) An affinity model based on a spatial attention mechanism. Spatial attention is an effective means of handling mutual occlusion and detector drift. This thesis studies and improves a spatial attention network based on a Siamese architecture. Specifically, to address the original network's shortcoming of ignoring the spatial structure information present in each channel, Intersection over Union (IoU) is proposed to replace weighted pooling as the feature fusion strategy. The outputs of the improved model are used to compute a similarity score for each pair of detection responses, and the Hungarian algorithm is then applied for state association, yielding the trajectories of multiple targets. Experimental results demonstrate that the proposed model improves data association accuracy and achieves better multi-target tracking performance.

(2) An affinity model based on a spatial-temporal attention mechanism. In complex scenes, it is hard to guarantee tracking performance by relying solely on spatial attention. In such cases, the dynamic information of the tracked targets in the time domain can be exploited to improve the robustness of the affinity model. This thesis proposes a spatial-temporal attention network that models the spatio-temporal relationships among detection responses. Compared with the spatial attention network of the preceding chapter, it learns more discriminative spatio-temporal features, which strengthens feature representation. Experiments on the MOT Challenge dataset show the validity of the proposed network.

(3) Feature representation learning based on decoupling foreground and background. The core of the spatial-temporal attention mechanism is to suppress interference and strengthen effective information. Along this line, and from the perspective of feature decoupling, this thesis attempts to introduce generative adversarial networks and generative representation learning into multi-object tracking. To this end, a suitable network architecture and corresponding loss functions are carefully designed so that the model decouples the foreground from the background. The foreground feature is discriminative across identities, while the background corresponds to the scene clutter other than the foreground. An autoencoder (encoder-decoder) framework and a self-attention mechanism are employed in the model. Experimental results show that the proposed method achieves comparable or superior tracking performance relative to several state-of-the-art approaches.
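
The association step mentioned in (1), where pairwise similarity scores are passed to the Hungarian algorithm, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not state how similarities are turned into costs or whether weak matches are rejected, so the cost `1 - similarity`, the `min_similarity` gate, and the names `hungarian` and `associate` are all assumptions made for illustration. The similarity matrix is assumed to be non-empty, with rows for existing tracks and columns for new detections.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def hungarian(cost: Matrix) -> List[Tuple[int, int]]:
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns (row, column) pairs; every row receives exactly one column.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)  # row potentials (1-based)
    v = [0.0] * (m + 1)  # column potentials (1-based)
    p = [0] * (m + 1)    # p[j]: row matched to column j, 0 if column is free
    way = [0] * (m + 1)  # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached a free column: augmenting path found
                break
        while j0 != 0:      # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def associate(similarity: Matrix, min_similarity: float = 0.5):
    """Match tracks (rows) to detections (columns) from affinity scores.

    Assumption: scores lie in [0, 1] and cost = 1 - similarity.
    Pairs scoring below min_similarity are rejected after the assignment.
    """
    n_tracks, n_dets = len(similarity), len(similarity[0])
    cost = [[1.0 - s for s in row] for row in similarity]
    transposed = n_tracks > n_dets
    if transposed:  # hungarian() needs rows <= columns
        cost = [list(col) for col in zip(*cost)]
    pairs = hungarian(cost)
    if transposed:
        pairs = [(c, r) for r, c in pairs]
    matches = sorted((t, d) for t, d in pairs if similarity[t][d] >= min_similarity)
    matched_tracks = {t for t, _ in matches}
    matched_dets = {d for _, d in matches}
    unmatched_tracks = [t for t in range(n_tracks) if t not in matched_tracks]
    unmatched_dets = [d for d in range(n_dets) if d not in matched_dets]
    return matches, unmatched_tracks, unmatched_dets


# Two existing tracks, three detections in the new frame.
scores = [[0.9, 0.2, 0.1],
          [0.3, 0.8, 0.2]]
print(associate(scores))  # ([(0, 0), (1, 1)], [], [2])
```

In a tracker, each matched pair would extend an existing trajectory, an unmatched detection could start a new one, and an unmatched track could be kept alive for a few frames or terminated. In practice, a library routine such as SciPy's `scipy.optimize.linear_sum_assignment` solves the same assignment problem.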

Keywords/Search Tags:

multi-object tracking, affinity model, attention mechanism, feature decoupling, generative adversarial networks

Related items

1	Research On Object Tracking Algorithms Based On Adversarial Transfer Learning
2	Research On Facial Expression Synthesis Based On Generative Adversarial Networks
3	Research On One-Stage Object Detection Algorithms Based On Generative Adversarial Network
4	Research Of The Color Image Restoration Algorithm Based On Generative Adversarial Networks
5	Research On Image Shadow Removal Based On Generative Adversarial Networks
6	Research On Text To Image Generation Algorithm Based On Attention Mechanism And Generative Adversarial Networks
7	Research On Group Behavior Prediction Method Based On Generative Adversarial Mechanism
8	Study On Super-Resolution Reconstruction Algorithms Based On Generative Adversarial Networks
9	Image Semantic Segmentation Based On Generative Adversarial Networks And Self-Attention Mechanism
10	Research On Dehazing Algorithms Based On Generative Adversarial Networks