Learning Robust Object Representation Models For Multimodal Visual Tracking

Posted on: 2020-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Liang
Full Text: PDF
GTID: 2428330575463019
Subject: Computer Science and Technology
Abstract/Summary:
Visual tracking is an important research topic in computer vision. Given the ground-truth state of the object in the initial frame, the task is to estimate the state of the object in subsequent frames. Visual tracking has made great progress in recent years, but the robustness of existing algorithms in complex scenes or under extreme conditions (e.g., haze, occlusion, low illumination) still needs to be improved. At the same time, as sensor technology matures, additional sensors are increasingly used in object tracking. A thermal infrared sensor captures the temperature information of the object, which compensates for the sensitivity of visible data to illumination conditions; conversely, visible data compensates for the loss of fine detail in thermal infrared data. This complementarity between visible and thermal infrared data can help a tracker achieve robust tracking. This thesis proposes three visual tracking models based on visible and thermal infrared data, with the aim of learning robust object feature representations and improving tracker performance. The main contributions are as follows.

First, to mitigate the impact of noise in raw data, we propose an object tracking algorithm based on a feature decomposition model. The feature decomposition model is introduced into the correlation filter (CF) framework, and robust principal component analysis (RPCA) is used to reduce the influence of noise on tracking. Correlation filter trackers generally construct the tracking model directly from raw data, but data in real scenes are often corrupted by noise. To improve the robustness of the tracking model, we therefore use the feature decomposition model to divide the data into two parts: low-rank "clean" data and sparse noise data. The decomposition model is then embedded in the tracking framework, and the "clean" data and the correlation filter are jointly optimized. Finally, the "clean" data and the correlation filter are used to update the appearance model and locate the object. Experiments on a publicly available dataset demonstrate the effectiveness of the proposed method.

Second, to suppress background interference inside the object bounding box, we propose a method based on a dynamic graph to learn a robust object feature representation. Some existing methods use a graph with a fixed structure to learn the object feature representation, but this often ignores the global relationships between patches. To take these global relationships into account, we introduce a joint sparse representation model that learns the relationships between patches adaptively. We first divide the object into non-overlapping patches and use the patches as graph nodes to represent the object. The weight of a node indicates the likelihood that the patch belongs to the foreground, and the weight of an edge represents the affinity between adjacent patches, that is, the probability that the adjacent patches belong to the same class. In addition, because the initial weights may contain noise, we propose a sparse learning method based on the ℓ1 norm to alleviate the noise in the initial seeds caused by imprecise tracking results and irregular object shapes. The graph, node weights and edge weights are then jointly optimized in a unified framework. Finally, the patch features are weighted by the patch weights to obtain a robust object feature representation, which is fed into a structured SVM framework to achieve robust object tracking. To evaluate multi-modal object tracking methods fairly, we also propose RGBT234, a more comprehensive and accurate dataset. Experimental results on RGBT234 show that the proposed method is effective.

Third, because data in real scenes are often corrupted by noise, we propose a method for learning a reliable dynamic graph from raw data, which alleviates the interference of data noise with the dynamic graph. By introducing the feature decomposition and robust principal component analysis (RPCA) models, we divide the raw data into two parts: low-rank "clean" data and sparse noise data. We then construct a dynamic graph model based on the "clean" data, and jointly optimize the "clean" data, modality weights, graph, node weights and edge weights in a unified framework. Finally, the node weights and features are fused to obtain a robust object feature representation, which is embedded in a structured SVM (S-SVM) to achieve robust object tracking. Experiments on RGBT234 demonstrate the effectiveness of the proposed method.
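
The low-rank plus sparse split that the first and third contributions rely on can be illustrated with a generic RPCA solver. The sketch below is not the thesis's joint optimization (the abstract does not give its solver, and the correlation filter and graph terms are omitted). It assumes plain principal component pursuit, solved with an inexact augmented Lagrangian scheme in NumPy; the function names `rpca_ialm`, `soft_threshold` and `singular_value_threshold`, and the defaults for `lam`, `rho` and `tol`, are illustrative choices rather than values from the thesis.

```python
import numpy as np


def soft_threshold(M, tau):
    """Elementwise shrinkage: the proximal operator of tau * l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)


def singular_value_threshold(M, tau):
    """Shrink singular values by tau: the proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt


def rpca_ialm(X, lam=None, rho=1.5, tol=1e-7, max_iter=200):
    """Split X into low-rank L and sparse S with X ~= L + S.

    Solves min ||L||_* + lam * ||S||_1 subject to X = L + S with an
    inexact augmented Lagrangian iteration (assumed solver, see lead-in).
    """
    X = np.asarray(X, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(X.shape))  # common default weight for the sparse term
    norm_x = np.linalg.norm(X)
    if norm_x == 0.0:
        return X.copy(), np.zeros_like(X)

    spectral = np.linalg.norm(X, 2)
    Y = X / max(spectral, np.abs(X).max() / lam)  # dual variable initialisation
    mu = 1.25 / spectral                          # penalty parameter, grown by rho
    S = np.zeros_like(X)

    for _ in range(max_iter):
        L = singular_value_threshold(X - S + Y / mu, 1.0 / mu)  # low-rank update
        S = soft_threshold(X - L + Y / mu, lam / mu)            # sparse update
        R = X - L - S                                           # constraint residual
        Y = Y + mu * R
        mu *= rho
        if np.linalg.norm(R) <= tol * norm_x:
            break
    return L, S
```

In a tracking setting, `X` would hold vectorised features (for example, one column per patch or per frame), `L` would play the role of the "clean" data and `S` the sparse noise; how the features are arranged is an assumption here, since the abstract does not specify it.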
Keywords/Search Tags: RGB-T Object Tracking, Feature Decomposition, Sparse Representation, Collaborative Graph Representation, Robust Principal Component Analysis