Research On Single Object Tracking Algorithm Based On Semantic Aware And Confidence Correction

Posted on: 2024-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: Q F Liang
Full Text: PDF
GTID: 2568307058972599
Subject: Computer Science and Technology
Abstract/Summary:
Visual object tracking is one of the fundamental research topics in computer vision. Given an object in the initial frame, a tracker models the object's appearance and predicts its position, scale, and motion trajectory in subsequent frames. The technology is widely applied in autonomous driving, human-computer interaction, intelligent transportation, intelligent monitoring, and military reconnaissance. With the rapid development of deep learning and big data technology, many strong results in object tracking have emerged in recent years. Among them, trackers based on Siamese networks and on attention mechanisms achieve outstanding performance and have received significant attention from researchers. Both approaches, however, have limitations. Siamese network-based trackers make little use of object semantics, which leads to problems such as drift of the tracking bounding box. Attention-based trackers, especially transformer-based ones, focus too heavily on features at the object boundary, which yields an oversized tracking box. This thesis proposes improvements that address these problems. The research achievements are as follows:

(1) An object tracking algorithm that integrates a semantic-aware network is proposed to address the loss of semantic information in Siamese network-based tracking algorithms. The semantic-aware network is integrated with a Siamese tracking network through ensemble learning. Both networks are trained simultaneously with semantic labels, and they are combined by semantic label re-detection. The re-detection algorithm records semantic information from the output of the semantic-aware network. When the object's semantic label differs between consecutive results, the re-detection algorithm determines whether the tracking bounding box has shifted, and it replaces a shifted box with the bounding box that carries the correct semantic label. When the semantic label is consistent, the re-detection algorithm rescales the tracking bounding box according to the semantic detection bounding box. The method is highly compatible and can be embedded directly in existing Siamese network-based tracking algorithms to enhance tracking performance. Experimental results on multiple large-scale tracking datasets show that the proposed algorithm significantly improves the accuracy and robustness of the baseline algorithm while retaining real-time tracking.

(2) A classification confidence score correction algorithm is proposed to address a problem in how the attention-based Transformer Tracking (TransT) algorithm obtains tracking bounding boxes. TransT uses self-attention and cross-attention to build a feature fusion network and abandons the correlation operation found in Siamese network-based trackers. However, TransT still relies on a two-stage detection method based on classification and regression to obtain tracking bounding boxes, which in complex scenes often produces overconfident classification scores. The candidate with the highest confidence score is therefore not necessarily the true best match; it may only be the best semantic interpretation. The proposed correction algorithm selects multiple high-scoring candidates from the classification branch for regression and uses the regressed bounding boxes to segment the image. Each segmented image is then mapped to log-polar coordinates and Gaussian blurred. The Euclidean distance between the transformed candidate image and the template image gives the similarity between candidate and template. Finally, with this similarity as a key factor, the classification confidence scores are corrected adaptively. Experimental results demonstrate the effectiveness and superiority of the method, which significantly enhances the tracking capability of the algorithm.
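
The confidence correction in (2) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the abstract does not give the correction formula, so the similarity 1 / (1 + distance) and the linear blend controlled by `weight` are assumptions, as are the function names, the nearest-neighbour log-polar sampling, and the 32 x 32 output grid. Candidates are assumed to be grayscale patches already cropped from the regressed bounding boxes.

```python
import numpy as np


def log_polar(img, out_shape=(32, 32)):
    """Nearest-neighbour log-polar resampling of a 2-D patch about its centre."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    n_rho, n_theta = out_shape
    # Radii are spaced logarithmically from 1 pixel out to max_r.
    rho = np.exp(np.linspace(0.0, np.log(max_r), n_rho))
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    ys = cy + rho[:, None] * np.sin(theta)[None, :]
    xs = cx + rho[:, None] * np.cos(theta)[None, :]
    yi = np.clip(np.rint(ys).astype(int), 0, h - 1)
    xi = np.clip(np.rint(xs).astype(int), 0, w - 1)
    return img[yi, xi]


def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur with edge padding; output has the input shape."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def similarity(candidate, template, sigma=1.0):
    """Similarity in (0, 1] from the Euclidean distance of transformed patches."""
    a = gaussian_blur(log_polar(candidate.astype(float)), sigma)
    b = gaussian_blur(log_polar(template.astype(float)), sigma)
    # Normalise by patch size so the distance is a per-pixel RMS value.
    dist = np.linalg.norm(a - b) / np.sqrt(a.size)
    # Assumed mapping from distance to similarity (not from the thesis).
    return 1.0 / (1.0 + dist)


def correct_scores(scores, candidates, template, top_k=3, weight=0.5):
    """Re-rank the top_k candidates by blending score and template similarity."""
    scores = np.asarray(scores, dtype=float)
    top = np.argsort(scores)[::-1][:top_k]
    corrected = {}
    for i in top:
        sim = similarity(candidates[i], template)
        # Assumed blend: a linear mix of classifier score and similarity.
        corrected[int(i)] = (1.0 - weight) * scores[i] + weight * sim
    best = max(corrected, key=corrected.get)
    return best, corrected
```

Calling `correct_scores(scores, patches, template)` returns the index of the re-ranked best candidate and the corrected scores of the candidates that were examined. Because the log-polar grid has a fixed size, candidate patches of different sizes can be compared with the template directly.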
Keywords/Search Tags: Visual object tracking, Siamese neural networks, Attention mechanism, Semantic-aware, Re-detection, Confidence