| Vehicle re-identification uses hardware and software to locate and track vehicles in a surveillance system, so that the same vehicle can be identified across different camera locations and at different times. Compared with manual inspection, it substantially improves both efficiency and accuracy. The technology has important application value in many fields: it can provide leads on offences such as traffic evasion and license plate forgery or defacement, and it can support decisions in future local traffic planning. The development of vehicle re-identification is therefore significant for daily life, traffic safety and police investigation. Current mainstream vehicle re-identification algorithms have the following shortcomings: a limited receptive field, loss of detailed features due to downsampling, and an inability to distinguish vehicles that are highly similar in appearance, all of which prevent further gains in re-identification accuracy. To address these problems, this paper proposes three vehicle re-identification algorithms based on a CNN and a Vision Transformer (ViT): ResNet-ViT, VGG16-ViT and EfficientNet-ViT. Pre-trained ResNet, VGG16 and EfficientNet-B0 are used as the respective feature extractors, and a batch normalization layer and a global average pooling layer are introduced to accelerate convergence and reduce the risk of overfitting. A simplified ViT model, consisting of a projection layer, a multi-head self-attention layer, a flatten layer and a classification layer, is then designed to capture long-range dependencies. Experiments show that the proposed methods are effective for the vehicle re-identification task, outperform existing mainstream methods in accuracy and robustness, and require fewer parameters and less computation, providing a new idea for future research. To reduce the interference of background noise, this paper proposes a multi-level spatial transformer network, which inserts multiple spatial transformer networks between the convolutional layers to remove background redundancy. To improve the robustness of the algorithm and to better extract fine-grained features, this paper further proposes ESV, a spatial-transformation-based vehicle re-identification algorithm that incorporates the multi-level spatial transformer network into the better-performing EfficientNet-ViT. This allows richer and more complex geometric transformations and extends the representational capacity of the network to a wider range of geometric variations. Finally, experiments are carried out to verify the effectiveness of the algorithm. |
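
The core operation of a spatial transformer can be illustrated with a minimal pure-Python sketch: an affine matrix maps each output pixel back to a source location, and the input is read there by bilinear interpolation. This is not the paper's implementation. The function name `affine_sample`, the normalized [-1, 1] coordinate convention and the zero padding are assumptions made for illustration, and the matrix `theta` is fixed by hand here, whereas in a spatial transformer network a small localization network would predict it for each input.

```python
import math
from typing import List, Sequence

Image = List[List[float]]


def affine_sample(image: Image, theta: Sequence[Sequence[float]],
                  out_h: int, out_w: int) -> Image:
    """Resample `image` through a 2x3 affine matrix `theta`.

    Coordinates are normalized to [-1, 1] on both axes. Each output pixel
    (x_t, y_t) reads the input at (x_s, y_s) = theta @ [x_t, y_t, 1] with
    bilinear interpolation; samples outside the input count as zero.
    """
    in_h, in_w = len(image), len(image[0])

    def pixel(r: int, c: int) -> float:
        # Zero padding for samples that fall outside the input.
        if 0 <= r < in_h and 0 <= c < in_w:
            return image[r][c]
        return 0.0

    def norm(i: int, n: int) -> float:
        # Map index 0..n-1 to a normalized coordinate in -1..1.
        return 0.0 if n == 1 else 2.0 * i / (n - 1) - 1.0

    out: Image = []
    for i in range(out_h):
        y_t = norm(i, out_h)
        row = []
        for j in range(out_w):
            x_t = norm(j, out_w)
            # Affine map from output (target) to input (source) coordinates.
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            # Convert back to fractional pixel indices.
            c = (x_s + 1.0) * (in_w - 1) / 2.0
            r = (y_s + 1.0) * (in_h - 1) / 2.0
            c0, r0 = math.floor(c), math.floor(r)
            dc, dr = c - c0, r - r0
            # Bilinear blend of the four neighbouring input pixels.
            row.append(
                pixel(r0, c0) * (1 - dr) * (1 - dc)
                + pixel(r0, c0 + 1) * (1 - dr) * dc
                + pixel(r0 + 1, c0) * dr * (1 - dc)
                + pixel(r0 + 1, c0 + 1) * dr * dc
            )
        out.append(row)
    return out


# A 5x5 ramp image and a transform that zooms into its centre.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
zoom = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]
crop = affine_sample(image, zoom, 3, 3)  # central 3x3 region of `image`
```

A scale below 1 on the diagonal of `theta` crops toward the centre, which is one way such a module could discard background around a vehicle; stacking several of these at different depths would correspond loosely to the multi-level arrangement described above.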