| As one of the most fundamental and important research areas in computer vision, object detection faces growing demand from real-world applications across many industries. Thanks to advances in deep neural networks and the availability of large-scale public datasets, the performance of object detection models under idealized conditions has improved continuously. In practice, however, the assumption that training data and test data follow the same distribution rarely holds. Domain shifts caused by factors such as lighting, viewing angle, and style often degrade the model's actual detection performance significantly. Object detection based on unsupervised domain adaptation has therefore received widespread attention in recent years. Adapting a detector trained on the source domain to the target domain improves the model's cross-domain detection ability and thus eases the difficulties that object detection encounters in real applications. This thesis focuses on object detection algorithms based on unsupervised domain adaptation. After surveying and summarizing the existing domestic and international solutions in this field, we study in depth the open problems of current algorithms. The main work of this thesis is as follows: (1) To address the negative transfer that arises because existing methods neglect the differences in transferability across feature levels and rely on static alignment, this thesis proposes a domain adaptive object detection algorithm based on multi-level transferability measurement. Uncertainty-based transferability measurement introduces a dynamic mechanism into the domain adaptation process built on adversarial learning. Local-level alignment helps the model attend, from a spatial perspective, to the transferability of the features being aligned, avoiding the forced alignment of poorly transferable features. Global-level alignment helps the model gauge the degree of feature alignment during the alignment process, avoiding over-alignment that crosses classification boundaries. Finally, instance-level alignment helps the model achieve fine-grained intra-class alignment through a category-aware pseudo-label generation mechanism. The model was tested in multiple cross-domain detection scenarios and proved effective. (2) To address the imbalance between discriminability and transferability that arises because existing methods overemphasize transferability, this thesis proposes a regularization term based on feature discriminability enhancement and a model optimization strategy based on Pareto optimization. Spectral analysis based on singular value decomposition is used to infer how singular values and their eigenvectors affect transferability and discriminability. A regularization term based on feature discriminability enhancement then penalizes the neglect of discriminability during alignment, promoting a balance between discriminability and transferability. Next, Pareto optimization theory is introduced, and the MGDA-UB multi-objective optimization algorithm is used to ensure that the model optimizes for both discriminability and transferability. Performance comparison experiments on the improved baseline models verify the generality and effectiveness of this method. (3) To address the problem that current CNN-based domain adaptive object detection methods cannot align well the sequence features extracted by the DETR model, this thesis proposes a new domain adaptive object detection algorithm based on sequence feature alignment. Inspired by the class token in ViT, the thesis introduces an adversarial token to handle context information fusion and the construction of the adversarial training loop, and combines self-attention and cross-attention mechanisms to align sequence features at the global and instance levels. Finally, performance comparisons with different baseline models on multiple transfer tasks demonstrate the effectiveness of the proposed model. Further experiments, including ablation studies and attention visualization analysis, reveal the internal working mechanism of the proposed method. |
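
To make the Pareto optimization step in contribution (2) more concrete, the sketch below shows the closed-form min-norm weighting that MGDA-style methods use when there are exactly two objectives. It is an illustration only, not the thesis's implementation: the function names `min_norm_weight` and `combine` are hypothetical, the two gradients are assumed to be flat lists of floats (for example, the gradients of a discriminability loss and a transferability loss), and in MGDA-UB these gradients would be taken with respect to the shared representation rather than the shared parameters.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def min_norm_weight(g1: Sequence[float], g2: Sequence[float]) -> float:
    """Return gamma in [0, 1] minimizing ||gamma * g1 + (1 - gamma) * g2||^2.

    Hypothetical helper: the two-objective closed form of the min-norm
    problem solved by MGDA-style methods.
    """
    diff = [x - y for x, y in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:
        # Identical gradients: every weighting gives the same direction.
        return 0.5
    # Unconstrained minimizer: gamma = (g2 - g1) . g2 / ||g1 - g2||^2
    gamma = dot([y - x for x, y in zip(g1, g2)], g2) / denom
    # Clip to the simplex so the result is a convex combination.
    return min(1.0, max(0.0, gamma))


def combine(g1: Sequence[float], g2: Sequence[float]) -> List[float]:
    """Common update direction: the min-norm convex combination of g1 and g2."""
    gamma = min_norm_weight(g1, g2)
    return [gamma * x + (1.0 - gamma) * y for x, y in zip(g1, g2)]


if __name__ == "__main__":
    # Two orthogonal gradients of equal norm are weighted equally.
    print(min_norm_weight([1.0, 0.0], [0.0, 1.0]))
    print(combine([1.0, 0.0], [0.0, 1.0]))
```

When the two gradients conflict, the resulting direction has a non-negative inner product with both of them, so a small step along its negative does not increase either loss to first order. This is the sense in which such an update moves toward a Pareto-stationary point instead of favoring one objective.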