In recent years, deep-learning-based pose estimation has achieved outstanding results. Most current work in this field focuses on human poses. Because animal pose estimation has broad applications in wildlife protection, animal behavior analysis, and animal breeding, this thesis focuses on animal pose estimation and applies deep learning and related techniques to problems in this field. The thesis addresses three tasks, whose content and contributions are as follows:

(1) This thesis identifies the limitations of current mainstream animal pose estimation methods built on unsupervised domain adaptation, and proposes an animal pose estimation algorithm based on semi-supervised domain adaptation. The algorithm uses synthetic animal datasets together with some labeled real animal data to perform pose estimation. Compared with methods based on unsupervised domain adaptation, the algorithm incurs some additional labeling cost but greatly improves model accuracy. In the real world, animals of different populations often differ greatly, so relying solely on transfer techniques to carry knowledge over from other domains is not sufficient. Paying a modest labeling cost greatly improves accuracy, which makes algorithms based on semi-supervised domain adaptation closer to real application scenarios.

(2) The first task focuses on how to use synthetic animal data and some labeled real animal data to complete the pose estimation task. Because real animal labels are relatively expensive to acquire, methods based on semi-supervised learning and semi-supervised domain adaptation are more appropriate, but such methods require the samples to be labeled to be specified in advance. At present, most methods choose these samples by random sampling. This thesis observes that the quality of the labeled samples has a great impact on the quality of
the model, and that this effect is more pronounced when the labeling budget is small. To further improve algorithms based on semi-supervised learning or semi-supervised domain adaptation, this thesis proposes a clustering-based algorithm for selecting the samples to label, so as to obtain a better labeling scheme.

(3) Many works have introduced Transformer-based models to computer vision tasks and achieved good results. To combine the advantages of CNN-based and Transformer-based models for animal pose estimation, this thesis explores how to combine the two and proposes Re Swin, a model with an encoder-decoder structure. Re Swin fuses the features of the two architectures and uses a multi-scale structure in the decoder module. Its accuracy is significantly better than that of the two single-architecture models.
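
To make the idea in (2) concrete, the following is a minimal sketch of one possible clustering-based selection of samples to label. It is not the algorithm proposed in this thesis, whose details are not given here. It assumes that each unlabeled image has already been mapped to a feature vector, that plain k-means is the clustering method, and that the sample nearest each cluster centroid is the one sent for labeling. The names `kmeans`, `select_samples_to_label`, and `budget` are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iters=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering method)."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct samples.
    init_idx = rng.choice(len(features), size=k, replace=False)
    centroids = features[init_idx].copy()
    for _ in range(n_iters):
        # Squared distance from every sample to every centroid: shape (n, k).
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Recompute assignments so they match the returned centroids.
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dists.argmin(axis=1)


def select_samples_to_label(features, budget, seed=0):
    """Return indices of the samples to label: one per cluster, the sample
    nearest the cluster centroid (hypothetical selection rule)."""
    features = np.asarray(features, dtype=float)
    centroids, assign = kmeans(features, k=budget, seed=seed)
    selected = []
    for j in range(budget):
        members = np.flatnonzero(assign == j)
        if len(members) == 0:
            continue  # an empty cluster contributes no sample
        d = ((features[members] - centroids[j]) ** 2).sum(axis=1)
        selected.append(int(members[d.argmin()]))
    return sorted(selected)


if __name__ == "__main__":
    # Toy one-dimensional "features": two well-separated groups.
    toy = [[0.0], [1.0], [100.0], [101.0]]
    print(select_samples_to_label(toy, budget=2))
```

Compared with random sampling, a rule of this kind spreads the labeling budget across distinct regions of the feature space, which matters most when the budget is small.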