| Thangka, a unique form of painting in Tibetan culture, covers a wide range of themes, including astronomy, geography, natural science, and the humanities. Thangka spreads Tibetan culture through visual images and tells the story of ethnic development, so using object detection technology to recognize and detect Thangka has high research value. Thangka images have complex color and texture features. Traditional object detection focuses more on extracting the outline of an object, whereas Thangka detection depends more on extracting texture; as a result, the application of convolutional neural networks to Thangka main deity detection is currently limited. In addition, Thangka main deities fall into many categories and no dedicated dataset exists, which increases the difficulty of the task. To achieve higher accuracy in detecting and recognizing images of Thangka main deities, this paper studies deep learning-based methods for the task. The specific research contents are as follows. (1) A Thangka main deity image detection dataset is established. Relevant images are collected with improved code, and 14 categories are ultimately selected. The data are augmented with an improved Grid Mask method, which is compared with traditional and Cutout data augmentation methods and validated on two network models. The results show that Grid Mask augmentation is better suited to the Thangka dataset, with recognition and detection accuracy reaching 96.52%. (2) An object detection model with improved loss functions is constructed for Thangka main deities. Six loss functions are proposed and improved, each of which can be integrated into multiple network models. Thangka images contain relatively small objects, which reduces recognition accuracy, so a new loss function is constructed: by adjusting the weights of objects of different sizes in the loss function, the network focuses on small objects that are difficult to identify, which improves the model's ability to detect small objects. When the loss functions are integrated into the YOLOv4 and YOLOX network models, detection accuracy on the public datasets VOC2007 and COCO improves by 1.4% and 0.4%, and by 0.1% and 1.6%, respectively. Selecting the most suitable loss function model (YOLOX-F) improves detection accuracy on the Thangka dataset by 0.73%, to 97.25%. The experimental results show that the proposed object size loss function effectively improves the detection accuracy of relatively small objects without affecting the detection accuracy of large objects. (3) A parallel attention mechanism (PAM) object detection model for Thangka main deities is designed on the basis of YOLOX-F. To improve the detection accuracy of Thangka main deity images, make the detection network attend more closely to the objects in the images, and prevent the spatial attention mechanism and the channel attention mechanism from interfering with each other, this paper proposes the PAM attention mechanism. In PAM, the feature maps produced by the convolutional network pass through the channel attention mechanism and the spatial attention mechanism in parallel, and the two results are then fused to produce the output. PAM is embedded into the YOLOX-F model to strengthen the model's attention to objects in Thangka images, thereby improving detection accuracy. The experimental results demonstrate that the parallel attention mechanism based on YOLOX-F achieves better results: in the Thangka image recognition task, the detection accuracy of the YOLOX-PAM+F model improves by 0.79%. |
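
The size-dependent weighting described in item (2) can be illustrated with a minimal sketch. The abstract does not give the formula, so the weight used here (2 minus the normalized box area) is an assumption chosen only because it gives small objects a larger weight; the names `Box`, `size_weight`, and `weighted_loss` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Box:
    """Ground-truth box; width and height are normalized to [0, 1]."""
    w: float
    h: float


def size_weight(box: Box) -> float:
    """Assumed weighting rule: 2 - normalized area.

    A box covering the whole image gets weight 1, and a vanishingly
    small box gets a weight close to 2.
    """
    area = box.w * box.h
    if not 0.0 <= area <= 1.0:
        raise ValueError("box width and height must be normalized to [0, 1]")
    return 2.0 - area


def weighted_loss(per_box_losses: Sequence[float], boxes: Sequence[Box]) -> float:
    """Mean of the per-box losses, each scaled by its size weight."""
    if len(per_box_losses) != len(boxes):
        raise ValueError("need exactly one loss value per box")
    if not boxes:
        return 0.0
    total = sum(size_weight(b) * loss for loss, b in zip(per_box_losses, boxes))
    return total / len(boxes)
```

With this rule, two boxes that incur the same raw loss contribute differently to the total: the smaller box contributes more, so gradient updates are pushed toward the objects that are harder to detect.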
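
The parallel arrangement described in item (3) can be sketched in plain Python on a small feature map stored as nested lists with shape channels × height × width. This is a simplified illustration, not the paper's PAM module: the learned layers that a real attention branch would contain are omitted, and fusing the two branches by averaging is an assumption, since the abstract only states that the outputs are fused. All function names are hypothetical.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap: FeatureMap) -> List[float]:
    """One weight per channel: sigmoid of that channel's global average."""
    weights = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values)))
    return weights


def spatial_attention(fmap: FeatureMap) -> List[List[float]]:
    """One weight per location: sigmoid of the mean across channels."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def parallel_attention(fmap: FeatureMap) -> FeatureMap:
    """Feed the same input to both branches, then fuse by averaging.

    Neither branch sees the other's output, which is the point of the
    parallel layout as opposed to a sequential one.
    """
    ca = channel_attention(fmap)
    sa = spatial_attention(fmap)
    return [
        [
            [0.5 * (ca[k] * v + sa[i][j] * v) for j, v in enumerate(row)]
            for i, row in enumerate(channel)
        ]
        for k, channel in enumerate(fmap)
    ]
```

In a sequential design the second branch would operate on features already rescaled by the first; here both weights are computed from the unmodified input, so the output has the same shape as the input and each branch's contribution can be inspected separately.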