| The number of cameras in today's cities has increased significantly, which greatly facilitates on-site monitoring and accident tracking. However, useful information is sparse in surveillance video, so searching for or classifying pedestrians by their attributes requires a huge amount of manual sorting. To reduce labor costs, it is necessary to study an end-to-end model that recognizes pedestrian attributes automatically. Although many pedestrian attribute recognition methods have been proposed, several difficulties remain. Three typical problems are: (1) the spatial dependence of attributes and the semantic relationships between pedestrian attributes are ignored; (2) different attributes require features of different granularity; (3) pedestrian attribute datasets show a relatively obvious sample imbalance. In response to these three problems, this thesis proposes a pedestrian attribute recognition model based on a dual attention mechanism. The work consists of three parts. (1) A spatial self-attention mechanism extracts the long-range dependencies among the positional features of pedestrian attributes, and a channel self-attention mechanism extracts the semantic relationship features between attributes. The spatial and channel self-attention features are combined so that they complement each other. This thesis also optimizes the self-attention mechanism to reduce the computational complexity of the model, making it more suitable for practical application. (2) Self-attention features are fused with global features to enrich feature diversity and meet the feature needs of different attribute classifications. Attention features favor the recognition of local attributes, while global features favor the recognition of abstract, coarse-grained attributes and also add global information to local details. (3) Using the distribution of each attribute in the dataset as prior knowledge, the classic binary cross-entropy loss function is weighted to reduce the negative impact of sample imbalance on recognition performance. In addition, this thesis improves the classic mean pooling method by introducing a generalized average pooling method; by adding very few parameters, the model can choose the pooling behavior best suited to the task. Experimental results on two pedestrian attribute datasets, PETA and RAP, show that the proposed model based on the dual-domain self-attention mechanism effectively addresses the multi-granularity of pedestrian attribute features, improves the ability to express relationships between attributes, and reduces the negative impact of sample imbalance. The average accuracy index of the original model is 4.63 and 4.81 percentage points higher than that of the reference model on the two datasets, respectively. After the computational complexity of the original model is optimized, the same index is still 3.93 and 4.53 percentage points higher than that of the reference model. The model is strongly competitive among proposed pedestrian attribute recognition models, and the proposed method has value for both theoretical research and practical application. |
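
The weighted loss and the generalized pooling described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the weighting scheme `exp(1 - r)` / `exp(r)` (where `r` is an attribute's positive-sample ratio) is one common choice for imbalanced attribute data, and `gem_pool` uses the usual generalized-mean form with a single exponent `p`. The function names are hypothetical and are not taken from the thesis.

```python
import math


def attribute_weights(pos_ratio):
    """Assumed weighting: rare positives get a larger weight than common ones."""
    return math.exp(1.0 - pos_ratio), math.exp(pos_ratio)


def weighted_bce(probs, labels, pos_ratios, eps=1e-7):
    """Binary cross-entropy averaged over attributes, weighted per attribute.

    probs      -- predicted probabilities, one per attribute
    labels     -- ground-truth labels (0 or 1), one per attribute
    pos_ratios -- fraction of positive samples per attribute in the training set
    """
    total = 0.0
    for p, y, r in zip(probs, labels, pos_ratios):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        w_pos, w_neg = attribute_weights(r)
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized-mean pooling over one channel's activations.

    p = 1 reduces to average pooling; larger p moves toward max pooling.
    In a trained model p would be a learnable parameter; here it is fixed.
    """
    clamped = [max(v, eps) for v in values]
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)


# Missing a rare positive attribute costs more than missing a common one.
rare = weighted_bce([0.1], [1], [0.1])
common = weighted_bce([0.1], [1], [0.9])
print(rare > common)

# The exponent p interpolates between mean pooling and max pooling.
print(gem_pool([1.0, 2.0, 3.0], p=1.0))
print(gem_pool([1.0, 2.0, 3.0], p=50.0))
```

Under these assumptions, the single extra exponent `p` is what lets the pooling adapt to the task, in line with the abstract's remark that very few parameters are added.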