Human-centered image analysis has important research significance and a wide range of applications, such as intelligent security, virtual fitting, and pedestrian recognition. This paper focuses on the accurate analysis of the human body, covering both human parsing and human pose estimation. The main research work and contributions are as follows.

Human parsing aims to segment a human image into multiple parts with fine-grained semantics, providing a more detailed understanding of image content. When the body pose is complex, existing human parsing methods tend to misclassify limb parts, and their segmentation of small targets is not accurate enough. To address these problems, a dual-branch network incorporating a pose prior is proposed for accurate human parsing. The model first uses a backbone network to extract features from the human image, and then uses the pose prior predicted by a human pose estimation model as attention information to form a multi-scale feature representation driven by the human body structure prior. The multi-scale features are fed separately into a fully convolutional network parsing branch and a detection parsing branch. The fully convolutional branch produces a global segmentation result, while the detection parsing branch focuses on detecting and segmenting small-scale targets. The segmentation results of the two branches are fused to obtain a more accurate final parsing result. Experimental results on the LIP and ATR datasets verify the effectiveness of the proposed algorithm.

As noted above, human pose estimation is closely related to human parsing. To improve accuracy, most existing work designs complex deep network models and addresses only pose estimation or only human parsing. To better suit practical applications, taking both accuracy and running speed into account, this paper proposes a lightweight model that combines human parsing and pose estimation. The model simultaneously segments human body parts and locates skeletal keypoints. First, the lightweight pose estimation model and the lightweight human parsing model proposed in this paper extract features suited to their respective tasks. Second, the features extracted by the two models are combined, and an optimized network is used to obtain accurate segmentation and pose results, respectively. In designing the lightweight human pose estimation model, a lightweight bottleneck structure is proposed, which greatly reduces the number of parameters. Finally, based on the proposed lightweight model, a prototype system for human parsing and pose estimation is designed.
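
The data flow of the pose-prior parsing network described above can be illustrated with a small sketch. It is not the implementation from this work: the abstract does not state how the pose prior is turned into attention or how the two branches are fused, so the per-pixel maximum over keypoint heatmaps, the (1 + attention) reweighting, the averaging fusion, and all function names (`pose_attention`, `apply_attention`, `fuse_branches`) are assumptions chosen only to make the idea concrete. Plain Python lists stand in for tensors.

```python
# Illustrative sketch only. The attention rule, the fusion rule, and every
# name below are assumptions; the abstract does not specify these details.

def pose_attention(heatmaps):
    """Collapse K keypoint heatmaps (K x H x W) into one H x W attention map
    by taking the per-pixel maximum over keypoints (assumed rule)."""
    height, width = len(heatmaps[0]), len(heatmaps[0][0])
    return [[max(h[y][x] for h in heatmaps) for x in range(width)]
            for y in range(height)]


def apply_attention(features, attention):
    """Reweight each channel of a C x H x W feature map by (1 + attention),
    emphasizing pixels near predicted keypoints without zeroing the rest."""
    return [[[v * (1.0 + attention[y][x]) for x, v in enumerate(row)]
             for y, row in enumerate(channel)]
            for channel in features]


def fuse_branches(global_scores, detail_scores):
    """Average two per-class score maps (num_classes x H x W) and return the
    H x W label map given by the per-pixel argmax (assumed fusion rule)."""
    num_classes = len(global_scores)
    height, width = len(global_scores[0]), len(global_scores[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            fused = [(global_scores[c][y][x] + detail_scores[c][y][x]) / 2.0
                     for c in range(num_classes)]
            row.append(fused.index(max(fused)))
        labels.append(row)
    return labels


# Toy input: 2 keypoints, 1 feature channel, 2 classes, on a 2 x 2 image.
heatmaps = [[[0.0, 1.0], [0.0, 0.0]],
            [[0.0, 0.0], [0.5, 0.0]]]
features = [[[2.0, 2.0], [2.0, 2.0]]]

attention = pose_attention(heatmaps)
attended = apply_attention(features, attention)

# Hypothetical score maps from the two parsing branches.
global_scores = [[[0.9, 0.6], [0.4, 0.8]],   # class 0
                 [[0.1, 0.4], [0.6, 0.2]]]   # class 1
detail_scores = [[[0.8, 0.1], [0.5, 0.9]],   # class 0
                 [[0.2, 0.9], [0.5, 0.1]]]   # class 1

labels = fuse_branches(global_scores, detail_scores)
```

In this toy input the second branch is confident about class 1 at the top-right pixel while the first branch is not, and the fused label map follows the second branch there. This mirrors the stated motivation that the detection parsing branch contributes on regions the global branch handles poorly.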