Research On Human Parsing Algorithm Based On Deep Learning

Posted on: 2023-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: X K Zhang
GTID: 2558306845999059
Subject: Signal and Information Processing
Abstract/Summary:
Human parsing is a fundamental computer vision task with a wide range of applications, such as video surveillance, clothing recognition, and human-computer interaction, and it is receiving growing attention. Advances in deep learning have brought major breakthroughs in human parsing research. In real application scenes, however, the variety of clothes, postures, and backgrounds poses great challenges: parsing results show blurred boundaries between parsing regions and semantic inconsistency, both of which seriously reduce parsing accuracy. This thesis studies deep-learning-based human parsing algorithms to address these problems. The main research work is as follows.

First, a human parsing algorithm based on category context and edge detection is proposed. The network is divided into a parsing branch and an edge detection branch. A class context module is introduced in the parsing branch because existing methods use a single spatial strategy to extract context, without distinguishing how pixels of different classes contribute to that context. The class context is built on the global context feature and responds differently to context depending on the class of each pixel. The edge detection branch performs coarse edge prediction followed by edge refinement to obtain edge features with strong representational capability. These are combined with the parsing features from the parsing branch to provide edge constraints and to alleviate boundary confusion among human body parts. Experimental results show that the method achieves higher evaluation metrics and better visualization results.

Second, a human parsing algorithm based on structural information is proposed. A graph semantic aggregation module is constructed in the network to explicitly model the internal topology of the human body and to extract the features of each node at each level. To learn the specific associations that exist between human body parts, a relation inference module that takes the node features as input is used. Together, the graph semantic aggregation module and the relation inference module extract the intrinsic structural information of the human body. In addition, a human pose estimation branch is built from simple and efficient deconvolution layers to learn the external structural information provided by human pose. The network tightly combines the two kinds of structural information, which helps ensure correct associations between parts and keeps the parsing results reasonable under complex and variable human postures.

Finally, a human parsing network based on non-local feature fusion and graph convolution correction is proposed. Using auxiliary tasks to assist human parsing is an effective way to improve parsing accuracy, but such approaches pay little attention to fusing the features of the different tasks effectively. A non-local feature fusion module, inspired by the attention mechanism, is therefore designed to keep edge features and parsing features consistent. Further, the pixels in the image are regarded as nodes of an undirected graph, and a graph convolution correction module is designed to learn the relationships between nodes and to refine the results of the previous step. The final experimental results show that the method performs well both quantitatively and qualitatively.
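The idea of treating pixels as nodes of an undirected graph and refining coarse predictions by graph convolution can be illustrated with a minimal sketch. The abstract does not specify the thesis's actual module, so the code below substitutes the widely used symmetric-normalized propagation rule, ReLU(D^{-1/2}(A + I)D^{-1/2} H W), as a stand-in. The 4-connected pixel grid, the identity weight matrix, and all function names (`grid_adjacency`, `normalized_adjacency`, `graph_conv`) are assumptions made for illustration only.

```python
import math

def grid_adjacency(h, w):
    """Assumed graph: 4-connected undirected graph over an h x w pixel grid."""
    n = h * w
    adj = [[0.0] * n for _ in range(n)]
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:                      # right neighbour
                adj[i][i + 1] = adj[i + 1][i] = 1.0
            if r + 1 < h:                      # bottom neighbour
                adj[i][i + w] = adj[i + w][i] = 1.0
    return adj

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def graph_conv(adj, feats, weight):
    """One propagation step: ReLU(A_norm @ feats @ weight)."""
    out = matmul(matmul(normalized_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Coarse two-class scores on a 2 x 2 image; pixel 3 disagrees with the rest.
scores = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]            # assumed (untrained) weights
refined = graph_conv(grid_adjacency(2, 2), scores, identity)
```

In this toy case, the scores of the outlier pixel 3 move from `[0, 1]` to roughly `[0.67, 0.33]` after one step, because each node averages over itself and its two neighbours. In a trained network the weight matrix would be learned, and the graph would more likely be built from feature similarity than from a fixed grid.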
Keywords: Deep Learning, Human Parsing, Edge Detection, Pose Estimation, Attention Mechanism, Graph Convolution