Font Size: a A A

Research On Human Semantic Segmentation Based On Deep Learning

Posted on:2022-04-07Degree:MasterType:Thesis
Country:ChinaCandidate:T J XiangFull Text:PDF
GTID:2518306524492634Subject:Master of Engineering
Abstract/Summary:PDF Full Text Request
Human semantic segmentation is a fine-grained semantic segmentation task that aims to identify the components of human images(e.g.,body parts and clothes)at the pixel-level scale.Understanding the content of human images is useful for a number of potential applications such as e-commerce,human-computer interaction,image editing,and virtual reality.Currently,significant progress has been made in human semantic segmentation with the development of full convolutional neural networks based on semantic segmentation.Compared with general image segmentation,the difficulties of human semantic segmentation are mainly in the following aspects: firstly,human semantic segmentation is more complex in the instance scene data,involving a variety of scenarios,such as multiple people or people making some complex actions,which can affect the training accuracy;secondly,human semantic segmentation is a very fine segmentation task,which leads to its semantic boundaries during segmentation This also increases the difficulty of human semantic segmentation.In order to accommodate the growing number of engineering applications of human semantic segmentation,the research on semantic segmentation for analyzing full convolutional networks is becoming more and more active.In this thesis,we take human semantic segmentation in real scenarios as the research topic and deep learning as the method,and the main research is divided into the following parts.Human semantic segmentation based on full convolutional network with encoder-decoder structure.As the most important network structure in the field of semantic segmentation,the full convolutional network solves the shortcomings of the traditional convolutional neural network in the segmentation method.In this thesis,based on the advantages of transpose convolution and hop-level structure of the full convolutional network itself,an encoder-decoder network is designed to optimize the full convolutional network by referring to SegNet,and the translation invariance is maintained by maximum pooling to achieve better robustness.Through experiments,this thesis demonstrates that the full convolutional network with encoder decoder structure has huge advantages over the general convolutional neural network-based image segmentation methods.Research on human semantic segmentation based on contextual knowledge integration.To address the two drawbacks of full convolutional networks,namely insufficient refinement of results and lack of spatial consistency,this thesis uses a method based on contextual knowledge integration for optimization.In this thesis,two methods,conditional random field and dilation convolution,are used for multi-scale optimization of the network.The traditional semantic segmentation model is improved using null space convolutional pooling,which can capture multi-scale information more effectively.The experimental results in this thesis show that the proposed model has significantly improved over other versions of the training network in benchmark tests of human semantic image segmentation on different datasets.
Keywords/Search Tags:semantic segmentation, convolutional neural network, human semantic segmentation, conditional random field, Atrous convolution
PDF Full Text Request
Related items