Clothing parsing has received increasing attention due to its wide range of applications in fashion synthesis, pose estimation, and related areas. With the rapid development of convolutional neural networks, existing clothing parsing methods have achieved strong performance, which has greatly advanced research on and application of clothing parsing. However, because human body shapes vary widely, clothing categories are diverse, target objects differ greatly in size, and object edges are often blurred, existing clothing image parsing methods are prone to erroneous results, especially for similar and small clothing categories. In addition, convolutional neural networks themselves suffer from repeated downsampling operations, which significantly reduce image resolution and lose a large amount of spatial information. To improve the accuracy of clothing image parsing, the main work of this paper is as follows:

(1) We propose a clothing image parsing algorithm based on multi-scale fusion enhancement. To recover the spatial information lost during downsampling, we design a fusion enhancement module that fuses features from different levels during decoding and uses different receptive fields to capture multi-scale information from the fused features. We also apply a channel attention mechanism to obtain global context information and use it as weights to select the most important multi-scale features. Finally, we further enhance the representational ability of the features by concatenating the outputs of multiple fusion enhancement modules and upsampling the concatenated result to obtain the final parsing result.

(2) We propose a dual-branch network composed of a multi-level feature fusion branch and an edge detection branch. First, the multi-level feature fusion branch uses a lightweight feature fusion module to fuse feature maps of different scales, enriching the semantic information of the features while keeping the number of parameters unchanged. Second, we design an edge detection branch that uses feature maps of different scales to extract fine edge information, compensating for the detail lost during downsampling. Finally, we upsample and fuse the outputs of the two branches to obtain the final parsing result.

(3) We propose a multi-scale attention-guided clothing image parsing algorithm. Based on the attention mechanism, we design an attention guidance module that captures global context information and then guides the features to learn discriminative representations. We also add an auxiliary loss function in the attention guidance module to supervise the feature learning process, so that similar categories and small, easily overlooked target objects can be identified accurately. We then design a multi-scale fusion strategy to gradually fuse the attention features output by multiple attention guidance modules. Finally, we linearly interpolate the fused features to the original image size as the final prediction.

We conduct experiments on the Fashion Clothing and LIP datasets. Extensive comparative results show that our algorithms not only effectively improve clothing image parsing, but also achieve excellent results compared with other state-of-the-art parsing algorithms. The proposed models can be applied in practical scenarios such as garment recognition, intelligent wardrobes, and pose estimation.
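The following PyTorch sketch illustrates one plausible form of the fusion enhancement module described in (1): two feature levels are fused, parallel dilated convolutions provide different receptive fields, and channel attention derived from global context reweights the multi-scale features. The channel counts, dilation rates, and specific layer choices are illustrative assumptions, not the exact configuration used in this work.

```python
# Minimal sketch, assuming a PyTorch implementation; all hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FusionEnhancementModule(nn.Module):
    """Fuses a low-level (high-resolution) and a high-level (low-resolution) feature map,
    captures multi-scale context with dilated convolutions, and reweights the result
    with channel attention computed from global context."""

    def __init__(self, low_ch, high_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        # Project both inputs to a common channel dimension before fusion.
        self.low_proj = nn.Conv2d(low_ch, out_ch, kernel_size=1)
        self.high_proj = nn.Conv2d(high_ch, out_ch, kernel_size=1)
        # Parallel branches with different receptive fields (dilated 3x3 convolutions).
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=d, dilation=d),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])
        # Channel attention: global average pooling -> per-channel importance weights.
        fused_ch = out_ch * len(dilations)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(fused_ch, fused_ch // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(fused_ch // 4, fused_ch, kernel_size=1),
            nn.Sigmoid(),
        )
        self.out_conv = nn.Conv2d(fused_ch, out_ch, kernel_size=1)

    def forward(self, low_feat, high_feat):
        # Upsample the high-level feature to the low-level spatial size and fuse.
        high_feat = F.interpolate(self.high_proj(high_feat), size=low_feat.shape[2:],
                                  mode="bilinear", align_corners=False)
        fused = self.low_proj(low_feat) + high_feat
        # Capture multi-scale information and concatenate the branch outputs.
        multi_scale = torch.cat([b(fused) for b in self.branches], dim=1)
        # Use global context as channel-wise weights to select important multi-scale features.
        weighted = multi_scale * self.attn(multi_scale)
        return self.out_conv(weighted)
```

In the full decoder, several such modules would be applied at successive levels and their outputs concatenated and upsampled to produce the final parsing result, as described above.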
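For the dual-branch design in (2), the sketch below shows one possible head that combines a lightweight multi-level feature fusion branch with an edge detection branch and fuses their outputs. The backbone channel sizes, the use of 1x1 projections, and the edge supervision are assumptions made for illustration.

```python
# Minimal sketch, assuming ResNet-style backbone features ordered from fine to coarse.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualBranchHead(nn.Module):
    def __init__(self, feat_chs=(256, 512, 1024, 2048), mid_ch=128, num_classes=20):
        super().__init__()
        # Multi-level feature fusion branch: 1x1 projections keep the parameter count low.
        self.fuse_projs = nn.ModuleList(nn.Conv2d(c, mid_ch, kernel_size=1) for c in feat_chs)
        self.fuse_out = nn.Conv2d(mid_ch, num_classes, kernel_size=1)
        # Edge detection branch: predict an edge response from each scale to recover
        # detail lost during downsampling.
        self.edge_projs = nn.ModuleList(nn.Conv2d(c, 1, kernel_size=1) for c in feat_chs)
        self.edge_out = nn.Conv2d(len(feat_chs), 1, kernel_size=1)
        # Final fusion of the parsing and edge branches.
        self.final = nn.Conv2d(num_classes + 1, num_classes, kernel_size=3, padding=1)

    def forward(self, feats):
        size = feats[0].shape[2:]  # spatial size of the highest-resolution stage
        # Fusion branch: upsample each projected stage and sum.
        fused = sum(F.interpolate(p(f), size=size, mode="bilinear", align_corners=False)
                    for p, f in zip(self.fuse_projs, feats))
        seg = self.fuse_out(fused)
        # Edge branch: concatenate per-scale edge responses, then combine.
        edges = torch.cat([F.interpolate(p(f), size=size, mode="bilinear", align_corners=False)
                           for p, f in zip(self.edge_projs, feats)], dim=1)
        edge = self.edge_out(edges)
        # Upsample-style fusion of the two branches for the final prediction.
        out = self.final(torch.cat([seg, edge], dim=1))
        return out, edge  # the edge map can be supervised with an auxiliary edge loss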
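Finally, the sketch below illustrates the attention guidance idea in (3): each module captures global context, uses it to guide the input features toward discriminative representations, and emits an auxiliary prediction that an extra loss can supervise; a simple coarse-to-fine fusion then combines the attention features before interpolation to the image size. The module layout and the progressive fusion scheme are plausible assumptions rather than the exact design.

```python
# Minimal sketch, assuming channel-attention-style guidance and auxiliary supervision.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionGuidanceModule(nn.Module):
    def __init__(self, in_ch, num_classes):
        super().__init__()
        # Global context -> channel attention weights that guide the features.
        self.context = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, in_ch, kernel_size=1),
            nn.Sigmoid(),
        )
        self.refine = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
        )
        # Auxiliary classifier so an additional loss can supervise this stage.
        self.aux_head = nn.Conv2d(in_ch, num_classes, kernel_size=1)

    def forward(self, x):
        guided = self.refine(x * self.context(x))
        return guided, self.aux_head(guided)


def progressive_fuse(features, target_size):
    """Gradually fuse attention features from coarse to fine scales.

    `features` are ordered fine -> coarse and are assumed to share the same
    channel count (e.g., already projected to the class dimension)."""
    fused = features[-1]  # start from the coarsest attention feature
    for feat in reversed(features[:-1]):
        fused = feat + F.interpolate(fused, size=feat.shape[2:],
                                     mode="bilinear", align_corners=False)
    # Linearly interpolate the fused features to the original image size.
    return F.interpolate(fused, size=target_size, mode="bilinear", align_corners=False)
```

During training, each auxiliary prediction would be supervised with an extra cross-entropy term added to the main parsing loss; the relative weighting of these terms is a design choice not specified in the abstract.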