With the rapid development of the Internet and clothing e-commerce, the number of online clothing images has grown dramatically. Automated, accurate retrieval over massive collections of clothing images is therefore valuable from both commercial and academic perspectives. Current clothing image retrieval techniques fall mainly into text-based and content-based methods. Text-based methods depend heavily on textual annotations of images, and differences in language and expression habits greatly limit their retrieval effectiveness. Many content-based clothing retrieval methods have achieved high accuracy on same-domain datasets but perform considerably worse on cross-domain datasets. In practice, however, the photographs taken by consumers differ substantially from the standard images taken by stores, so the two can be regarded as coming from heterogeneous domains. To address the insufficient accuracy of feature extraction in the current cross-domain clothing retrieval task, this paper proposes a cross-domain clothing retrieval method that combines mixed attention and feature fusion, built on deep learning. The main contributions are as follows:

(1) A cross-domain clothing retrieval method, RST-EAM, that incorporates mixed attention. Local detail features such as patterns and textures in clothing images are very important, while irrelevant backgrounds and lighting interfere with feature extraction. To address this, starting from the strong residual-network baseline RST, a mixed attention module is added to each residual block of the backbone feature extraction network. This lightweight module establishes the importance of features along both the channel axis and the spatial axis, redistributes feature importance accordingly, suppresses the expression of background noise, and strengthens attention to local detail features, yielding features better suited to cross-domain clothing retrieval. The method is then compared with other attention methods and other cross-domain clothing retrieval methods on the Consumer-to-Shop Clothes Retrieval subset of the DeepFashion dataset to verify its superiority.

(2) A cross-domain clothing retrieval method, EAN-DOLG, based on orthogonal feature fusion. Deep networks extract representative feature descriptions but weaken low-level detail information and mid-level style information. To address this, a feature fusion module is introduced on top of the RST-EAM feature extraction network: multi-scale atrous convolution and a simple self-attention mechanism extract representative local information, generalized mean (GeM) pooling extracts global information, and the two are aggregated into feature representations better suited to cross-domain image retrieval. Training is constrained by a combined loss function of triplet loss, center loss, classification loss, and centroid loss, and the centroid representation is also used in the retrieval stage to shorten retrieval time. The method achieves good retrieval performance on the DeepFashion dataset, and the experimental results show that orthogonal feature fusion improves the accuracy of clothing retrieval.
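To make the mixed attention idea in (1) concrete, the following is a minimal PyTorch sketch of a CBAM-style block that scores features along the channel axis and then the spatial axis. The reduction ratio, kernel size, and all module names are illustrative assumptions; the abstract does not specify the exact design of the module used in RST-EAM.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Scores each channel's importance from pooled global descriptors."""
    def __init__(self, channels: int, reduction: int = 16):  # reduction ratio is an assumption
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # average-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))    # max-pooled descriptor
        weight = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * weight

class SpatialAttention(nn.Module):
    """Scores each spatial location, suppressing background regions."""
    def __init__(self, kernel_size: int = 7):  # kernel size is an assumption
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # channel-wise average map
        mx = x.amax(dim=1, keepdim=True)     # channel-wise max map
        weight = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * weight

class MixedAttention(nn.Module):
    """Channel attention followed by spatial attention, one per residual block."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```

Consistent with the abstract's description, such a module would presumably be inserted once per residual block of the backbone, after the block's convolutions and before the skip connection is added.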
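The sketch below illustrates the fusion components named in (2) under stated assumptions: a multi-scale atrous-convolution local branch, GeM pooling for the global descriptor, and a DOLG-style orthogonal fusion that keeps only the component of the local features orthogonal to the global vector, so the two parts carry complementary information. Dilation rates, dimensions, and names are hypothetical, and the "simple self-attention" in the local branch is omitted for brevity.

```python
import torch
import torch.nn as nn

class MultiAtrous(nn.Module):
    """Parallel dilated 3x3 convolutions capture local patterns at several scales."""
    def __init__(self, channels: int, rates=(3, 6, 9)):  # dilation rates are assumptions
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r) for r in rates
        )
        self.project = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

class GeM(nn.Module):
    """Generalized mean pooling with a learnable exponent p."""
    def __init__(self, p: float = 3.0, eps: float = 1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.tensor(p))
        self.eps = eps

    def forward(self, x):
        x = x.clamp(min=self.eps).pow(self.p)
        return x.mean(dim=(2, 3)).pow(1.0 / self.p)      # (B, C) global descriptor

class OrthogonalFusion(nn.Module):
    """Subtracts from the local map its projection onto the global vector,
    then concatenates the pooled orthogonal part with the global descriptor."""
    def forward(self, f_local, f_global):
        g = f_global / (f_global.norm(dim=1, keepdim=True) + 1e-6)  # unit global vector
        # projection coefficient of each local position onto g: (B, H, W)
        coeff = torch.einsum('bchw,bc->bhw', f_local, g)
        proj = torch.einsum('bhw,bc->bchw', coeff, g)               # projected component
        orth = (f_local - proj).mean(dim=(2, 3))                    # pooled orthogonal part
        return torch.cat([orth, f_global], dim=1)                   # (B, 2C) fused feature
```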
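Finally, a minimal sketch of the combined training objective from (2) is given below. The loss weights, the simplified batch-mean centroid, and all names are assumptions; the paper's exact formulation (for example, a full centroid triplet loss) may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombinedLoss(nn.Module):
    """Triplet + classification + center + centroid losses, with assumed weights."""
    def __init__(self, num_classes: int, feat_dim: int, margin: float = 0.3):
        super().__init__()
        self.triplet = nn.TripletMarginLoss(margin=margin)
        self.ce = nn.CrossEntropyLoss()
        # one learnable center per clothing identity, used by the center loss
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats, logits, labels, anchor, positive, negative):
        l_tri = self.triplet(anchor, positive, negative)
        l_cls = self.ce(logits, labels)
        l_ctr = ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()  # center loss
        # centroid loss, simplified: pull each embedding toward its class's
        # batch-mean embedding (the full method may compute centroids differently)
        centroids = torch.stack([feats[labels == y].mean(dim=0) for y in labels])
        l_cen = (1 - F.cosine_similarity(feats, centroids.detach())).mean()
        return l_tri + l_cls + 5e-4 * l_ctr + 0.5 * l_cen  # weights are assumptions
```

As the abstract notes, using centroids again at retrieval time lets a query be compared against one aggregated representation per item rather than every gallery image, which is what shortens the retrieval stage.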