
Research On Commodity Image Retrieval Method Based On Cross-modal Technology

Posted on: 2022-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: B X Qiao
Full Text: PDF
GTID: 2518306341471434
Subject: Signal and Information Processing
Abstract/Summary:
With the continuous growth of multimedia data, the complexity of information description is also increasing. Information with the same semantics is often described from multiple perspectives and in multiple data types, so how to extract features from different modalities and model the associations between them has become a focus of research. At the same time, in the field of e-commerce, users' requirements for commodity retrieval keep rising: a user's description and understanding of the same product image carries distinct personal characteristics, and different users understand the same semantic description differently. Providing cross-modal commodity image retrieval services can therefore better meet users' needs.

Cross-modal retrieval first requires extracting features from each modality. For the two modalities considered in this study, image and text, common feature extraction methods are used to extract image and text features and to build an image-text feature space.

When constructing the objective function, cross-modal retrieval methods based on subspace learning usually focus only on making the subspace projection distance between different-modality data with the same semantics as small as possible. During training, however, they often ignore pairs of data that differ in both semantics and modality, so retrieval accuracy suffers. To address this problem, a cross-modal retrieval method based on improved subspace learning is proposed. First, a new objective function is constructed by adding a term that trains on the subspace projection distance of data with different semantics and different modalities. The objective function is then optimized on the principle of reducing the intra-class distance of different-modality data in the common subspace while increasing the inter-class distance. Finally, the performance of different text and image features is evaluated through comparative experiments, and the effectiveness of the improved subspace learning method for cross-modal retrieval is verified.

In the experiments, different image and text features are compared within the proposed method, and the method itself is compared with the cross-modal correlation propagation method, the nearest-neighbor-based heterogeneous similarity measurement method, and the joint feature selection subspace learning method on the public Wikipedia dataset and a self-built product dataset. The results show that the proposed method performs best when the histogram of oriented gradients (HOG) feature is used for images and the term frequency-inverse document frequency (TF-IDF) feature is used for text. Under the same experimental environment, the average accuracy of image-to-text retrieval is 8.65%, 9.88%, and 0.62% higher than that of the three baseline methods, respectively, and the average accuracy of text-to-image retrieval is 9.33%, 9.81%, and 0.68% higher. These results verify the effectiveness of the proposed method.
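As a concrete illustration of the feature extraction step described above, the following is a minimal sketch using scikit-image's HOG implementation and scikit-learn's TfidfVectorizer. The library choices, helper names, and all parameter values (image size, cell sizes, vocabulary size) are illustrative assumptions, not the configuration used in the thesis.

```python
# Minimal sketch of the two feature extractors named in the abstract:
# HOG for images, TF-IDF for text. Library choices and parameter values
# are illustrative assumptions, not the thesis's actual configuration.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.feature_extraction.text import TfidfVectorizer

def image_feature(img: np.ndarray) -> np.ndarray:
    """Resize a grayscale image and return its HOG descriptor."""
    img = resize(img, (128, 128))
    return hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

def text_features(docs: list[str]) -> np.ndarray:
    """Return TF-IDF vectors for a corpus of product descriptions."""
    vectorizer = TfidfVectorizer(max_features=1000)
    return vectorizer.fit_transform(docs).toarray()
```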
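The improved objective can be sketched as follows: learned projections map image and text features into a common subspace, paired same-semantic samples are pulled together, and cross-modal pairs with different semantics are pushed apart beyond a margin. This NumPy sketch is an assumed reading of the abstract; the actual terms, weights, and optimization procedure in the thesis may differ.

```python
# Sketch of an improved subspace-learning objective: U and V project
# image features X and text features Y into a common subspace. Paired
# same-semantic samples are pulled together; cross-modal pairs with
# different semantic labels are pushed beyond a margin (the term the
# abstract adds). Terms, weights, and margin are assumptions.
import numpy as np

def objective(U, V, X, Y, labels, alpha=1.0, margin=1.0):
    """X: (n, dx) image features; Y: (n, dy) paired text features;
    labels: (n,) semantic class of each pair."""
    PX, PY = X @ U, Y @ V               # project into the common subspace
    # (1) same-semantic cross-modal pairs: minimize projection distance
    pull = np.sum((PX - PY) ** 2)
    # (2) different-semantic cross-modal pairs: penalize if closer
    # than the margin, which increases the inter-class distance
    push = 0.0
    n = len(labels)
    for i in range(n):
        for j in range(n):
            if labels[i] != labels[j]:
                d = np.sum((PX[i] - PY[j]) ** 2)
                push += max(0.0, margin - d)
    return pull + alpha * push
```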
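The reported average accuracies are consistent with a mean-average-precision (mAP) style evaluation of ranked retrieval, the usual metric for cross-modal benchmarks such as Wikipedia. The sketch below scores one retrieval direction by ranking gallery items of the other modality with cosine similarity in the learned subspace; the similarity measure and scoring details are assumptions, since the abstract only reports the final numbers.

```python
# Sketch of a mean-average-precision (mAP) evaluation for cross-modal
# retrieval: each query ranks all gallery items of the other modality
# by cosine similarity in the learned subspace. The similarity choice
# is an assumption; the abstract only reports average accuracies.
import numpy as np

def mean_average_precision(query_feats, gallery_feats, q_labels, g_labels):
    q_labels, g_labels = np.asarray(q_labels), np.asarray(g_labels)
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T                       # cosine similarity matrix
    aps = []
    for i in range(len(q_labels)):
        order = np.argsort(-sims[i])     # best match first
        rel = (g_labels[order] == q_labels[i]).astype(float)
        if rel.sum() == 0:
            continue                     # no relevant items for this query
        prec_at_k = np.cumsum(rel) / (np.arange(len(rel)) + 1.0)
        aps.append(float((prec_at_k * rel).sum() / rel.sum()))
    return float(np.mean(aps))
```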
Keywords/Search Tags:Cross-modal retrieval, subspace learning, similarity measurement