
Research On Image-text Cross-modal Hashing Retrieval Method With Large Batch Training

Posted on: 2021-02-03    Degree: Master    Type: Thesis
Country: China    Candidate: Y Zhou    Full Text: PDF
GTID: 2428330614459253    Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet and multimedia technology, a large volume of multimedia data in different modalities has been generated, and cross-modal retrieval has become a research hotspot in the field of information retrieval. Cross-modal hashing methods can effectively establish comparison relationships between data of different modalities: they convert the data into fixed-length binary hash codes, so the similarity between two items can be obtained quickly by XOR-ing their hash codes bitwise and counting the differing bits. With the development of deep learning, more and more deep-learning-based cross-modal hashing methods have been proposed. However, most of these methods train their models with a small batch size. With a small batch, the loss function cannot produce a good gradient estimate because of the limited number of samples in each batch, which degrades the retrieval performance of the final trained model.

To address this problem, this thesis proposes a cross-modal hashing method based on large batch training, which uses large batches to obtain better gradient estimates. However, simply increasing the batch size makes training unstable and degrades the generalization performance of the model. To counter this, the method introduces orthogonal regularization, which stabilizes large batch training and improves the generalization ability of the model. In addition, to account for the discreteness of hash codes, the distance between the hash codes and the continuous features is added to the objective function, so that the hash codes represent the data more faithfully. Compared with several existing cross-modal hashing methods on two widely used cross-modal hashing datasets, the proposed method achieves better retrieval performance.

On the other hand, many cross-modal hashing methods only consider the inter-modal relationships between data and ignore the intra-modal relationships, which contain rich information that can increase the discriminability of the hash codes. To address this, building on the cross-modal hashing method proposed above, this thesis organizes the input data as quintuplets and trains the model with both intra-modal and inter-modal relationships. Experiments on three widely used cross-modal hashing datasets show that this further improves cross-modal retrieval performance. In addition, a sensitivity analysis of the model's hyperparameters clarifies how the relevant parameter settings affect model training and the experimental results.
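The comparison of fixed-length hash codes via a bitwise XOR described above corresponds to the Hamming distance. The following is a minimal illustrative sketch, not code from the thesis; the code length, variable names, and example values are assumptions.

```python
import numpy as np

def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
    """Count the bits that differ between two binary hash codes given as 0/1 arrays."""
    return int(np.count_nonzero(code_a != code_b))

def hamming_distance_packed(a: int, b: int) -> int:
    """Same comparison when each code is packed into an integer: XOR, then count set bits."""
    return bin(a ^ b).count("1")

# Hypothetical 8-bit codes produced by an image encoder and a text encoder.
img_code = np.array([1, 0, 1, 1, 0, 0, 1, 0])
txt_code = np.array([1, 0, 0, 1, 0, 1, 1, 0])
print(hamming_distance(img_code, txt_code))  # 2 -- a smaller distance means more similar
```

Because the distance reduces to an XOR followed by a bit count, retrieval over millions of hash codes is far cheaper than comparing continuous feature vectors.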
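Orthogonal regularization and the hash-code/feature distance term mentioned above can take several forms. A common formulation is sketched below in PyTorch purely as an assumption-laden illustration; the penalty forms, the weights lambda_orth and lambda_quant, and all variable names are hypothetical and are not claimed to be the thesis's exact objective.

```python
import torch

def orthogonal_penalty(weight: torch.Tensor) -> torch.Tensor:
    """Frobenius-norm penalty on how far W^T W deviates from the identity matrix."""
    gram = weight.t() @ weight
    identity = torch.eye(gram.shape[0], device=weight.device, dtype=weight.dtype)
    return ((gram - identity) ** 2).sum()

def quantization_penalty(features: torch.Tensor) -> torch.Tensor:
    """Distance between continuous features and their binarized codes (sign of each feature)."""
    codes = torch.sign(features)
    return ((features - codes) ** 2).mean()

# Hypothetical combined objective for one training step:
# loss = retrieval_loss \
#        + lambda_orth * sum(orthogonal_penalty(p) for p in model.parameters() if p.dim() == 2) \
#        + lambda_quant * (quantization_penalty(image_features) + quantization_penalty(text_features))
```

In this kind of setup, the orthogonality term discourages correlated weight directions, which is one way to keep large-batch optimization stable, while the quantization term keeps the continuous features close to the discrete codes they will be rounded to.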
Keywords/Search Tags:cross-modal hashing, large batch training, orthogonal regularization, quintuplet