
Semantic Transfer Hashing Based On Deep Learning For Cross-modal Retrieval

Posted on: 2019-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Feng
Full Text: PDF
GTID: 2428330572458940
Subject: Circuits and Systems
Abstract/Summary:
With the rapid development of the Internet, multimedia data appears on all kinds of platforms, including online shopping sites, social networking sites, search engines, and video websites, and the problem of retrieving such data has become increasingly prominent. Many methods have been developed for multimedia retrieval; among them, hashing has been widely studied and applied to cross-modal retrieval because of its low storage cost and fast query speed. Cross-modal retrieval is one of the important forms of multimedia retrieval: given a query in one modality (text, audio, images, etc.), it rapidly and precisely retrieves relevant data from another modality. Hashing methods map data from a high-dimensional space into Hamming space, replacing each original item with a compact binary code, so that cross-modal retrieval can be completed entirely in Hamming space.

Traditional cross-modal hashing methods map hand-crafted features into hash codes and then perform retrieval in Hamming space. Because hash-code learning is decoupled from feature extraction, the neighborhood relationships of the resulting codes do not match those of the original data, and the final retrieval quality is unsatisfactory. Deep learning methods merge feature extraction and hash-code learning into a single framework, which better preserves the neighborhood relationships of the original data and substantially improves retrieval results. However, two problems remain. First, existing deep-learning-based methods simply extract features and take the hash code from the last layer of the network, so the codes lack semantic information. Second, heterogeneous relations exist between the two modalities, and learning hash codes for both simultaneously inevitably produces different distributions, which degrades retrieval performance. In view of these problems, we
propose two methods: semantic hashing based on deep learning for cross-modal retrieval, and semantic transfer hashing based on deep learning for cross-modal retrieval.

On the one hand, we fuse feature extraction and hash-code learning into a single deep framework. Using the supervision of the training labels together with a self-supervised method, we train a neural network that learns semantic hash codes and semantic features. These learned codes and features help train the deep cross-modal hashing framework, and we effectively embed semantic information into the model through a prediction cross-entropy loss and a pairwise loss. On the other hand, we use domain adaptation to further alleviate the heterogeneous semantic gap across modalities: embedding semantic information already improves retrieval performance, but the modality-gap problem still needs to be addressed. The semantic features share the same neighborhood relationships as the image and text features, so their distributions are related. The semantic and text features have more similar distributions, because the corresponding networks are similar; the semantic and image feature distributions may differ, but they still share the same neighborhood relationships. We therefore treat the semantic features as the source domain and the image and text features as two target domains, normalize the image, text, and semantic features so that they become more comparable, and finally apply domain adaptation to reduce the differences between their distributions, effectively alleviating the cross-modal semantic gap. Extensive experiments on the NUS-WIDE, IAPR TC-12, and MIRFLICKR-25K datasets show that our methods outperform competing approaches, including other deep-learning-based methods, demonstrating their effectiveness.
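The core retrieval mechanism described above can be illustrated with a minimal sketch. The random projections below merely stand in for the learned per-modality deep hashing networks of the thesis (which are not reproduced here), and all names, dimensions, and data are hypothetical; the sketch only shows how real-valued features are mapped to binary codes and how a cross-modal query is answered by Hamming distance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy features: 5 image vectors and 5 text vectors for the same 5 items,
# living in different feature spaces (dimensions chosen arbitrarily).
img_feats = rng.normal(size=(5, 64))
txt_feats = rng.normal(size=(5, 32))

CODE_BITS = 16  # length of the binary hash code

# Hypothetical fixed projections standing in for the trained networks.
W_img = rng.normal(size=(64, CODE_BITS))
W_txt = rng.normal(size=(32, CODE_BITS))

def to_code(feats, W):
    """Map real-valued features to {0,1} codes via the sign of a projection."""
    return (feats @ W > 0).astype(np.uint8)

img_codes = to_code(img_feats, W_img)
txt_codes = to_code(txt_feats, W_txt)

def hamming(a, b):
    """Hamming distance between two binary code vectors."""
    return int(np.count_nonzero(a != b))

# Cross-modal query: given one text code, rank all images by Hamming distance.
query = txt_codes[0]
ranking = sorted(range(len(img_codes)), key=lambda i: hamming(query, img_codes[i]))
print(ranking)
```

Because both modalities are mapped into the same Hamming space, the distance computation reduces to cheap bit comparisons, which is what gives hashing its low storage cost and fast query speed.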
Keywords/Search Tags:cross-modal retrieval, learning to hash, deep learning, semantic information, domain adaptation