Cross-media Retrieval Between Images And Texts

Posted on: 2018-10-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Jia
Full Text: PDF
GTID: 2428330623450607
Subject: Control Science and Engineering
Abstract/Summary:
With the rapid development of the internet, information technology, and digital technology, and the widespread use of many types of sensors, cross-media data comprising images, text, audio, and video is growing rapidly. Cross-media data, especially the widespread coexistence of images and text, provides a basis both for solving cross-media retrieval problems in the short term and, in the long term, for enabling computers to communicate with humans in human language. Motivated by the new requirements of information retrieval in the era of cross-media big data, this thesis explores solutions to cross-media retrieval tasks. A data reduction algorithm based on locality-sensitive hashing and neural networks is proposed to reduce the proportion of data that is irrelevant to the query. Two cross-media retrieval methods are then proposed, one building on traditional methods and one on deep learning. First, building on embedding methods, we propose multi-label kernel canonical correlation analysis (ml-KCCA), a novel approach to cross-modal retrieval that extends kernel CCA with the high-level semantic information reflected in multi-label annotations. Second, based on the view that the image feature maps extracted by a convolutional model can be regarded as features of the semantic segments of an image, a fine-grained convolution fusion network (CFN) is proposed. The main contributions of this thesis can be summarized as follows:

(1) Data sets often contain a large amount of content that is completely irrelevant to the query. To tackle this problem, a data reduction algorithm based on locality-sensitive hashing and a neural network is proposed, which significantly increases the proportion of relevant documents in the data set.

(2) Building on traditional methods, ml-KCCA is proposed. It kernelizes the extraction of correlations from multi-label information, so that more complex non-linear correlations between modalities can be measured and a discriminative subspace better suited to cross-media retrieval tasks can be learned.

(3) In most deep-learning-based cross-media retrieval methods, text and images are modeled completely independently, with no interaction until the final representations have been generated and passed to correlation analysis. To address this, CFN is proposed. By allowing independent, specialized fragment-level feature representations to be used for each modality (image or text), the method can flexibly interlink the intermediate fragment features to generate a joint abstraction of the two modalities, which yields better matching scores.
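The abstract does not say which hashing scheme or network the data reduction step uses, so the following is only a minimal sketch of the general idea behind contribution (1), assuming random-hyperplane (sign-of-projection) locality-sensitive hashing over feature vectors. The function names `lsh_codes` and `reduce_candidates`, the number of bits, and the Hamming threshold are hypothetical choices for illustration, and the neural network component is omitted entirely.

```python
import numpy as np


def lsh_codes(vectors, hyperplanes):
    """Binary codes from the sign of each projection (random-hyperplane LSH)."""
    return (vectors @ hyperplanes.T) >= 0


def reduce_candidates(query, database, hyperplanes, max_hamming=2):
    """Indices of database rows whose code differs from the query's code
    in at most max_hamming bits. Everything else is discarded before any
    more expensive cross-media matching is run."""
    q_code = lsh_codes(query[None, :], hyperplanes)[0]
    db_codes = lsh_codes(database, hyperplanes)
    hamming = np.count_nonzero(db_codes != q_code, axis=1)
    return np.flatnonzero(hamming <= max_hamming)


# Toy data (assumed, not from the thesis): three 32-dimensional items.
rng = np.random.default_rng(0)
dim, n_bits = 32, 16
hyperplanes = rng.standard_normal((n_bits, dim))
query = rng.standard_normal(dim)
database = np.vstack([
    query,        # same direction as the query
    -query,       # opposite direction: every hash bit flips
    2.0 * query,  # same direction, different scale
])

kept = reduce_candidates(query, database, hyperplanes)
print(kept.tolist())
```

Because the hash depends only on the sign of each projection, the items pointing in the query's direction are kept and the opposite one is dropped. In a real pipeline the feature vectors would come from the image and text encoders, and the threshold would be tuned to trade recall against the size of the reduced set.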
Keywords/Search Tags: Cross-media retrieval, Data reduction, KCCA, Deep learning, CFN