Matrix Factorization And Adversarial Learning For Multimedia Retrieval

Posted on: 2022-07-11   Degree: Master   Type: Thesis
Country: China   Candidate: S Y He   Full Text: PDF
GTID: 2518306524489774   Subject: Master of Engineering
Abstract/Summary:
Multimedia data in various forms (e.g. image, text, video and audio) have grown explosively with the development of big data in recent years. Accurate and efficient retrieval over large-scale multimedia data has consequently become both an active research topic and a difficult problem. In this thesis, several methods are proposed to improve retrieval accuracy for large-scale multimedia retrieval. These methods fall into two groups: single-modal retrieval and cross-modal retrieval.

1) Single-modal retrieval in this thesis focuses on image samples: the goal is to retrieve images similar to a target image from a massive image dataset by approximate nearest neighbor search. We propose a novel unsupervised hashing scheme for image retrieval, termed Bidirectional Discrete Matrix Factorization Hashing (BDMFH), which alternates between two mutually reinforcing processes: a) learning binary codes from data; b) recovering data from binary codes. In particular, we design an inverse factorization model that forces the learned binary codes to inherit the intrinsic structure of the original visual data. Comprehensive experimental results on three large-scale benchmark datasets show that the proposed BDMFH not only significantly outperforms state-of-the-art methods but also provides satisfactory computational efficiency.

2) Cross-modal retrieval aims to enable flexible retrieval across different modalities. The core of cross-modal retrieval is to learn projections for the different modalities so that instances in the learned common subspace are comparable to each other. In this thesis, we present two novel methods based on adversarial learning for cross-modal retrieval, called Self-Supervised Adversarial Learning (SSAL) and Category Alignment Adversarial Learning (CAAL). SSAL combines self-supervised learning and adversarial learning to seek an effective common subspace. CAAL aims to find a common representation space, supervised by category information, in which samples from different modalities can be compared directly. Comprehensive experimental results on several widely used benchmark datasets show that the proposed methods are superior in cross-modal retrieval and significantly outperform existing cross-modal retrieval methods.
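
The alternation described for the hashing approach (learning binary codes from data, then recovering data from the codes) can be illustrated with a small sketch. This is not the thesis's BDMFH algorithm, whose objective and update rules are not given in the abstract. It is a generic example under an assumed objective, ||B - XP||^2 + lam * ||X - BQ||^2 with B restricted to {-1, +1}, solved by ridge regression for P and Q and a bit-by-bit sign update for B. All names (`train_hash`, `encode`, `hamming_distances`, `lam`, `mu`) are hypothetical.

```python
import numpy as np


def sign_pm1(M):
    """Map real values to {-1, +1}, sending 0 to +1."""
    return np.where(M >= 0, 1.0, -1.0)


def train_hash(X, n_bits=16, lam=1.0, mu=1e-3, n_iter=10, seed=0):
    """Alternate between a data-to-code map P and a code-to-data map Q.

    Assumed objective (illustrative only, not taken from the thesis):
        ||B - Xc P||^2 + lam * ||Xc - B Q||^2,  B in {-1, +1}^(n x n_bits)
    """
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    Xc = X - mean                                   # centered data, n x d
    d = Xc.shape[1]
    # Initialize the codes with a random projection.
    B = sign_pm1(Xc @ rng.standard_normal((d, n_bits)))
    G = Xc.T @ Xc + mu * np.eye(d)                  # regularized Gram matrix
    for _ in range(n_iter):
        # Forward step: ridge regression from data to codes.
        P = np.linalg.solve(G, Xc.T @ B)            # d x n_bits
        # Inverse step: ridge regression from codes back to data.
        Q = np.linalg.solve(B.T @ B + mu * np.eye(n_bits), B.T @ Xc)  # n_bits x d
        XP = Xc @ P
        # Discrete step: update one bit column at a time, others held fixed.
        for k in range(n_bits):
            # Reconstruction residual with bit k's contribution removed.
            R = Xc - B @ Q + np.outer(B[:, k], Q[k])
            B[:, k] = sign_pm1(XP[:, k] + lam * (R @ Q[k]))
    # Refit P so that it matches the final codes.
    P = np.linalg.solve(G, Xc.T @ B)
    return mean, P, B


def encode(X, mean, P):
    """Hash new samples with the learned linear map."""
    return sign_pm1((X - mean) @ P)


def hamming_distances(query_code, db_codes):
    """Number of differing bits between one code and every database code."""
    return np.count_nonzero(db_codes != query_code, axis=1)
```

A query would be answered by encoding it with `encode`, computing `hamming_distances` against the encoded database, and sorting the result with `np.argsort`. The bit-wise update is exact for the assumed objective because the quadratic term in each bit column is constant when its entries are restricted to {-1, +1}.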
Keywords/Search Tags: Large-scale multimedia retrieval, Matrix factorization, Hashing, Adversarial learning, Self-supervised learning, Modality gap