
Research On Multi-modal Learning Based On Shared Subspace

Posted on: 2021-01-28 | Degree: Master | Type: Thesis
Country: China | Candidate: C Min | Full Text: PDF
GTID: 2518306467976409 | Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, high-dimensional data are generated in every field at every moment, and we are entering the era of big data. Advances in acquiring multi-modal data, together with the ability of such data to describe the same object from several perspectives, have made multi-modal data increasingly common in the real world. Compared with single-modal data, multi-modal data complement one another and can improve the accuracy of machine learning models, so multi-modal learning is attracting growing attention from researchers. By collecting data that describe the same object in different modalities, we can record that object in richer detail. Real-world data often exist in multiple modalities whose information is complementary and therefore more expressive. Multi-modal data typically have heterogeneous low-level features but correlated high-level semantics, and effectively integrating the information from different modalities is the key problem of multi-modal learning. In this thesis, from the perspective of shared subspace learning, we study shared subspace representations of multi-modal data and fuse multi-modal information to address problems such as the "modality gap", modality-specific importance, and missing modality instances. The main contributions are as follows:

(1) To address the "modality gap" in multi-modal data, we propose a deep adversarial unsupervised cross-modal hashing retrieval algorithm (Deep Unsupervised Adversarial Cross-modal Hashing, DUACH). DUACH fuses multi-modal data into a shared subspace through adversarial learning: an inter-modal discriminator makes the hidden-layer representation distributions of the different modalities as consistent as possible, while two intra-modal discriminators preserve the global and local structure of the image and text modalities. DUACH therefore not only fuses multi-modal information but also retains the local substructure of the multi-modal data (a hedged code sketch of the adversarial fusion idea follows the abstract). Cross-modal retrieval experiments on several image-text bimodal datasets verify the effectiveness of the proposed model.

(2) To handle partial multi-modal data, we propose an adaptive partial multi-modal shared subspace learning model (Partial Multi-view Clustering via Auto-weighted Similarity Completion, PMVC-ASC). PMVC-ASC learns a consistent similarity matrix from the complete multi-modal samples; the weights of the different modalities require almost no extra parameters and are updated automatically during the iterations. The similarity entries corresponding to missing instances are then completed by mining the intrinsic association between the missing and the complete data, and the samples are partitioned into clusters by exploiting the high-order correlations of the graph through a graph model (a sketch of this auto-weighted completion idea also follows the abstract). An iterative optimization algorithm is proposed to solve the resulting partial multi-modal model. Experiments on several datasets show that PMVC-ASC clusters better than state-of-the-art partial multi-modal clustering algorithms.
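The following is a minimal sketch of the adversarial inter-modal alignment idea behind contribution (1), not the thesis's actual DUACH network: the two intra-modal discriminators are omitted, and all module names, layer sizes, feature dimensions, and learning rates are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps one modality's features into the shared subspace (hypothetical sizes)."""
    def __init__(self, in_dim, hid_dim, code_len):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, code_len), nn.Tanh())  # in [-1, 1]; binarised by sign at retrieval time
    def forward(self, x):
        return self.net(x)

class ModalDiscriminator(nn.Module):
    """Inter-modal discriminator: guesses whether a code came from the image or the text encoder."""
    def __init__(self, code_len):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(code_len, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid())
    def forward(self, z):
        return self.net(z)

# Assumed feature dimensions (e.g. CNN features for images, bag-of-words for text).
img_enc, txt_enc = Encoder(4096, 512, 64), Encoder(1386, 512, 64)
disc = ModalDiscriminator(64)
opt_g = torch.optim.Adam(list(img_enc.parameters()) + list(txt_enc.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.BCELoss()

def train_step(img_feat, txt_feat):
    z_img, z_txt = img_enc(img_feat), txt_enc(txt_feat)
    # 1) The discriminator learns to separate the two modalities (image = 1, text = 0).
    d_loss = bce(disc(z_img.detach()), torch.ones(len(z_img), 1)) + \
             bce(disc(z_txt.detach()), torch.zeros(len(z_txt), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) The encoders try to fool it, pulling both representation distributions
    #    toward each other so the shared subspace becomes modality-consistent.
    g_loss = bce(disc(z_txt), torch.ones(len(z_txt), 1)) + \
             bce(disc(z_img), torch.zeros(len(z_img), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return torch.sign(z_img), torch.sign(z_txt)  # binary hash codes for retrieval
```

In practice such a sketch would be trained alongside reconstruction or structure-preserving losses (the role of the intra-modal discriminators in the abstract); here only the alignment term is shown.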
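For contribution (2), the sketch below shows one reading of auto-weighted similarity completion: each modality's weight is set inversely proportional to its disagreement with the current consensus similarity matrix, so no trade-off parameter has to be tuned, and missing similarity entries are filled from the consensus. The interface (per-view similarity matrices plus observation masks), the weighting formula, and the use of off-the-shelf spectral clustering are all assumptions, not the thesis's exact PMVC-ASC formulation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def consensus_similarity(views, masks, n_iter=20, eps=1e-8):
    """views: list of (n, n) per-modal similarity matrices.
    masks: list of (n, n) boolean arrays, True where both samples are observed
    in that modality (hypothetical interface)."""
    # Initialise the consensus from the observed entries only.
    S = np.mean([np.where(m, v, 0.0) for v, m in zip(views, masks)], axis=0)
    w = np.ones(len(views)) / len(views)
    for _ in range(n_iter):
        # Complete each modality's missing block from the current consensus.
        filled = [np.where(m, v, S) for v, m in zip(views, masks)]
        # Auto-weighting: a modality that agrees more with the consensus gets a
        # larger weight; the weights need no extra trade-off parameter.
        res = np.array([np.linalg.norm(f - S) for f in filled])
        w = 1.0 / (2.0 * res + eps)
        w = w / w.sum()
        # Update the consensus as the weighted combination of the completed views.
        S = sum(wi * f for wi, f in zip(w, filled))
    return S, w

# Usage sketch: cluster on the completed consensus graph (assumed symmetric, non-negative).
# S, w = consensus_similarity(view_sims, view_masks)
# labels = SpectralClustering(n_clusters=5, affinity="precomputed").fit_predict(S)
```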
Keywords/Search Tags:multi-modal learning, shared subspace learning, multi-modal clustering, cross-modal retrieval