
Research On Multi-modal Learning Based On Shared Subspace

Posted on: 2021-01-28 | Degree: Master | Type: Thesis
Country: China | Candidate: C Min | Full Text: PDF
GTID: 2518306467976409 | Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, high-dimensional data are generated in every field at every moment, and we are entering the era of big data. Advances in acquiring multi-modal data, together with the ability of such data to describe the same object from several perspectives, have made multi-modal data increasingly common in the real world. Compared with single-modal data, multi-modal data complement one another and can improve the accuracy of machine learning models, so multi-modal learning is attracting growing attention from researchers. By collecting data that describe the same object in different modalities, we can record that object in richer detail. Real-world data often exist in multiple modalities whose information is complementary and therefore more expressive. Multi-modal data typically have heterogeneous low-level features but correlated high-level semantics, and effectively integrating the information from different modalities is the key problem of multi-modal learning. In this thesis, from the perspective of shared subspace learning, we study shared subspace representations of multi-modal data and fuse multi-modal information to address problems such as the "modality gap", modality-specific importance, and missing modality instances. The main contributions are as follows:

(1) To address the "modality gap" in multi-modal data, we propose a deep adversarial unsupervised cross-modal hashing retrieval algorithm (Deep Unsupervised Adversarial Cross-modal Hashing, DUACH). DUACH fuses multi-modal data into a shared subspace through adversarial learning: an inter-modal discriminator makes the hidden-layer representation distributions of the different modalities as consistent as possible, while two intra-modal discriminators preserve the global and local structure of the image and text modalities. DUACH therefore not only fuses multi-modal information but also retains the local substructure of the multi-modal data (a hedged code sketch of the adversarial fusion idea follows the abstract). Cross-modal retrieval experiments on several image-text bimodal datasets verify the effectiveness of the proposed model.

(2) To handle partial multi-modal data, we propose an adaptive partial multi-modal shared subspace learning model (Partial Multi-view Clustering via Auto-weighted Similarity Completion, PMVC-ASC). PMVC-ASC learns a consistent similarity matrix from the complete multi-modal samples; the weights of the different modalities require almost no extra parameters and are updated automatically during the iterations. The similarity entries corresponding to missing instances are then completed by mining the intrinsic association between the missing and the complete data, and the samples are partitioned into clusters by exploiting the high-order correlations of the graph through a graph model (a sketch of this auto-weighted completion idea also follows the abstract). An iterative optimization algorithm is proposed to solve the resulting partial multi-modal model. Experiments on several datasets show that PMVC-ASC clusters better than state-of-the-art partial multi-modal clustering algorithms.
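The following is a minimal sketch of the adversarial inter-modal alignment idea behind contribution (1), not the thesis's actual DUACH network: the two intra-modal discriminators are omitted, and all module names, layer sizes, feature dimensions, and learning rates are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps one modality's features into the shared subspace (hypothetical sizes)."""
    def __init__(self, in_dim, hid_dim, code_len):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, code_len), nn.Tanh())  # in [-1, 1]; binarised by sign at retrieval time
    def forward(self, x):
        return self.net(x)

class ModalDiscriminator(nn.Module):
    """Inter-modal discriminator: guesses whether a code came from the image or the text encoder."""
    def __init__(self, code_len):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(code_len, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid())
    def forward(self, z):
        return self.net(z)

# Assumed feature dimensions (e.g. CNN features for images, bag-of-words for text).
img_enc, txt_enc = Encoder(4096, 512, 64), Encoder(1386, 512, 64)
disc = ModalDiscriminator(64)
opt_g = torch.optim.Adam(list(img_enc.parameters()) + list(txt_enc.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.BCELoss()

def train_step(img_feat, txt_feat):
    z_img, z_txt = img_enc(img_feat), txt_enc(txt_feat)
    # 1) The discriminator learns to separate the two modalities (image = 1, text = 0).
    d_loss = bce(disc(z_img.detach()), torch.ones(len(z_img), 1)) + \
             bce(disc(z_txt.detach()), torch.zeros(len(z_txt), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) The encoders try to fool it, pulling both representation distributions
    #    toward each other so the shared subspace becomes modality-consistent.
    g_loss = bce(disc(z_txt), torch.ones(len(z_txt), 1)) + \
             bce(disc(z_img), torch.zeros(len(z_img), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return torch.sign(z_img), torch.sign(z_txt)  # binary hash codes for retrieval
```

In practice such a sketch would be trained alongside reconstruction or structure-preserving losses (the role of the intra-modal discriminators in the abstract); here only the alignment term is shown.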
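For contribution (2), the sketch below shows one reading of auto-weighted similarity completion: each modality's weight is set inversely proportional to its disagreement with the current consensus similarity matrix, so no trade-off parameter has to be tuned, and missing similarity entries are filled from the consensus. The interface (per-view similarity matrices plus observation masks), the weighting formula, and the use of off-the-shelf spectral clustering are all assumptions, not the thesis's exact PMVC-ASC formulation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def consensus_similarity(views, masks, n_iter=20, eps=1e-8):
    """views: list of (n, n) per-modal similarity matrices.
    masks: list of (n, n) boolean arrays, True where both samples are observed
    in that modality (hypothetical interface)."""
    # Initialise the consensus from the observed entries only.
    S = np.mean([np.where(m, v, 0.0) for v, m in zip(views, masks)], axis=0)
    w = np.ones(len(views)) / len(views)
    for _ in range(n_iter):
        # Complete each modality's missing block from the current consensus.
        filled = [np.where(m, v, S) for v, m in zip(views, masks)]
        # Auto-weighting: a modality that agrees more with the consensus gets a
        # larger weight; the weights need no extra trade-off parameter.
        res = np.array([np.linalg.norm(f - S) for f in filled])
        w = 1.0 / (2.0 * res + eps)
        w = w / w.sum()
        # Update the consensus as the weighted combination of the completed views.
        S = sum(wi * f for wi, f in zip(w, filled))
    return S, w

# Usage sketch: cluster on the completed consensus graph (assumed symmetric, non-negative).
# S, w = consensus_similarity(view_sims, view_masks)
# labels = SpectralClustering(n_clusters=5, affinity="precomputed").fit_predict(S)
```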
Keywords/Search Tags:multi-modal learning, shared subspace learning, multi-modal clustering, cross-modal retrieval