Study Of Multi-view Embedding Learning Techniques With Applications

Posted on: 2018-05-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X B Shen
GTID: 1318330542990546
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Recent years have witnessed explosive growth in multi-view data, driven by the rapid development of data collection and storage techniques. Multi-view data consists of multiple views, each characterizing the object from one perspective, so it can reflect distinctive properties of the same object. Although multi-view data provides richer information than single-view data, analyzing it effectively remains an active and challenging research topic. Multi-view data often lies in a high-dimensional space, and its views are highly correlated. It is therefore important to explore the latent low-dimensional embedding shared by multiple views, which both improves the ability of learning models and reduces the computational complexity of learning algorithms. Based on these considerations, this thesis focuses on multi-view data and studies multi-view embedding techniques. By incorporating hashing techniques, we develop several multi-view learning methods and show their applications to real-world tasks such as large-scale visual retrieval, image classification, and automatic annotation. The main contributions of this thesis are as follows:

(1) Most cross-view hashing methods assume that data from different views are well paired, e.g., text-image pairs. In real-world applications, however, this fully-paired multi-view setting may not be practical. We study a more practical yet challenging semi-paired cross-view retrieval problem, where pairwise correspondences are only partially provided. The proposed Semi-paired Discrete Hashing (SPDH) explores the underlying structure of the constructed common latent subspace, where both paired and unpaired samples are well aligned. To effectively preserve the similarities of semi-paired data in the latent subspace, we construct the cross-view similarity graph with the help of anchor data pairs. SPDH jointly learns the latent features and hash codes with a factorization-based coding scheme. For the formulated objective function, we devise an efficient alternating optimization algorithm, in which the key binary code learning problem is solved bit by bit, with each bit obtained from a closed-form solution. Extensive evaluations in fully-paired and semi-paired settings demonstrate the effectiveness of SPDH in large-scale cross-view retrieval.

(2) Efficiently integrating multiple views to learn compact hash codes remains challenging. We propose a novel unsupervised hashing method, dubbed multi-view discrete hashing (MvDH), that effectively exploits multi-view data. Specifically, MvDH performs matrix factorization to generate the hash codes as the latent representations shared by multiple views, while spectral clustering is performed simultaneously. The joint learning of hash codes and cluster labels enables MvDH to generate more discriminative hash codes, which are optimal for classification. Considering the discrete nature of hashing, the key binary code learning problem is solved bit by bit, with each bit obtained from a closed-form solution. Extensive experiments demonstrate the superiority of the proposed method in terms of both accuracy and scalability.

(3) Most embedding methods ignore the correlations between the input and output, so their learned embeddings are not well aligned, which degrades prediction performance. We formulate multi-label learning from the perspective of cross-view learning in order to explore the correlations between the input and output. The proposed method, named Co-Embedding (CoE), first regards the input and output as two views, then jointly learns the semantic common subspace and the view-specific mappings within one framework. The semantic similarity structure among the embeddings is further preserved, ensuring that close embeddings share similar labels. CoE conducts multi-label prediction based on cross-view kNN search among the learned embeddings, which significantly reduces computational cost compared to conventional decoding schemes. Moreover, based on CoE, a hashing-based model, Co-Hashing (CoH), is proposed, which imposes a binary constraint on the continuous latent embeddings. CoH aims to generate compact binary representations and improves prediction efficiency, benefiting from the efficient kNN search of multiple labels in the Hamming space. Extensive experiments demonstrate the superiority of the proposed methods in terms of both prediction accuracy and efficiency.

(4) We first propose a unified multiset canonical correlation analysis framework based on graph embedding for dimensionality reduction (GbMCC-DR). Based on the equivalent formulation of MCCA, GbMCC-DR is able to characterize the correlation from the perspective of graph structure. Three novel supervised correlation analysis methods are developed under GbMCC-DR by introducing several supervised graphs. We theoretically show that GbMCC-DR unifies several existing methods, such as MCCA, PLS, and LPCCA. Furthermore, we propose a semi-supervised canonical correlation analysis based on label propagation (LPbSCCA) for the multi-view semi-supervised setting. LPbSCCA uses a sparse-representation-based label propagation scheme to infer label information for unlabeled data, constructs probabilistic within-view within-class scatter matrices and inter-view correlations, and finally establishes the model by simultaneously maximizing the inter-view correlations and minimizing the within-class variances. A general method called LPbSMCCA is developed to deal with data with an arbitrary number of views. Extensive experiments on several datasets demonstrate the effectiveness of the proposed GbMCC-DR and LPbSCCA.
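As a rough illustration of the Hamming-space kNN prediction mentioned in contribution (3), the sketch below predicts label sets by voting over the nearest binary codes. It is not the thesis's CoH implementation: the function name `hamming_knn_predict`, the parameters `k` and `threshold`, and the majority-vote decision rule are all assumptions made for this example, and the hash codes are taken as given rather than learned.

```python
import numpy as np

def hamming_knn_predict(query_codes, db_codes, db_labels, k=3, threshold=0.5):
    """Hypothetical helper: multi-label prediction by kNN voting in Hamming space.

    query_codes: (q, b) array of 0/1 bits for the queries
    db_codes:    (n, b) array of 0/1 bits for the labeled database items
    db_labels:   (n, L) binary label matrix for the database items
    """
    query_codes = np.asarray(query_codes, dtype=np.uint8)
    db_codes = np.asarray(db_codes, dtype=np.uint8)
    db_labels = np.asarray(db_labels)

    # Hamming distance = number of differing bits (XOR, then count).
    dists = (query_codes[:, None, :] ^ db_codes[None, :, :]).sum(axis=2)

    # Indices of the k nearest database codes; a stable sort keeps ties deterministic.
    nn_idx = np.argsort(dists, axis=1, kind="stable")[:, :k]

    # Assumed decision rule: keep a label if at least `threshold` of the neighbors carry it.
    votes = db_labels[nn_idx].mean(axis=1)
    return (votes >= threshold).astype(int)
```

In practice the bits would usually be packed into machine words so that XOR and popcount run on whole words; the unpacked form here is kept only for readability.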
Keywords/Search Tags: Machine Learning, Multi-view Learning, Multi-view Embedding, Hash Code Learning, Canonical Correlation Analysis, Multimedia Retrieval, Nearest Neighbor Search, Multi-label Prediction