
Research On Multi-modal Similarity Learning

Posted on: 2018-01-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X J Gao
Full Text: PDF
GTID: 1318330542461941
Subject: Signal and Information Processing

Abstract/Summary:
Similarity learning is a class of machine learning methods that aims to compute the relevance between objects automatically and accurately. It has been widely used in artificial intelligence applications such as information retrieval, multimedia, and knowledge graph completion. However, conventional similarity learning algorithms are usually based on a single modality, which limits their expressive power and the interpretability of their parameters. This thesis addresses these problems from two perspectives. From the input data point of view, it proposes an unsupervised representation learning algorithm. From the similarity modelling point of view, it proposes supervised multi-modal similarity learning algorithms. Specifically:

(1) From the input data point of view, a novel local-voting-based multi-view embedding algorithm is first proposed to mine the complementary information within multi-view features while preserving local structures.

(2) Also from the input data point of view, the multi-view embedding is then used to guide conventional deep learning algorithms in further representation learning. This not only adds physical or statistical information about the given data to conventional deep learning algorithms but also prevents them from overfitting.

(3) From the model point of view, the proposed method defines different relation vectors. Each relation vector represents one perspective of the given data, and the relevance between given objects is computed under the different relations. Moreover, the flat relation structure is extended to a hierarchical structure by taking the semantic structure into account, which enhances the interpretability of the parameters.

(4) Also from the model point of view, an attention mechanism and external memory are added to further improve the performance of similarity learning. The attention mechanism, which can shift its focus to more relevant content according to different requirements, resembles the way the human perceptual system pays attention. The external memory, which can use previously learned information to help solve the current task, resembles the way humans store memories during learning.

(5) Finally, the unsupervised representation learning and the supervised similarity learning are combined in a deep architecture, and all parameters are fine-tuned for better global results.

The effectiveness of the proposed methods is demonstrated through evaluations on different image retrieval tasks and comparisons against various state-of-the-art algorithms in the field. Output examples show the capability of the proposed methods and indicate some relation between machine intelligence and human visual understanding.
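
As a rough illustration of items (3) and (4), the sketch below scores a pair of feature vectors under several relation vectors and combines the per-relation scores with attention weights. It is a minimal sketch under stated assumptions, not the formulation used in the thesis: the diagonal bilinear similarity, the dot-product attention score, and all names (relation_similarity, attended_similarity, the toy vectors) are hypothetical choices made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_similarity(x, y, relation):
    """Similarity of x and y under one relation vector.

    Assumption: a diagonal bilinear form, sum_i relation[i] * x[i] * y[i],
    so each relation vector re-weights the feature dimensions.
    """
    return sum(r * a * b for r, a, b in zip(relation, x, y))


def attended_similarity(query, candidate, relations):
    """Combine per-relation similarities using attention weights.

    Assumption: the attention score of a relation is its dot product
    with the query, normalised across relations by a softmax.
    """
    attention_scores = [sum(r * q for r, q in zip(rel, query)) for rel in relations]
    weights = softmax(attention_scores)
    per_relation = [relation_similarity(query, candidate, rel) for rel in relations]
    score = sum(w * s for w, s in zip(weights, per_relation))
    return score, weights


# Toy data: two hypothetical relations, each focusing on different dimensions.
relations = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
query = [0.9, 0.1, 0.2, 0.1]
candidate = [0.8, 0.3, 0.1, 0.0]

score, weights = attended_similarity(query, candidate, relations)
print("attention weights:", weights)
print("combined similarity:", score)
```

In this toy setup the query is strongest in the first dimension, so the first relation receives the larger attention weight and contributes most to the combined score. In a trained model the relation vectors and attention parameters would be learned rather than fixed by hand.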
Keywords/Search Tags: Multi-modal Similarity Learning, Representation Learning, Hierarchical Semantic Relation, Attention Mechanism, External Memory, Deep Learning