Research On Key Technologies For Zero-Shot Learning

Posted on: 2019-07-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y N Li
Full Text: PDF
GTID: 1368330548477404
Subject: Computer Science and Technology

Abstract/Summary:
Supervised learning, one of the most basic tasks in machine learning, has attracted wide attention in research fields such as computer vision, natural language processing, and speech recognition. With the development of deep learning in recent years, its performance has improved tremendously, especially in object recognition, where performance on some benchmarks now exceeds human recognition ability. In general, however, supervised learning techniques require hundreds or even thousands of labeled training samples to be collected in advance for each target class, which seriously hinders their further development. To address this issue, many machine learning approaches characterized by limited training data have been developed and studied. One motivation behind these types of learning is that they move machine learning closer to the kind of learning humans are capable of, making small but significant steps towards artificial intelligence.

Zero-shot learning plays a significant role in pursuing this challenging goal. It aims to solve learning tasks whose classes (known as unseen classes) have no labeled training data at all, thereby giving a machine learning system the ability to learn continuously. Zero-shot learning is therefore becoming a hot topic in many research fields. For these reasons, this thesis studies the fundamental key technologies of zero-shot learning, using visual recognition as the verification task. After analyzing these key technologies in depth, the thesis addresses several serious challenges facing current zero-shot learning, such as the knowledge transfer mechanism and the domain shift problem, and thereby greatly improves its performance. The main contributions of this thesis are as follows:

1. For the first time, this thesis summarizes the current progress of zero-shot learning from a technical perspective, compares the similarities and differences between zero-shot learning and other machine learning techniques, and formally describes the basic technical route of zero-shot learning: transferring knowledge from seen classes to unseen classes through a shared semantic embedding space. Accordingly, four key technologies are identified: image semantic feature extraction, semantic embedding space construction, the visual-semantic mapping model, and unseen class label prediction. Problems such as the theoretical explanation of the knowledge transfer mechanism, the domain shift problem, and manifold defects in the semantic embedding space still need to be fully explored. Understanding and then solving these problems provides important guidance for designing new models and algorithms.

2. To address the problems of knowledge transfer and domain shift, a general inductive zero-shot learning algorithm based on relational knowledge transfer is proposed. From the perspective of spatial geometry, the method reveals the role that relational knowledge between seen and unseen classes on the manifold plays in the knowledge transfer mechanism. For the first time, knowledge is transferred in the reverse direction, from the semantic embedding space to the image feature space, in order to generate virtual data and restore the missing manifold of the unseen classes. In addition to being simple and general, the method is also remarkably efficient, as shown by results on multiple real-world datasets.

3. To address manifold defects in the semantic embedding space, a transductive zero-shot learning algorithm based on manifold alignment is proposed. In essence, the goal of the visual-semantic mapping is, to a certain extent, to align the manifolds of the image feature space and the semantic embedding space. At the same time, a more consistent manifold in the semantic embedding space improves generalization to unseen classes. By exploiting the local manifold structure of the test data and alternately optimizing the visual-semantic mapping and the semantic embedding space, the proposed algorithm gradually achieves manifold alignment and thereby improves zero-shot learning performance step by step. Experimental results on real-world datasets show that this method has great advantages in computation speed, scalability, and performance.
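
The basic technical route named in the abstract (a visual-semantic mapping learned on seen classes, followed by unseen class label prediction in a shared semantic embedding space) can be illustrated with a toy sketch. This is not the thesis's algorithm: it is a generic linear-mapping baseline, and every class name, attribute vector, feature value, and function name below is invented for illustration.

```python
# Toy sketch of the generic zero-shot route: map image features into a
# semantic (attribute) space, then label by the nearest unseen-class embedding.
# All names and numbers here are hypothetical, not taken from the thesis.

# Semantic embeddings as attribute vectors: [striped, large]
SEEN = {"cat": [1.0, 0.0], "horse": [0.0, 1.0]}
UNSEEN = {"zebra": [1.0, 1.0], "mouse": [0.0, 0.0]}

# Hypothetical 3-D image features for labeled samples of the seen classes.
TRAIN = [
    ([1.0, 0.0, 0.5], "cat"),
    ([0.9, 0.1, 0.5], "cat"),
    ([0.0, 1.0, 0.5], "horse"),
    ([0.1, 0.9, 0.5], "horse"),
]


def apply_map(W, x):
    """Project a visual feature x into semantic space (computes W @ x)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def fit_visual_semantic_map(train, class_embeddings, lr=0.1, steps=500):
    """Fit a linear map W by gradient descent on mean squared error."""
    dim_sem = len(next(iter(class_embeddings.values())))
    dim_vis = len(train[0][0])
    W = [[0.0] * dim_vis for _ in range(dim_sem)]
    n = len(train)
    for _ in range(steps):
        grad = [[0.0] * dim_vis for _ in range(dim_sem)]
        for x, label in train:
            pred = apply_map(W, x)
            target = class_embeddings[label]
            for i in range(dim_sem):
                err = pred[i] - target[i]
                for j in range(dim_vis):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(dim_sem):
            for j in range(dim_vis):
                W[i][j] -= lr * grad[i][j]
    return W


def predict_unseen(W, x, unseen_embeddings):
    """Return the unseen class whose embedding is nearest to W @ x."""
    s = apply_map(W, x)

    def sq_dist(name):
        return sum((a - b) ** 2 for a, b in zip(s, unseen_embeddings[name]))

    return min(unseen_embeddings, key=sq_dist)


# Training uses seen classes only; the test sample belongs to an unseen class.
W = fit_visual_semantic_map(TRAIN, SEEN)
print(predict_unseen(W, [0.95, 1.05, 1.0], UNSEEN))  # prints "zebra"
```

The mapping never sees a zebra during training. The prediction works only because the unseen class shares attributes with the seen classes, which is the knowledge transfer the abstract refers to. Real systems replace the hand-written features and attributes with learned image features and richer semantic embeddings.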
Keywords/Search Tags: zero-shot learning, spatial geometry, manifold structure, knowledge transfer, relational knowledge, manifold alignment, inductive zero-shot learning method, transductive zero-shot learning method