
One-Shot Generalization In Deep Learning For Simple Visual Concept

Posted on: 2019-09-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Qiu
GTID: 2428330548461161
Subject: Engineering
Abstract/Summary:
This paper aims to simulate human imagination: from one or a few examples, a person can reason about a new concept, recognize it, and imagine similar instances of it. For example, a traveler who sees an unfamiliar river can still identify it as a river and associate it with corresponding images. Likewise, a person who sees a new type of symbol can quickly infer the underlying concept and write similar characters.

Traditional algorithms require manual intervention to encode human visual rules into the machine and to add constraints to the model, which reaches results faster but costs additional manual labor; they also require large amounts of data to learn. With the prevalence of big data, "the more data, the better the result" has become almost a law of deep learning. Although remarkable results have been achieved, training such models takes a lot of time, a large number of data samples, and substantial resources. To reduce the manual annotation workload, the training time, and the amount of training data, more and more researchers are turning to unsupervised learning and small-data algorithms, and are asking how AI can automatically infer new concepts and extract information from a small amount of known data. Most such algorithms transfer knowledge from data of the same type and mine the data for information, so as to increase the available information exponentially. Even with few training samples, they can be trained to perform no worse than models trained on massive data.

For example, the HPBL algorithm, which is based on probabilistic program learning, can infer the concept behind a character from a single glance at one picture, generate many different forms of characters in the same category, and identify the category of the character. Its disadvantage is that it requires extra stroke information, which makes it hard to apply in practice. The CDRAW algorithm, which is based on a deep generative model, needs only ordinary pixel images, but its model complexity is hundreds of times higher than that of HPBL, so training takes too long.

This paper summarizes recent algorithms for small-data learning and image generalization and proposes a combined model, the dual conditional deep generative model, realized as either the 2CVAE algorithm or the CGAN-CVAE algorithm. The spatial transformation model is replaced by a conditional generative model (a CVAE, conditional variational auto-encoder, or a CGAN, conditional generative adversarial network). The image produced by this conditional generative model serves as the conditional input of a CVAE, so that the data are spatially transformed under the constraint of human visual categories and the number of training samples increases. The model produces one-to-many picture-to-picture outputs on both small and large datasets, and shows that this simple model can complete the weak one-shot generalization task. The CVAE component extracts the features of the conditional image and the features of the VAE input image through two channels, which yields one-to-many image generation. In addition, the VAE input within the CVAE is improved by the multi-task learning scheme proposed in this paper, giving shorter training time and better generation quality than the original VAE.

The model is tested on the MNIST dataset, which has a large number of samples, and on the Fashion-MNIST dataset, which has more complex images. It generates good one-to-many picture-to-picture outputs, which shows that the algorithm works well on complex and large datasets. On the Omniglot dataset, which has very few samples per class, it generates character pictures well, completes the weak one-shot generalization task, and simulates the visual function of human association.
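
For orientation, the textbook training objective of a conditional variational auto-encoder can be written as below. This is only a sketch of the standard CVAE lower bound; the abstract does not state the exact loss used in the thesis, and the symbols are assumptions introduced here: x is the target image, c is the conditional input (in the proposed model, presumably the image produced by the first conditional generative model), z is the latent variable, and q_phi and p_theta are the encoder and decoder distributions.

```latex
\log p_\theta(x \mid c) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x, c)}\!\left[ \log p_\theta(x \mid z, c) \right]
\;-\; D_{\mathrm{KL}}\!\left( q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c) \right)
```

Sampling several values of z for a fixed c and decoding each of them is what gives a one-to-many mapping from a single conditional image to multiple generated images.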
Keywords/Search Tags:one-shot generalization, deep generation model, variational auto-encoder, generative adversarial networks