
Small Sample Computer Vision Problems

Posted on: 2019-11-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Xu
GTID: 1368330590470367
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the development of deep learning, computer vision technology has improved significantly and is now widely employed in areas such as entertainment, surveillance, and industrial control. This improvement is mainly due to the power of the supervised deep learning model known as the CNN (Convolutional Neural Network). However, the effectiveness of a CNN can be guaranteed only when a large labelled training data set is available. In practice, most computer vision problems do not come with a large labelled training data set, and collecting one is very expensive. Research on small sample computer vision problems is therefore urgent.

Computer vision problems are often complicated and their raw data are high-dimensional. To represent the samples well, their features are usually high-dimensional too. However, training a model/predictor on a small number of high-dimensional samples tends to overfit. Hence, in this dissertation, we design several robust models/predictors for small sample problems in computer vision. The input of a predictor for a computer vision problem typically consists of the samples, the labels, and the number of tasks (for multi-task problems). We therefore design robust models/predictors from the following views and their combinations: the learning order of the samples, the relations between the tasks, the representation of the samples, and the way to enlarge the data set automatically. The details are as follows.

In chapter 2, from the view of the learning order of the samples, we propose a new learning framework named Self-Paced Learning with Privileged Information (SPL+). Self-paced learning (SPL) is a powerful framework for several tasks, in which data are gradually involved in the learning process, from easy samples to more complex ones. However, SPL is unable to exploit prior knowledge, so it is prone to overfitting. To alleviate this problem, SPL+ introduces privileged information as prior knowledge to guide the curriculum learned by SPL. Specifically, the learning process using weighted privileged information and the curriculum learning process guided by privileged information are performed iteratively until the final mature curriculum guided by privileged information is learned. Because this curriculum learning process gradually grasps easy-to-hard knowledge under the guidance of robust, high-level privileged information, a more reliable model can be learned. Moreover, SPL+ is a generalized framework that is applicable to various problems. Comprehensive experiments demonstrate that SPL+ outperforms the conventional SPL-based method on three applications: action recognition, scene recognition, and handwritten digit recognition.

In chapter 3, from the view of the relations between the tasks, we propose a novel multi-task classification framework called Multi-Task classification with Sequential Instances and Tasks (MTSIT). Unlike previous works, which treat all tasks and instances equally, MTSIT is inspired by the cognitive process of the human brain, which often learns from easier tasks to harder tasks. Specifically, the method jointly learns the task curriculum (the learning order of tasks) and the instance curriculum (the learning order of instances) by introducing a self-paced term for the instances of each task into the existing multi-task learning framework Sequential Multi-Task learning (SeqMT), which transfers information from previously learned tasks to the next ones through shared task parameters. To solve MTSIT effectively, we also propose an optimization algorithm, Tasks-to-Instances and Instances-to-Tasks (TIIT), which alternates between two steps. In the tasks-to-instances step, the learner learns the instance curriculum while the task curriculum is fixed; in the instances-to-tasks step, the task curriculum is learned while the instance curriculum of each task is fixed. The TIIT method is based on an error bound of the proposed MTSIT. Experimental results on three real-world data sets demonstrate the effectiveness of our method.

In chapter 4, from the view of the representation of the samples, we propose a novel image classification algorithm called Multi-modal Self-Paced Learning for image classification (MSPL). The superiority of SPL is significant when dealing with challenging vision tasks such as natural image classification. However, SPL-based image classification cannot deal with information from multiple modalities. As images are usually characterized by visual feature descriptors from multiple modalities, exploiting only one of them may lose complementary information from the others. To overcome this problem, MSPL combines SPL and multi-modal learning in one jointly trained framework. Specifically, the multi-modal learning process with curriculum information and the curriculum learning process with multi-modal information are performed iteratively until the final mature multi-modal curriculum is learned. Because this multi-modal curriculum grasps easy-to-hard knowledge at both the sample level and the modality level, a better model can be learned. Experimental results on four real-world data sets demonstrate the effectiveness of the proposed approach.

In chapter 5, from the view of enlarging the data set automatically, we propose a novel single-image de-raining algorithm called Image De-raining via Cycle-Consistent Adversarial Networks (IDCCGAN). Unlike current deep learning based methods, IDCCGAN does not require large data sets of paired rainy and background images. In addition, a new cycle-consistency content preservation loss is introduced into IDCCGAN, which makes the de-rained images generated by our model more robust inputs for traditional vision tasks. Extensive experimental results on synthetic and real data sets demonstrate the effectiveness of the IDCCGAN framework.
Keywords/Search Tags: small sample, self-paced learning, learning with privileged information, multi-task learning, multi-modal learning, generative adversarial network
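
The abstract describes self-paced learning as gradually involving samples from easy to complex. As a rough illustration only, the sketch below shows the generic SPL alternation with hard 0/1 sample weights on a toy line-fitting problem. It is not the SPL+, MTSIT, or MSPL method of the dissertation; the function names, the squared-error loss, and the parameters `lam`, `growth`, and `rounds` are assumptions chosen for the example.

```python
def fit_weighted_line(xs, ys, vs):
    """Weighted least squares for y = a*x + b with sample weights vs."""
    sw = sum(vs)
    if sw == 0:
        return 0.0, 0.0
    mx = sum(v * x for v, x in zip(vs, xs)) / sw
    my = sum(v * y for v, y in zip(vs, ys)) / sw
    sxx = sum(v * (x - mx) ** 2 for v, x in zip(vs, xs))
    sxy = sum(v * (x - mx) * (y - my) for v, x, y in zip(vs, xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, my - a * mx


def self_paced_fit(xs, ys, lam=1.0, growth=1.5, rounds=10):
    """Generic SPL loop (hypothetical helper, hard weighting).

    Alternates between (1) marking samples whose loss is below the
    age parameter lam as "easy" (weight 1, otherwise 0) and
    (2) refitting the model on the easy samples only. lam grows each
    round so that harder samples are admitted later.
    """
    vs = [1.0] * len(xs)                 # initial fit uses every sample
    a, b = fit_weighted_line(xs, ys, vs)
    for _ in range(rounds):
        losses = [(y - (a * x + b)) ** 2 for x, y in zip(xs, ys)]
        vs = [1.0 if loss < lam else 0.0 for loss in losses]
        if sum(vs) >= 2:                 # a line needs at least two points
            a, b = fit_weighted_line(xs, ys, vs)
        lam *= growth                    # admit harder samples next round
    return a, b, vs


if __name__ == "__main__":
    # Toy data: y = 2x + 1 with one corrupted label at x = 9.
    xs = list(range(10))
    ys = [2 * x + 1 for x in xs]
    ys[9] = 60
    a, b, vs = self_paced_fit(xs, ys)
    print("slope:", a, "intercept:", b, "weights:", vs)
```

On this toy data the corrupted sample should end with weight 0, because its loss stays above `lam` for all ten rounds while the clean samples are admitted early. With more rounds or a faster `growth`, `lam` would eventually admit it too, which is why practical SPL variants also need a stopping rule or a prior such as the privileged information discussed in chapter 2.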