
Imitation Learning Based On Generative Adversarial Nets With Multiple Kinds Of Demonstrations

Posted on: 2020-08-03  Degree: Master  Type: Thesis
Country: China  Candidate: J H Lin  Full Text: PDF
GTID: 2428330578979393  Subject: Management Science and Engineering
Abstract/Summary:
In recent years, the field of artificial intelligence has paid increasing attention to learning decision models that match or even surpass human performance. Imitation learning is a feasible approach to such decision-making problems: it learns from expert decision data to obtain decision models that behave close to the expert. Generative adversarial imitation learning (GAIL) is an emerging imitation learning method. It achieves better robustness, representation capability, and computational efficiency, can handle complicated, large-scale problems, and is applicable to realistic tasks. However, GAIL makes strong assumptions about expert demonstrations: it assumes that the expert samples are of a single kind and perfect. Because individual experts have different preferences and may make errors, this assumption is difficult to satisfy in practical applications. To extend GAIL to more practical applications, this thesis relaxes the assumptions on demonstrations and proposes two imitation learning methods based on GAIL with multiple kinds of demonstrations. The main research consists of the following two parts:

i. Generative adversarial imitation learning with an auxiliary classifier. To handle the situation where there are multiple kinds of demonstrations, this research adds an auxiliary classifier to the original GAIL method and proposes the corresponding algorithm. Experimental results in a simulation environment show that the algorithm is able to learn the category of the demonstrations by leveraging the auxiliary classifier, and thus achieves imitation learning with multiple kinds of demonstrations. Moreover, compared with an existing unsupervised method, it achieves better accuracy and effectiveness.

ii. Generative adversarial imitation learning with failure demonstrations. The existence of failure demonstrations is a special case of multiple kinds of expert demonstrations. To address it, this research proposes constructing a memory pool to store and roll back failed samples, and reuses the failed samples by means of resampling. On this basis, a training algorithm for GAIL with failure demonstrations is proposed. By reusing failure demonstrations, the method not only achieves better action success rates than the experts but also improves sample efficiency. Experiments show that the method can handle the special multi-class imitation learning problem in which expert samples contain both successful and failed samples.
Keywords/Search Tags:Imitation learning, Generative adversarial nets, Multiple kinds of demonstrations, Failure demonstrations