End-To-End Active Tracking System Via Deep Reinforcement Learning

Posted on: 2022-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: J X Chen
Full Text: PDF
GTID: 2518306572459744
Subject: Computer technology
Abstract/Summary:
An active tracking system actively and purposefully adjusts camera parameters to follow a target, and is widely used in large-scale intelligent surveillance and moving-target tracking. How to obtain target parameters and achieve optimal control in a specific application scenario remains a major difficulty. Using scientific and technological means to assist sports training is becoming a hot spot and trend in China's sports research, especially in winter sports, where technology is urgently needed to improve training quality and achieve leapfrog development. Short track speed skating is a typical event that tightly combines skating technique with competitive tactics, and video recording of the entire training or competition process is key basic information for evaluating training quality and optimizing competition strategy.

Starting from the visual tracking requirements of short track speed skating, this thesis designs an end-to-end active tracker that simplifies the design of the control system; selects a reinforcement learning training algorithm appropriate to the task scenario; addresses the high cost of interactive reinforcement learning training by studying inverse reinforcement learning methods based on generative adversarial models; and builds a skating-rink simulation environment that meets the needs of algorithm training, in which the above methods are implemented and verified. The specific work is as follows.

First, based on the tracking environment and requirements of short track speed skating, and drawing on current end-to-end control methods, an end-to-end active tracker is designed that takes camera images as input and outputs camera control signals. The asynchronous advantage actor-critic (A3C) algorithm is selected to train the active tracker, and experiments in the simulation environment verify the rationality and effectiveness of the tracker design: the success rate on the single-target tracking task reaches 89.1%. Changing the background of the simulation environment further verifies that the active tracker can still complete tracking under a changing background, demonstrating strong robustness.

Second, to address the high cost of interactive training in reinforcement learning, an inverse reinforcement learning method is introduced to reduce training time. The theoretical defects of current generative adversarial imitation learning are discussed and analyzed; the method is improved using the Wasserstein distance, its mathematical principles are expounded, and the corresponding network model is designed. Experiments with the improved algorithm in the simulation environment verify the effectiveness and stability of its training: generative adversarial imitation learning based on the Wasserstein distance reduces the amount of interactive reinforcement learning training by about 50%, and the training process is more stable.

Third, the thesis proposes a generative adversarial behavioral cloning method based on the Wasserstein distance, which can learn an expert policy without interacting with the environment. Experiments in the simulation environment verify its learning ability; the behaviorally cloned tracker is then retrained with a reinforcement learning algorithm to verify its ability to accelerate reinforcement learning training. The retrained tracker learns a convergent tracking policy with very little interactive training.
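The Wasserstein-distance idea behind the improved adversarial imitation methods can be illustrated with a minimal sketch. This is a toy, not the thesis's actual model: a linear critic is trained to maximize the expected score gap between expert and policy state-action features, with WGAN-style weight clipping standing in for the Lipschitz constraint; the feature dimensions, data distributions, and learning rate below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def critic(w, b, x):
    """Linear critic D(x) = w.x + b scoring state-action feature vectors."""
    return x @ w + b

# Toy data: expert features centered at +1, current-policy features at -1.
expert_x = rng.normal(loc=1.0, scale=0.1, size=(256, 4))
policy_x = rng.normal(loc=-1.0, scale=0.1, size=(256, 4))

w = np.zeros(4)
b = 0.0
lr, clip = 0.1, 1.0
for _ in range(200):
    # Minimize E_policy[D] - E_expert[D]; for a linear critic the
    # gradient w.r.t. w is mean(policy_x) - mean(expert_x), and zero w.r.t. b.
    grad_w = policy_x.mean(axis=0) - expert_x.mean(axis=0)
    w -= lr * grad_w
    # Weight clipping enforces an (approximate) 1-Lipschitz constraint.
    w = np.clip(w, -clip, clip)

# The trained critic separates expert from policy samples; its score D(s, a)
# would then serve as the reward signal for the reinforcement learning step.
gap = critic(w, b, expert_x).mean() - critic(w, b, policy_x).mean()
```

In a full pipeline, the critic would be a neural network and the policy would be updated against the critic's score, alternating the two as in adversarial training.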
Keywords/Search Tags:End to End, Deep Reinforcement Learning, Imitation Learning, Generative Adversarial Networks