
Research On Underactuated System Control Based On Improved Reinforcement Learning

Posted on: 2022-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Wang
Full Text: PDF
GTID: 2518306737456414
Subject: Control Science and Engineering
Abstract/Summary:
An underactuated system is a system whose number of independent control inputs is smaller than its number of degrees of freedom. It is essentially a nonlinear system, characterized by an input space whose dimension is lower than that of the configuration space. Compared with a fully actuated system, it offers a simpler structure, lower energy consumption, lower cost and greater flexibility. This simple structure makes overall system analysis and simulation experiments convenient, and makes such systems well suited to studying and verifying the effectiveness of control algorithms. In recent decades, underactuated systems have become a hot topic in control science.

Reinforcement learning, an important branch of machine learning, focuses on sequential optimization and decision-making problems and has been widely used in robot control, artificial intelligence and multi-agent systems. Deep reinforcement learning (DRL) combines the perception ability of deep learning with the decision-making ability of reinforcement learning, so that control can be performed directly from input images. It is an AI method that resembles the way humans think. However, an agent needs a large amount of data during training to form an optimal policy, and different data play different roles in the training process. How to make full use of the training data to maximize the agent's training efficiency is therefore a key issue. This thesis applies deep reinforcement learning algorithms to the control of underactuated systems. The main work is as follows.

First, to improve the efficiency of the agent, a control method based on an improved deep reinforcement learning policy gradient algorithm is proposed. The reinforcement learning system consists of a policy neural network and a baseline function network. The policy network uses the recently proposed Swish function as its activation function, and a baseline function network with the same activation function is added. This design mitigates two common problems: nonlinear neural networks often have difficulty converging, and they easily overfit to the environment while learning its dynamics.

Second, because the state space is very large during agent training, effective reward information is scarce, so learning is slow or may fail altogether. To address this sparse reward problem, a new reinforcement learning algorithm based on cluster analysis and an improved experience pool (experience replay buffer) is proposed. A classic cluster analysis method is applied to the continuous state space so that the clusters partition it, and an exploration-degree function is constructed as an additional reward bonus. This bonus is added to the original base reward, and the resulting compound reward is used to train the agent. Introducing the improved experience pool into the traditional reinforcement learning algorithm allows the agent to learn directly from sparse reward samples, which improves the policy's ability to explore the environment and avoids divergence when training the neural network. Overall, the performance of the reinforcement learning algorithm is improved.

Finally, the new algorithm is applied to an inverted pendulum control system and compared with classic control algorithms. The simulation results show that the algorithm proposed in this thesis is more effective than the classic control algorithms it was compared with.
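To make two of the ideas in the abstract concrete, the following is a minimal sketch in Python, not the thesis's implementation. It shows the Swish activation, x * sigmoid(beta * x), and one possible form of a cluster-based exploration bonus added to the base reward. The abstract does not give the form of the exploration-degree function, so the count-based bonus scale / sqrt(1 + visits), the class name ClusterExplorationBonus, its parameters, and the fixed centroids (which would normally come from a clustering step such as k-means) are all assumptions made for illustration.

```python
import math


def swish(x: float, beta: float = 1.0) -> float:
    """Swish activation: x * sigmoid(beta * x), written to avoid overflow."""
    z = beta * x
    if z >= 0:
        return x / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return x * e / (1.0 + e)


class ClusterExplorationBonus:
    """Hypothetical exploration bonus over clusters of a continuous state space.

    Assumed form (not from the thesis): bonus = scale / sqrt(1 + visits),
    where visits counts earlier states assigned to the same cluster.
    """

    def __init__(self, centroids, scale=0.1):
        # Centroids are assumed to be precomputed by a clustering method.
        self.centroids = [tuple(c) for c in centroids]
        self.scale = scale
        self.visits = [0] * len(self.centroids)

    def nearest(self, state):
        """Return the index of the centroid closest to the state."""
        dists = [math.dist(state, c) for c in self.centroids]
        return dists.index(min(dists))

    def compound_reward(self, state, base_reward):
        """Add the exploration bonus to the base reward and record the visit."""
        k = self.nearest(state)
        bonus = self.scale / math.sqrt(1 + self.visits[k])
        self.visits[k] += 1
        return base_reward + bonus
```

In this sketch, repeated calls to compound_reward for states in the same cluster return a shrinking bonus, so rarely visited regions of the state space are rewarded more than frequently visited ones. Any decreasing function of the visit count would serve the same purpose.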
Keywords/Search Tags:Underactuated nonlinear system, reinforcement learning, inverted pendulum, activation function, cluster analysis