| Continuous processes and batch processes are the two major production modes in chemical manufacturing. Continuous processes suit the manufacture of large quantities of chemical products with low precision requirements, as in petrochemical processing; batch processes aim to produce small quantities of high-value fine chemical products with high precision requirements, as in pharmaceutical production, plastic injection molding, and semiconductor manufacturing. A batch process usually performs different operations in different stages to achieve various processing purposes, which gives it complex characteristics, including complex system dynamics and numerous operating variables. Because of these characteristics, the control system is required to have fast response, excellent robustness, high control accuracy, and good stability. Iterative learning control and reinforcement learning are the two most promising classes of algorithms for meeting these requirements. Iterative learning control (ILC) is designed around the repeatability of the batch process, and it converges quickly along the batch direction. Moreover, ILC can control a system without knowledge of the system dynamics. ILC therefore plays a vital role in controlling batch processes. ILC requires, however, that the initial control state, the process dynamics, and the process goals remain consistent from batch to batch. Industrial processes cannot guarantee this consistency, so ILC cannot provide good robustness. Improving the robustness of ILC is therefore an urgent research problem for batch processes. Reinforcement learning (RL) is also a data-driven algorithm based on the idea of learning. Compared with ILC, RL has better robustness. RL is typically applied to multi-task processes with complex dynamic characteristics, a class that also includes batch 
processes. It is therefore natural to apply reinforcement learning to batch processes. However, RL needs a long training period, and industrial processes cannot provide an ideal training environment. In addition, RL learns by trial and error, so its exploration space contains bad control strategies. How to improve the learning efficiency of RL and ensure the safety of its exploration space is therefore an urgent research problem for batch processes. To improve the robustness of ILC and the learning efficiency of RL, this thesis proposes a scheme named ‘Iterative Learning Control Guided Reinforcement Learning Control (ILC-RLC)’. The scheme runs ILC and RL in a batch process simultaneously: it uses the control information from the previous batch to calculate the current ILC control input, and then uses that input to guide RL so that it learns efficiently. Importantly, RL in turn improves the robustness of the controller. However, this scheme cannot make full use of the control information provided by ILC. For this reason, we further propose an algorithm named ‘Learning from ILC Demonstration (ILC-LFD)’. This scheme uses the control data provided by ILC to give the optimization direction of the strategy network and the value network, so that RL can obtain a stable control strategy quickly. In the ILC-LFD scheme, the neural networks of RL are updated mainly on data provided by ILC in the early training stage, so that RL can quickly learn an effective control strategy. As RL accumulates interaction data, the data used to update the neural networks come to consist mainly of data explored by RL, which improves robustness. In this thesis, we first run simulations on a linear system to demonstrate the control ability of ILC-RLC and ILC-LFD. We then run simulations on two batch processes to show that our methods can control industrial processes, and the simulations illustrate 
that our methods provide effective paradigms for the application of reinforcement learning. |
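
The idea that ILC computes the current control input from the previous batch can be illustrated with a minimal sketch. It assumes a textbook P-type update law, u_{k+1}(t) = u_k(t) + L e_k(t+1), applied to a hypothetical scalar linear system x(t+1) = a x(t) + b u(t). The plant parameters, learning gain, reference trajectory, and function names are illustrative assumptions, not the system or the update law used in the thesis, and the RL guidance step of ILC-RLC is not shown.

```python
import math


def run_batch(u, a=0.3, b=1.0, x0=0.0):
    """Simulate one batch of the assumed plant x[t+1] = a*x[t] + b*u[t].

    Every batch restarts from the same initial state x0, which is the
    consistency condition ILC relies on. Returns the outputs y[1..T].
    """
    x = x0
    y = []
    for u_t in u:
        x = a * x + b * u_t
        y.append(x)
    return y


def p_type_ilc(reference, n_batches=20, gain=0.5):
    """Assumed P-type ILC: u_{k+1}(t) = u_k(t) + gain * e_k(t+1)."""
    u = [0.0] * len(reference)  # the first batch starts with zero input
    error_norms = []
    for _ in range(n_batches):
        y = run_batch(u)
        # e[t] is the tracking error one step after u[t] is applied
        e = [r_t - y_t for r_t, y_t in zip(reference, y)]
        error_norms.append(math.sqrt(sum(v * v for v in e)))
        # correct the next batch's input with the previous batch's error
        u = [u_t + gain * e_t for u_t, e_t in zip(u, e)]
    return u, error_norms


# Illustrative reference trajectory: one sine period over a 50-step batch.
reference = [math.sin(2 * math.pi * t / 50) for t in range(1, 51)]
u_final, error_norms = p_type_ilc(reference)
print(error_norms[0], error_norms[-1])
```

With these assumed values the tracking error shrinks from batch to batch without the update law using a or b directly, although the gain still has to be chosen so that the iteration converges. If the initial state or the reference changed between batches, the input learned so far would no longer fit, which is the robustness limitation described above.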