
Transfer Reinforcement Learning Control Based On Causal Modeling

Posted on: 2024-04-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y W Sun
GTID: 1528307364467844
Subject: Control theory and control engineering
Abstract/Summary:
The ultimate goal of artificial intelligence (AI) is to build intelligent machines that think and make decisions like humans. As an essential data-driven control method, reinforcement learning (RL) is considered a technical cornerstone of general AI. Deep reinforcement learning (DRL) methods, which combine the representation ability of deep learning with the decision-making ability of RL, are widely used in complex control tasks, yet they face several limitations. First, interpretability is poor. DRL tends to focus only on correlations and is trained end to end, whereas the modules in a control system are interrelated; DRL therefore lacks an understanding of causal effects and cannot provide an interpretable decision-making pipeline. Second, data efficiency is poor. Satisfactory control performance requires a large number of interactions with the environment, but in real-world applications the agent may face high sampling costs, and the number of available samples is often insufficient for training. Moreover, the lack of data analysis prevents the algorithm from fully exploiting the samples it has. Third, generalization is poor. Most RL algorithms rely on the assumption of independently and identically distributed (i.i.d.) data. Under distribution shift or task migration, this assumption is violated, the generalization of the learned policy cannot be guaranteed, and the resulting performance degradation can lead to unexpected and even catastrophic outcomes. In addition, neural network-based algorithms may suffer from spurious correlations, which makes it difficult to provide a principled basis for generalization. These drawbacks limit the application and development of DRL.

In recent years, causality techniques have shown their potential and advantages in the RL community. Beyond the inherent capability of inferring causal structure from data, causality provides an explainable toolset for investigating how a system would react to an intervention. Quantifying the effects of interventions enables explainable decisions and policy evaluation in complex systems. Although causality-based RL theoretically offers these advantages, many practical issues arise when extending the approach to real-world applications: how to build causal models for different types of data, how to use causal inference to make full use of data when samples are limited, and how to provide a basis for transfer in heterogeneous scenarios. In this thesis, building on causal techniques, we comprehensively describe how to embed causal mechanisms in the control system and realize cross-domain generalization. The proposed transfer RL control algorithms based on causal modeling address the interpretability, data efficiency, and generalization issues. The contributions of this thesis are summarized as follows:

1. To address the interpretability issue, a causal graph-based transfer RL algorithm is proposed that identifies distribution shifts in a causal manner in nonstationary or heterogeneous systems. Our goal is to build an RL framework that integrates causal modeling and policy updating, discovers the causal relations among variables, and improves control performance. With the help of causal auxiliary variables and a structural causal model, the proposed algorithm provides interpretable model transfer across domains: the relations among variables are inferred from a causal viewpoint and shown explicitly. An auxiliary variable-based neural network is then designed to guide how the transfer is realized. Finally, policy training and few-shot transfer are carried out in a model-based RL framework.
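To make the auxiliary-variable idea in contribution 1 more concrete, the following is a minimal, hypothetical sketch rather than the thesis implementation: a factorized dynamics model in which each next-state dimension depends only on its causal parents (selected by a fixed graph mask) plus a learned per-domain auxiliary vector. Under this assumption, transfer to a new domain only requires re-fitting the small auxiliary embedding from a few target-domain samples, which is one way the few-shot model transfer described above could be realized. All class names, dimensions, and hyperparameters below are illustrative.

```python
# Illustrative sketch only: a causal-graph-masked dynamics model with a
# per-domain auxiliary (change) variable for few-shot model transfer.
import torch
import torch.nn as nn

class AuxCausalDynamics(nn.Module):
    def __init__(self, state_dim, action_dim, n_domains, aux_dim=4, hidden=64):
        super().__init__()
        # mask[i, j] = 1 if input j is a causal parent of next-state dim i
        # (assumed given here; in practice it would come from causal discovery).
        self.register_buffer("mask", torch.ones(state_dim, state_dim + action_dim))
        # One auxiliary vector per domain captures domain-specific change factors.
        self.theta = nn.Embedding(n_domains, aux_dim)
        self.heads = nn.ModuleList(
            nn.Sequential(
                nn.Linear(state_dim + action_dim + aux_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 1),
            )
            for _ in range(state_dim)
        )

    def forward(self, s, a, domain_id):
        x = torch.cat([s, a], dim=-1)                # (batch, state + action)
        aux = self.theta(domain_id)                  # (batch, aux_dim)
        next_state = [
            head(torch.cat([x * self.mask[i], aux], dim=-1))
            for i, head in enumerate(self.heads)
        ]
        return torch.cat(next_state, dim=-1)         # predicted next state

# Few-shot transfer: freeze everything except the auxiliary embedding and
# re-fit it on a handful of target-domain transitions.
model = AuxCausalDynamics(state_dim=3, action_dim=1, n_domains=2)
for p in model.parameters():
    p.requires_grad_(False)
model.theta.weight.requires_grad_(True)
optimizer = torch.optim.Adam([model.theta.weight], lr=1e-2)
```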
2. To improve data efficiency, a counterfactual-based transfer RL algorithm is proposed for both stationary and heterogeneous systems. We aim to realize data augmentation based on causal inference and to extend the causal modeling to a more general setting. First, neural network-based Granger causality is used to identify the causal relations among variables. Then, a framework based on an improved generative adversarial network is proposed to estimate a transferable environment model. Finally, counterfactual inference is used to augment the dataset (a minimal sketch of this step is given after the contribution summaries), paving the way for policy warm-starting in both stationary and heterogeneous environments.

3. For nonlinear mixture systems, an algorithm is proposed to recover temporally causal latent processes from general temporal data, providing a theoretical guarantee for the subsequent work. We aim to recover time-delayed latent causal variables and identify their relations from measured temporal data. Both a nonparametric, nonstationary setting and a parametric setting are considered for the latent processes, and two provable conditions are established under which temporally causal latent processes can be identified from their nonlinear mixtures. A theoretically grounded framework is proposed that extends variational autoencoders by enforcing these conditions through proper constraints in the causal process prior.

4. To address the generalization issue, a causal disentangled transfer RL algorithm is proposed for input-redundant systems. Our goal is to disentangle the high-dimensional redundant observation into an unchanged effective signal and a shifting background signal, and to design a transferable controller based on low-dimensional causal representations. First, the proposed algorithm realizes disentanglement with an improved variational autoencoder framework. Then, the causal representation is discovered from the effective signal via a dynamic mapping. Finally, a transferable controller is trained in the source domain using this representation. Since the underlying causal structure does not change across domains, the policy network can be transferred directly to the target domain, realizing zero-shot transfer.
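As a minimal illustration of the counterfactual augmentation step in contribution 2 (the sketch referenced there), consider an additive-noise transition model s' = f(s, a) + u. Counterfactual inference then follows the standard abduction-action-prediction recipe: recover the noise realization from the observed transition, replay an alternative action under the same noise, and add the resulting transition to the dataset. The toy model f and all quantities below are hypothetical placeholders, not the thesis's GAN-estimated environment model.

```python
# Illustrative only: counterfactual data augmentation for an additive-noise
# transition model s' = f(s, a) + u, via abduction-action-prediction.
import numpy as np

def counterfactual_transition(f, s, a_obs, s_next_obs, a_alt):
    """Generate one counterfactual transition from one observed transition."""
    u = s_next_obs - f(s, a_obs)      # abduction: recover the exogenous noise
    s_next_cf = f(s, a_alt) + u       # action + prediction under the same noise
    return (s, a_alt, s_next_cf)

# Toy usage with a hypothetical linear environment model f(s, a) = A s + B a.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.2], [0.5]])
f = lambda s, a: A @ s + B @ a

s = np.array([1.0, -0.5])
a_obs, a_alt = np.array([0.3]), np.array([-0.3])
s_next_obs = f(s, a_obs) + np.array([0.02, -0.01])   # the transition actually observed
print(counterfactual_transition(f, s, a_obs, s_next_obs, a_alt))
```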
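For contribution 3, one common generative formulation that is consistent with the description above (the exact parameterization used in the thesis may differ) treats the observed temporal data as a nonlinear mixture of time-delayed latent causal processes:

```latex
% Assumed generative form (illustrative notation only).
\begin{aligned}
  \mathbf{x}_t &= g(\mathbf{z}_t)
    && \text{(nonlinear mixing of the latent state)} \\
  z_{i,t} &= f_i\bigl(\{\, z_{j,t-\tau} : z_{j,t-\tau} \in \mathrm{Pa}(z_{i,t}) \,\},\ \epsilon_{i,t}\bigr)
    && \text{(time-delayed latent transitions, independent noise } \epsilon_{i,t}\text{)}
\end{aligned}
```

Identifiability then asks under which conditions the latent variables and their time-delayed causal relations can be recovered from the observations alone, typically up to permutation and componentwise invertible transformations; the VAE-based framework of contribution 3 enforces such conditions through constraints on the prior over the latent causal process.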
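Finally, the control-side consequence of the disentanglement in contribution 4 can be sketched as follows, again purely as an illustration with hypothetical module names: the encoder splits a redundant observation into an effective code and a background code, the policy consumes only the effective code, and both networks are reused unchanged in the target domain for zero-shot transfer.

```python
# Illustrative only: a controller that acts on the disentangled "effective"
# code, so encoder + policy can be reused unchanged across domains.
import torch
import torch.nn as nn

class DisentangledEncoder(nn.Module):
    def __init__(self, obs_dim=64, effect_dim=8, background_dim=8, hidden=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.to_effect = nn.Linear(hidden, effect_dim)          # task-relevant signal
        self.to_background = nn.Linear(hidden, background_dim)  # domain-specific nuisance

    def forward(self, obs):
        h = self.backbone(obs)
        return self.to_effect(h), self.to_background(h)

encoder = DisentangledEncoder()
policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))

obs = torch.randn(32, 64)            # batch of redundant observations (source or target)
z_effect, z_background = encoder(obs)
action_logits = policy(z_effect)     # the controller never sees the shifting background
```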
Keywords/Search Tags: reinforcement learning, transfer learning, causal graph, counterfactual inference, representation learning