Intelligent Decision Making In Intensive Care Units Based On Reinforcement Learning

Posted on: 2020-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Zhao
GTID: 2404330596982449
Subject: Computer technology
Abstract/Summary:
As populations age, the shortage of medical resources is becoming increasingly serious. At the same time, artificial intelligence (AI) techniques have been widely applied to real-world problems, including medical treatment. Reinforcement learning (RL), currently one of the core AI technologies, has made considerable progress in intelligent decision-making in intensive care units (ICUs), but it still has limitations. First, using RL effectively requires a reasonable reward function to be constructed in advance; such a function relies heavily on the subjective experience of individual clinicians and therefore generalizes poorly. Second, because traditional RL algorithms aim to maximize long-term reward, exploratory actions taken during learning may have fatal consequences for the patient.

This thesis studies RL-based optimal decision-making for mechanical ventilation and sedative dosing in ICUs. Patients treated in ICUs often receive invasive ventilation together with sedatives. Premature extubation can lead to reintubation, which hinders recovery, while unnecessarily prolonged ventilation increases the risk of complications and raises hospital costs. Appropriate ventilation and sedative dosing are therefore critical to reducing mortality among ICU patients. Targeting the main problems that RL currently faces in medical applications, this thesis makes the following contributions.

(1) Using a Bayesian inverse RL method, the reward function for mechanical ventilation and sedative dosing in ICUs is inferred from cases treated under the guidance of expert physicians, so that the learned policy is closer to the expert physicians' policy. The experimental results show that the Bayesian inverse RL algorithm outperforms a simple learning algorithm with a hand-designed reward function in both matching the physicians' policy and convergence stability.

(2) A supervised RL algorithm is proposed that combines the long-term, goal-oriented nature of RL with the difference-minimizing nature of supervised learning, and it is applied to the problem of mechanical ventilation and sedative dosing in ICUs. With curing the patient as the long-term goal, the difference between the learned policy and the physicians' policy is reduced and the treatment effect is improved. The experimental results show that the supervised actor-critic (AC) algorithm matches the physicians' policy only slightly better than the vanilla AC algorithm, but its convergence speed and data efficiency are much higher.
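
The abstract does not spell out how the supervised and RL objectives are combined. One common way to do it, shown here only as a minimal sketch and not as the thesis's actual algorithm, is to add an imitation (cross-entropy) gradient toward the physician's recorded action to the usual actor-critic policy gradient. The tabular softmax policy, the function name `supervised_ac_step`, and the mixing weight `sl_weight` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_ac_step(theta, v, s, a, r, s_next, done, a_expert,
                       gamma=0.99, lr_actor=0.1, lr_critic=0.1,
                       sl_weight=0.5):
    """One update of a tabular actor-critic with an added imitation term.

    theta: list of per-state logit lists (the actor), updated in place.
    v: list of state values (the critic), updated in place.
    a: action actually taken; a_expert: action the physician took.
    sl_weight (hypothetical): 0.0 gives vanilla actor-critic,
    1.0 gives pure supervised imitation.
    Returns the one-step TD error.
    """
    # Critic: one-step temporal-difference error and value update.
    target = r + (0.0 if done else gamma * v[s_next])
    td_error = target - v[s]
    v[s] += lr_critic * td_error

    # Actor: mix the policy-gradient term with the supervised term.
    pi = softmax(theta[s])
    for k in range(len(pi)):
        # Gradient of log pi(a|s) w.r.t. logit k, scaled by the TD error.
        grad_rl = ((1.0 if k == a else 0.0) - pi[k]) * td_error
        # Gradient of log pi(a_expert|s), i.e. negative cross-entropy.
        grad_sl = (1.0 if k == a_expert else 0.0) - pi[k]
        theta[s][k] += lr_actor * ((1.0 - sl_weight) * grad_rl
                                   + sl_weight * grad_sl)
    return td_error
```

In this sketch, the supervised term pulls the policy toward the physician's action even when the TD error carries little signal, which is one plausible reason such a combination could converge faster than a vanilla actor-critic.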
Keywords/Search Tags:Reinforcement Learning, Inverse Reinforcement Learning, Supervised Reinforcement Learning, Intensive Care Units, Mechanical Ventilation, Sedative Dosing