
Study On The Model Of Action Selection Based On Reinforcement Learning

Posted on: 2022-05-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X H Zhang
GTID: 1480306569470754
Subject: Applied Mathematics
Abstract/Summary:
Action selection is an essential decision-making activity of the brain, and the basal ganglia (BG) are critical for coordinating several motor, cognitive, and emotional functions. Constructing different types of BG computational models and analyzing them at different levels is an important approach to understanding the function of basal ganglia circuits. Computational models of biophysical circuits capture the essential characteristics of biological systems, allowing us to explore their physiological processes and dynamical behaviors. Abstract models constrained by functional principles help us understand the kinds of computations that might give rise to cognitive processes such as learning, action selection, and even cognitive control. In this dissertation, different types of computational models of basal ganglia circuits are constructed to explore the dynamic properties of these circuits and the related cognitive decision-making functions. The specific work is as follows.

In Chapter 1, we introduce the preliminary knowledge about biological neural circuit modeling and three classical computational models of the basal ganglia circuit. We first present the anatomical structure of neurons and their mathematical models, ion channels and their mathematical models, and synaptic connections and synaptic plasticity. The neuronal models include the integrate-and-fire model, the Hodgkin-Huxley model, the Rall cable model, and the multi-compartment model. We then describe the main structure of the basal ganglia and introduce the three classical computational models of the basal ganglia circuit: the action selection model, the reinforcement learning (RL) model, and the Actor-Critic model.

In Chapter 2, we give an overview of reinforcement learning and its basic theory. We first present two classical dynamic programming methods for finding the optimal policy, value iteration and policy iteration, together with their convergence proofs. We then introduce one measure of the efficiency of an RL algorithm, namely sample complexity, and give the minimax bound on the sample complexity of finding the optimal policy. Finally, we introduce the policy gradient method and the policy gradient theorem.

In Chapter 3, we study how changes in the topological structure of medium spiny neurons affect the signal regulation function of basal ganglia circuits. First, we establish a multi-compartment model of medium spiny neurons based on biological anatomical experiments and analyze how the location of removed dendritic spines affects the discharge activity of the soma. Then, based on the anatomical structure of the basal ganglia circuit, we construct the cortex-basal ganglia-thalamus-cortex loop using a conductance-based neuron model. Using the computational model of this loop, we investigate the effects of dendritic spine loss and dendritic tree degeneration on the signal regulation function of the basal ganglia. The results show that an appropriate proportion of dendritic spine loss or dendritic tree degeneration can restore the normal regulatory function of the basal ganglia. Finally, we investigate the regulatory effect of cortical discharge activity on the circuit.

In Chapter 4, we construct a reinforcement learning model of decision-making and use it to show that episodic memory can accelerate learning. We construct an Actor-Critic framework based on RL theories of the prefrontal cortex-basal ganglia system and on RL algorithms for artificial systems. The Actor-Critic framework is modeled with recurrent neural networks and trained on two classical decision tasks: the random-dot motion direction discrimination task and the value-based economic choice task. The trained model reproduces some features of neural activity recorded from the animal brain and some behavioral properties observed in animal experiments. Furthermore, we conduct behavioral experiments on the framework to explore which episodic memories in the hippocampus should be selected to ultimately govern future decisions. We find that retrieving salient events sampled from episodic memory shortens deliberation time in the decision-making process more effectively than retrieving common events. The results indicate that salient events stored in the hippocampus can be prioritized to propagate reward information, allowing decision-makers to learn a strategy faster.
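
As an illustration of the value iteration method named in the Chapter 2 summary, the following is a minimal sketch in Python. It is not taken from the dissertation: the function name `value_iteration`, the two-state toy problem, the discount factor, and the stopping tolerance are all assumptions chosen for illustration. It repeatedly applies the Bellman optimality update until the value estimates stop changing, then reads off a greedy policy.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Value iteration for a small finite MDP (illustrative sketch).

    P[a][s][t] is the probability of moving from state s to state t
    under action a; R[s][a] is the expected immediate reward.
    """
    n_states = len(R)
    n_actions = len(R[0])

    def q_values(V):
        # Q[s][a] = R[s][a] + gamma * sum_t P[a][s][t] * V[t]
        return [
            [
                R[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]

    V = [0.0] * n_states
    for _ in range(max_iter):
        Q = q_values(V)
        V_new = [max(row) for row in Q]  # Bellman optimality update
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimates.
    Q = q_values(V)
    policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
    return V, policy


# Hypothetical toy problem: action 0 stays put, action 1 switches state.
# Only staying in state 1 is rewarded.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [0.0, 0.0],  # state 0: no reward for either action
    [1.0, 0.0],  # state 1: reward 1 for staying
]

V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy problem the greedy policy switches out of state 0 and stays in state 1. Policy iteration, the other method mentioned, would alternate full policy evaluation with greedy improvement instead of applying the max at every sweep.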
Keywords/Search Tags: Medium spiny neurons, Basal ganglia, Prefrontal cortex, Neural circuit, Recurrent neural network, Reinforcement learning, Action selection, Decision-making, Episodic memory