| Inspired by a recently proven, effective method for generating interpretable control agents based on the nervous system architecture of the nematode C. elegans, we present the flight control of a quadcopter. Although contemporary deep learning algorithms have achieved noteworthy success in various high-dimensional tasks, the interpretability of their learned causal structure, their robustness, and their use of non-greedy training data are largely overlooked. This dissertation presents methods to address interpretability, non-greedy training data, and the other overlooked properties of a class of intelligent algorithms that combine a random search algorithm with the worm neural network, applied to quadcopter control as a continuous-time environment. The main research content is divided into the following parts. First, we introduce the worm Caenorhabditis elegans (C. elegans) and present the comprehensive map of neural connections in its brain as a wiring diagram. We then show that this animal opens several possibilities for research. According to the literature, two uses of the C. elegans neural network are possible: using its entire connectome of 302 neurons, or using only the circuit responsible for the worm's forward and backward movement, called the tap-withdrawal circuit. On this basis, we pursue the use of the worm neural network for the control of systems. Second, we present the mathematical model behind the worm’s neural network. The worm’s neural network is the basis of a new class of recurrent neural network (RNN), formulated from computational models developed to describe the nervous systems of small species. This mathematical model is called Liquid Time Constant (LTC) because of the nonlinear components in the description of its internal dynamics. LTCs form a dynamic causal model capable of learning the causal relationships between the input, the neural state, and the output dynamics. Third, we give an introduction to 
quadcopter control and present its state model. We then compare the reward performance reported in previous work on quadcopter flight control; this serves as the reference for comparison with our results. Fourth, we explain the mapping of the quadcopter to the tap-withdrawal (TW) circuit. The quadcopter has 24 inputs and 4 outputs, and controlling its flight means controlling the 4 rotors. We therefore add a linear layer that maps the 24 input variables to two continuous variables, which are then fed into the four sensory neurons of the TW circuit. We also extend the number of motor neurons to 4 and map the potentials of the 4 motor neurons to the 4 control outputs. Finally, the flight behavior of the agent that combines augmented random search with the worm neural network is compared with the performance of the augmented random search algorithm used on its own and with the performance of a recent reinforcement learning algorithm, Deep Deterministic Policy Gradient (DDPG). Our simulations show that commonly used sampling strategies exhibit high variability in performance. The experimental findings show that the algorithm proposed in this dissertation achieves higher accuracy than the methods it is compared with. By introducing the worm neural network into flight control, we show how the neuronal activities of trained neuronal policies can be observed. This simple neural network, with only 13 neurons and 28 synapses, can react relatively accurately and provides a better-performing alternative among reinforcement learning strategies. |
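
The input and output mapping described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names, the positive/negative split used to drive four sensory neurons from two variables, the affine readout, and the `placeholder_circuit` stand-in are all hypothetical. Only the sizes (24 inputs, 2 continuous variables, 4 sensory neurons, 4 motor neurons, 4 control outputs) come from the text.

```python
import math
import random

N_OBS = 24      # quadcopter input variables (from the text)
N_LATENT = 2    # continuous variables produced by the linear layer (from the text)
N_SENSORY = 4   # sensory neurons of the TW circuit (from the text)
N_MOTOR = 4     # motor neurons after the extension (from the text)


def linear_layer(obs, weights, bias):
    """Map the 24 observations to 2 continuous variables: z = W * obs + b."""
    return [sum(w * x for w, x in zip(row, obs)) + b
            for row, b in zip(weights, bias)]


def to_sensory(z):
    """ASSUMPTION: each variable drives two sensory neurons,
    one with its positive part and one with its negative part."""
    currents = []
    for value in z:
        currents.append(max(value, 0.0))
        currents.append(max(-value, 0.0))
    return currents


def placeholder_circuit(sensory):
    """Stand-in for the TW circuit dynamics (NOT the real circuit):
    it only keeps the 4-in / 4-out shape so the pipeline can run."""
    return [math.tanh(s) for s in sensory]


def motor_to_controls(potentials, gain=1.0, offset=0.0):
    """Hypothetical affine readout of motor potentials to rotor commands."""
    return [gain * v + offset for v in potentials]


def policy(obs, weights, bias, circuit=placeholder_circuit):
    """Full chain: 24 inputs -> 2 variables -> 4 sensory -> 4 motor -> 4 outputs."""
    z = linear_layer(obs, weights, bias)
    sensory = to_sensory(z)
    potentials = circuit(sensory)
    return motor_to_controls(potentials)


# Example call with random parameters, as a random-search method might sample them.
random.seed(0)
W = [[random.gauss(0.0, 0.1) for _ in range(N_OBS)] for _ in range(N_LATENT)]
b = [0.0] * N_LATENT
observation = [random.uniform(-1.0, 1.0) for _ in range(N_OBS)]
rotor_commands = policy(observation, W, b)
```

In a search-based setting, the weights `W` and `b` (together with the circuit's own parameters) would be the quantities perturbed and evaluated on episode reward; the placeholder would be replaced by an actual simulation of the neuron dynamics.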