
Research On Autonomous Driving On Highway Roads Based On Reinforcement Learning And Vehicle Dynamics

Posted on: 2015-08-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C M Liu
Full Text: PDF
GTID: 1220330479479613
Subject: Control Science and Engineering
Abstract/Summary:
In the domain of highway autonomous driving, real-vehicle testing is dangerous, expensive, slow, and requires special conditions that are hard to reproduce, so simulation with a vehicle dynamics model of the highway environment is essential. Because autonomous driving control and decision-making is a sequential decision problem over a large-scale, continuous space with multiple optimization objectives, it is difficult to solve with traditional dynamic programming, and machine-learning approaches have therefore become an important research direction for autonomous driving. This work was supported by the National Natural Science Foundation projects "Kernel-Based Reinforcement Learning and Approximate Dynamic Programming Methods" and "Hierarchical Reinforcement Learning in Virtual Human Motion Planning", together with a key scientific research support project on highway intelligent vehicle driving. It studies reinforcement learning algorithms and theory for large-scale, continuous spaces with multiple optimization objectives, vehicle dynamics simulation modeling methods for the highway environment, and reinforcement learning methods for autonomous driving on highway roads.

The main innovations and research results are as follows:

1. For reinforcement learning problems with a continuous state space, four variants of the KLSPI (kernel-based least-squares policy iteration) algorithm were proposed: one based on relative sparsification error, one based on reordered sparsification, one based on rapid sparsification, and one based on active learning.
The relative-sparsification-error and reordered-sparsification variants aim to improve the performance of the original KLSPI algorithm by enhancing its value-function approximation ability; the rapid-sparsification variant computes the approximation error of each new feature vector in advance in the original feature space to reduce the time cost of KLSPI; and the active-learning variant accelerates convergence by embedding an active sample-collection strategy. Simulation results verify the validity and rationality of the proposed methods.

2. For reinforcement learning problems with a continuous action space, the CAPI algorithm was proposed, in which a fast policy search is applied to find optimal actions in continuous space after the value function has been approximated by TD learning. The method is computationally efficient and easy to implement. To improve the generalization ability and learning efficiency of CAPI, an adaptive basis-function selection method was proposed that effectively combines linear and kernel-based approximation of the action-value function. Simulation experiments verify the effectiveness of the proposed method.

3. To improve value-function approximation in reinforcement learning, a novel HAPI algorithm based on binary-tree space decomposition was proposed. By improving local policies, the combined global policy is better than the one obtained by applying API directly to the original MDP. Two learning-control problems show that HAPI obtains a better approximately optimal policy than API under the same sample and basis-function conditions.
4. For reinforcement learning problems with multiple objectives, a basic framework for solving MORL (multi-objective reinforcement learning) problems and a MORL algorithm combining the sequential and weighted criteria were proposed. The algorithm overcomes the drawbacks of purely sequential and purely weighted MORL algorithms. The Deep Sea Treasure simulation verifies that it can not only select solutions located in concave regions of the Pareto frontier but also optimize the weighted sum of all objectives.

5. A vehicle dynamics model of the highway environment was created by combining a unified tire model, a road model, a mass-spring-damper model, and a data-driven dynamic model. The model provides a convenient and effective simulation tool for studying intelligent vehicle driving technology in the highway environment. Real-vehicle tests with the HQ430 test vehicle show that the simulation data are consistent with the real-vehicle data, so the model reflects the dynamic characteristics of the actual vehicle; the longitudinal and lateral simulation accuracies are also reported.

6. For lane changing on highway roads, the conditions for avoiding a collision are studied and the driving characteristics of surrounding vehicles are analyzed. Using the established HQ430 dynamics simulation model, the forward and backward minimum safe spaces are obtained, as well as the total minimum safe space during the HQ430 lane-change process. These minimum safe spaces are an important foundation for autonomous safe driving, safety judgment, and warning analysis.

7. The theories and methods of reinforcement learning, together with the vehicle dynamics simulation model, were applied to intelligent driving on highway roads. Experiments on straight roads, curves, and multi-lane driving show that the reinforcement learning method can obtain good decision-making policies, providing an important basis for real intelligent driving decisions.
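The minimum-safe-space idea in item 6 can be illustrated with a standard kinematic braking bound on the longitudinal gap. The formula, reaction time, and standstill margin below are generic textbook assumptions, not the HQ430-specific values derived from the dissertation's dynamics model.

```python
def min_safe_gap(v_rear, v_front, a_rear, a_front,
                 t_react=1.0, d_margin=2.0):
    """Kinematic lower bound on the gap needed so a rear vehicle at v_rear
    can brake (deceleration a_rear) without hitting a front vehicle that
    brakes at a_front, allowing a reaction time and a standstill margin.
    Speeds in m/s, decelerations in m/s^2 (positive), distances in metres."""
    # Distance the rear vehicle covers: reaction travel plus braking distance
    stop_rear = v_rear * t_react + v_rear ** 2 / (2 * a_rear)
    # Braking distance of the front vehicle
    stop_front = v_front ** 2 / (2 * a_front)
    return max(stop_rear - stop_front, 0.0) + d_margin
```

With equal speeds and equal braking ability, the bound reduces to the reaction-time travel plus the margin; a faster rear vehicle or a harder-braking front vehicle enlarges the required gap, matching the intuition behind forward and backward minimum safe spaces.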
Keywords/Search Tags: autonomous vehicles, autonomous driving, reinforcement learning, Markov decision process, vehicle dynamics simulation, merging into traffic flow, minimum safe space