Research On Generative Model Based World Model Establishment And Intelligent Decision-making Algorithm

Posted on:2021-02-03

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Zhang

Full Text:PDF

GTID:2428330614450176

Subject:Mechanical and electrical engineering

Abstract/Summary:

Intelligent decision-making has always been one of the key technologies of robotics. Robot applications are now shifting from structured factory environments and tasks to complex everyday environments such as households, office buildings, roads, and fields, and to more challenging tasks. This shift places higher demands on robot intelligence. Existing intelligent algorithms rely heavily on researchers modeling the environment and the robot body in advance, and re-modeling is required whenever a new problem arises. As a result, these algorithms do not generalize and incur high labor costs, so they cannot meet the large demand for intelligence in future industry. This thesis aims to explore a general description of, and general solutions to, intelligent decision-making problems.

First, based on the partially observable Markov decision process (POMDP) commonly used in reinforcement learning, a general mathematical description of intelligent decision-making problems is established; analysis of this description shows that an intelligent decision-making algorithm is equivalent to the extraction and utilization of information. Information theory is used to analyze how information is distributed in the environment, which yields a general framework for solving intelligent decision-making problems based on the concept of the world model. According to the type of information extracted, the world model is divided into two processes: perceptual abstraction and state prediction. Five typical visual control tasks on the MuJoCo simulation platform are selected as the verification platform for this thesis.

Secondly, the relationship between the perceptual abstraction process and the generative model is derived. The internal constraints of the POMDP are used to transform the perceptual abstraction process into a generation problem, and the process is implemented with a variational autoencoder. The information constraints in the optimization objective are analyzed from a theoretical viewpoint, and a flow model is used to replace the prior distribution of the variational autoencoder to achieve better static information extraction. The methods' ability to extract static information about the environment is verified on two typical tasks.

Thirdly, the relationship between the state prediction process and the generative model is derived. The internal constraints of the POMDP are used to transform the state prediction process into a sequence generation problem, and the process is implemented with a recurrent neural network. Three models, RAR, RVAR, and RVAE, are proposed according to the different node forms of the belief variables and the trajectory optimization methods. The methods' ability to extract and predict dynamic information about the environment is verified on two typical tasks.

Finally, mimicking human intelligent decision-making, an imaginative learning method based on the world model and the actor-critic framework is proposed, and the learned world model is used to generate human-like, interpretable intelligent decisions. The framework is systematically verified on all five typical simulation tasks. The results show that the algorithm is effective and that, compared with other reinforcement learning methods, it greatly improves sample efficiency. Training the agent in offline mode is also explored, and the results indicate that the feedback process of data collection in the framework is crucial to the agent's performance.
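
As a rough illustration of the variational autoencoder objective that the abstract refers to, the sketch below computes a per-sample negative evidence lower bound (ELBO) in plain Python. It is not taken from the thesis: it assumes a diagonal Gaussian posterior, a standard normal prior (the thesis replaces this prior with a flow model, which is not shown here), and a unit-variance Gaussian decoder. The function names and the weight `beta` are hypothetical.

```python
import math


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    Assumption: standard normal prior; a flow-based prior would need a
    different (generally sampled) estimate of this term.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def negative_elbo(x, x_recon, mu, log_var, beta=1.0):
    """Negative ELBO for one sample, up to an additive constant.

    recon: squared-error reconstruction term (unit-variance Gaussian decoder).
    beta:  hypothetical weight on the KL term; beta = 1.0 is the standard ELBO.
    """
    recon = 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * gaussian_kl(mu, log_var)


# Usage: a posterior equal to the prior and a perfect reconstruction cost nothing.
loss = negative_elbo([1.0, 2.0], [1.0, 2.0], mu=[0.0], log_var=[0.0])
```

The KL term is the part of the objective that limits how much information the latent code carries about the observation, which is one common reading of an "information constraint" in this kind of objective.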

Keywords/Search Tags:

World Model, Generative Model, Reinforcement Learning, Deep Learning, Representation Learning, Intelligent Decision-Making

Related items

1	Research And Application Of Decision-making Model For Video Games Based On Deep Reinforcement Learning
2	Reinforcement Learning-based Intelligent Decision-making Methods For Unmanned Vehicles
3	A Non-complete Information Intelligent Game Decision-making Method Based On A3C Model
4	Research On Multi-Agent Deep Reinforcement Learning Methods And Applications
5	Research On Key Technologies Of Wireless Communication Physical Layer Based On Deep Learning
6	Research On Network Attack Detection Technology Based On Deep Generative Model
7	Research On Agent Decision-making And Control Based On Deep Reinforcement Learning
8	Research On Reinforcement Learning In Agent Model Of Decision Making Simulation System
9	Multi-strategy Collaborative Decision-making System Based On Deep Reinforcement Learning
10	Research On Command Decision Method From RTS Perspective On Deep Learning