
Research On The Training Method Of Ethical Agents With The Ability To Solve Emergencies

Posted on: 2024-02-29
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhao
GTID: 2555307157983479
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
Intelligent agents are increasingly widespread in daily life, and their ethical behavior is receiving correspondingly more attention. While completing its main task, an agent may encounter situations unrelated to that task, such as a person who has fallen or garbage on the ground. An ethical agent therefore needs not only to complete its main task correctly but also to handle such situations in a way that is consistent with human values. This requires the agent to have a degree of moral judgment and behavioral norms and to respond flexibly in different situations. Such "moral intelligence" supports harmonious coexistence between agents and humans and helps improve the application value and social acceptance of intelligent agents.

Reinforcement learning learns behavioral decisions through continuous interaction with the environment, without requiring large amounts of manual annotation. Inverse reinforcement learning performs well at learning from human expert trajectories and removes the need to set reward functions by hand. Building on these advantages, this thesis proposes a novel multi-objective training method for ethical agents and verifies it experimentally. The main work is as follows:

(1) A multi-objective training framework for ethical agents is proposed. It uses a hybrid algorithm to train agents that can handle emergencies. The framework enables the agent to accomplish secondary goals, namely following human ethical values, in addition to its primary objective. It is also flexible and scalable, so new objectives can easily be added to extend the agent's functionality. Reinforcement learning with reward shaping lets the agent learn the optimal path for the main task as well as solutions to emergencies that occur at fixed positions, while inverse reinforcement learning teaches the agent to deal with emergencies at non-fixed positions. Reward shaping also gives the agent fast incremental learning: it can adapt to changes in the environment and rapidly correct its behavior using previously learned knowledge.

(2) The PS_LinUCB (Policy Selection LinUCB) algorithm is proposed to address policy selection. The policy selection problem is cast as a contextual multi-armed bandit problem. PS_LinUCB chooses an arm based on the current environmental context, where each arm represents a trained policy rather than an atomic action; the agent then takes the best action under the selected policy. In this way, the agent learns to select actions based on contextual environmental features.

(3) The proposed training method is validated in a scenario borrowed from smart healthcare. First, the basic action space for training the agent was constructed from the designed environment space and episode graph. Then, to train the agent to deal with emergencies, additional nodes were added to the episode graph and the basic action space was expanded. Finally, the environmental and behavioral settings were mapped to a simulated grid environment for training. Experimental results show that an agent trained with the proposed method correctly selects behaviors in accordance with human ethical rules, demonstrating that the method is feasible and effective.
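
The policy selection idea in (2) can be illustrated with a minimal sketch. The abstract does not give the details of PS_LinUCB, so the code below is not the thesis's algorithm: it is standard disjoint LinUCB with each arm standing for a whole trained policy, and the class name `PolicySelector`, the two-feature context, the placeholder policies, the reward rule, and `alpha=1.0` are all assumptions made for illustration.

```python
import math


class PolicySelector:
    """Disjoint LinUCB in which each arm is a trained policy (illustrative)."""

    def __init__(self, policies, dim, alpha=1.0):
        self.policies = policies  # callables: state -> action
        self.alpha = alpha        # exploration strength (assumed value)
        # Per arm: inverse design matrix A^-1 (identity at start) and vector b.
        self.a_inv = [[[float(i == j) for j in range(dim)] for i in range(dim)]
                      for _ in policies]
        self.b = [[0.0] * dim for _ in policies]

    def _score(self, arm, x):
        a_inv, b = self.a_inv[arm], self.b[arm]
        a_inv_x = [sum(r * xj for r, xj in zip(row, x)) for row in a_inv]
        theta = [sum(r * bj for r, bj in zip(row, b)) for row in a_inv]
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = sum(xi * vi for xi, vi in zip(x, a_inv_x))
        return mean + self.alpha * math.sqrt(max(0.0, width))

    def select(self, x):
        """Return the index of the policy with the highest upper bound."""
        scores = [self._score(arm, x) for arm in range(len(self.policies))]
        return max(range(len(scores)), key=scores.__getitem__)

    def act(self, x, state):
        """Pick a policy from context x, then let that policy pick the action."""
        arm = self.select(x)
        return arm, self.policies[arm](state)

    def update(self, arm, x, reward):
        a_inv = self.a_inv[arm]
        v = [sum(r * xj for r, xj in zip(row, x)) for row in a_inv]  # A^-1 x
        denom = 1.0 + sum(xi * vi for xi, vi in zip(x, v))
        # Sherman-Morrison rank-one update of A^-1 after A += x x^T.
        for i in range(len(x)):
            for j in range(len(x)):
                a_inv[i][j] -= v[i] * v[j] / denom
        self.b[arm] = [bi + reward * xi for bi, xi in zip(self.b[arm], x)]


# Hypothetical stand-ins for two already-trained policies.
def main_policy(state):
    return "continue_main_task"


def emergency_policy(state):
    return "handle_emergency"


# Assumed context encoding: [no emergency observed, emergency observed].
NORMAL, EMERGENCY = [1.0, 0.0], [0.0, 1.0]

selector = PolicySelector([main_policy, emergency_policy], dim=2)
for step in range(40):
    x = EMERGENCY if step % 2 else NORMAL
    arm = selector.select(x)
    # Assumed reward: 1 when the chosen policy matches the situation.
    reward = 1.0 if (arm == 1) == (x is EMERGENCY) else 0.0
    selector.update(arm, x, reward)
```

After this short training loop, `selector.act(EMERGENCY, state)` hands control to the emergency policy and `selector.act(NORMAL, state)` to the main-task policy. Keeping the bandit at the policy level means the selector only has to learn which policy fits which context, while each policy remains responsible for its own low-level actions.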
Keywords/Search Tags: ethical agent, ethical design, reinforcement learning, inverse reinforcement learning, contextual multi-armed bandit