
Heuristic Learning Model Based On Partially Observable Markov Decision Process

Posted on: 2022-04-30    Degree: Master    Type: Thesis
Country: China    Candidate: J Luo    Full Text: PDF
GTID: 2518306557470634    Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of knowledge and technology, how to master them and fulfill related tasks as quickly as possible has become a significant research topic. Considering that individuals have different capabilities and that some of them can be given higher priority, the key challenge is to improve the quality and efficiency of learning in a heuristic way while reducing unnecessary time consumption and cost. To address the limitations of traditional heuristic learning and to optimize the allocation of learning resources, we propose a heuristic learning model based on the partially observable Markov decision process (HL-POMDP), as an advanced alternative to uniform-sampling and greedy-strategy learning methods. The proposed HL-POMDP method uses an exponentially weighted moving average to compare the aggregated learning effects of users and dynamically allocate learning resources accordingly. Furthermore, resource usage is optimized through a stop condition that terminates learning once it is satisfied. As a result, HL-POMDP guarantees better learning quality for high-priority users while improving the overall learning efficiency of all users, including the high-priority ones. Finally, LSTM neural networks trained on a multi-digit decimal addition task are used to simulate users with different capabilities. Extensive experiments validate the effectiveness of HL-POMDP, which outperforms the uniform-sampling and greedy-strategy learning methods in terms of learning quality and accuracy.
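To make the resource-allocation idea concrete, the sketch below illustrates one way an exponentially weighted moving average of observed learning effects could drive priority-aware allocation with a stop condition. This is a minimal illustration under stated assumptions, not the thesis implementation: the class name EWMAAllocator, the decay factor alpha, the stop threshold, and the priority-times-gap scoring rule are all hypothetical choices introduced here.

```python
import numpy as np

# Minimal sketch (assumed, not the thesis implementation): an EWMA of
# per-user learning effects decides where the next learning resource goes.
class EWMAAllocator:
    def __init__(self, n_users, priority, alpha=0.3, stop_threshold=0.95):
        self.ewma = np.zeros(n_users)          # aggregated learning effect per user
        self.priority = np.asarray(priority, dtype=float)  # higher value = higher priority
        self.alpha = alpha                     # EWMA decay factor (illustrative value)
        self.stop_threshold = stop_threshold   # stop condition (illustrative value)
        self.active = np.ones(n_users, dtype=bool)

    def update(self, user, observed_effect):
        """Fold a newly observed (partial) learning effect into the user's EWMA."""
        self.ewma[user] = self.alpha * observed_effect + (1 - self.alpha) * self.ewma[user]
        # Stop condition: release resources once the aggregated effect is high enough.
        if self.ewma[user] >= self.stop_threshold:
            self.active[user] = False

    def next_user(self):
        """Pick the next user to train: among still-active users, favour high
        priority weighted by how far the user is from the stop threshold."""
        if not self.active.any():
            return None
        gap = np.maximum(self.stop_threshold - self.ewma, 0.0)
        score = np.where(self.active, self.priority * gap, -np.inf)
        return int(np.argmax(score))

# Example: three simulated users, the first with the highest priority.
alloc = EWMAAllocator(n_users=3, priority=[3, 2, 1])
for step in range(100):
    u = alloc.next_user()
    if u is None:
        break
    alloc.update(u, observed_effect=np.random.rand())  # stand-in for a measured effect
```

The priority-times-gap score is only one plausible rule; any scoring that prefers high-priority users who have not yet met the stop condition would fit the same allocation scheme.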
Keywords/Search Tags: reinforcement learning, partially observable Markov decision process, priority, guided learning