Two Algorithms For Markov Policies Under The Probability Threshold Rule

Posted on:2003-06-06

Degree:Master

Type:Thesis

Country:China

Candidate:F Jiang


GTID:2190360122466672

Subject:Probability theory and mathematical statistics

Abstract/Summary:


A Markov decision process (MDP), also known as sequential stochastic optimization, stochastic optimal control, a controlled Markov process, or stochastic dynamic programming, is the theory of sequential decision-making under uncertainty. Its main objects of study are stochastic systems whose transition structure can be controlled. Based on the observed state of the system, a decision maker (a person or a computer) selects an action that controls or influences the system's transitions, and each choice of actions determines the resulting stochastic process and the corresponding value of the objective function. The purpose of an MDP is to select a good control strategy. This thesis discusses algorithms for computing optimal solutions under a new criterion, called the probability threshold rule. To study optimization under this criterion, we use two Markov-policy approaches. The first is based on the accumulated past reward: the total reward accumulated up to stage n is a random variable, and the set of its possible values is adjoined to the state space X to form an enlarged state space, on which an optimal Markov policy is sought. The second uses the future threshold value: a time-varying future threshold is introduced as a new state variable, the threshold probability is maximized, and an optimal Markov policy is extracted. To illustrate, the thesis uses and analyzes a numerical example from Bellman and Zadeh. Finally, so that the value functions and the optimal expected values can be computed easily in every case, we write out all the cases arising in the multi-stage stochastic decision process, construct the optimal solution, and introduce a multi-stage stochastic decision tree-table.
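
As a rough illustration of the second approach, the sketch below treats the remaining (future) threshold as an extra state variable and maximizes the probability that the total reward reaches the threshold by backward recursion. It is a minimal sketch under stated assumptions, not the thesis's algorithm: the two-state model, the action names "safe" and "risky", the horizon of 2, and the function names threshold_prob and best_action are all hypothetical and are not taken from the thesis or from the Bellman and Zadeh example.

```python
from functools import lru_cache

# Hypothetical two-state, two-action model (illustrative only, not from the thesis).
# P[state][action] is a list of (next_state, probability, reward) triples.
P = {
    0: {"safe": [(0, 1.0, 1)],
        "risky": [(1, 0.5, 3), (0, 0.5, 0)]},
    1: {"safe": [(1, 1.0, 1)],
        "risky": [(0, 0.5, 3), (1, 0.5, 0)]},
}
HORIZON = 2  # number of decision stages (assumed)


@lru_cache(maxsize=None)
def threshold_prob(stage, state, remaining):
    """Maximal probability that the reward collected from `stage` onward
    is at least `remaining`, the threshold still to be reached."""
    if stage == HORIZON:
        # No stages left: success only if the threshold is already met.
        return 1.0 if remaining <= 0 else 0.0
    return max(
        sum(p * threshold_prob(stage + 1, nxt, remaining - r)
            for nxt, p, r in outcomes)
        for outcomes in P[state].values()
    )


def best_action(stage, state, remaining):
    """An action attaining the maximum at the augmented state (state, remaining)."""
    def success_prob(action):
        return sum(p * threshold_prob(stage + 1, nxt, remaining - r)
                   for nxt, p, r in P[state][action])
    return max(P[state], key=success_prob)


if __name__ == "__main__":
    for threshold in (2, 3):
        print(threshold,
              threshold_prob(0, 0, threshold),
              best_action(0, 0, threshold))
```

In this toy model the preferred first action at state 0 changes with the threshold, which is why the policy must be Markov on the enlarged state (system state together with the remaining threshold) and not on the system state alone. The first approach would track the accumulated past reward instead; for a fixed overall threshold the two augmented states carry the same information.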

Keywords/Search Tags:

Markov decision processes, MDP, Markov optimum policy, Probability threshold rule, Multi-stage stochastic decision tree-table


Related items

1	Continuous-time Markov Decision Processes In Random Environments
2	State Estimation And Policy Learning In Partially Observable Markov Decision Processes
3	Variance Optimization For Continuous-time Markov Decision Processes
4	Multichain Markov Decision Optimization: Theoretical Studies And Applications On The Joint Replacement Problem
5	Markov Systems: Gradient Approximation Approach And Applications To Communications
6	Acceleration of Iterative Methods for Markov Decision Processes
7	Research On Multi-level Hierarchical And Interactive MDP
8	Some Limit Theorems Of Nonhomogeneous Markov Chains And Tree-Indexed Markov Chains
9	Stochastic System Optimization Based On Sensitivity Analysis And Its Application In Financial Engineering
10	Approximation algorithms in stochastic control