By transforming the semi-Markov process (SMP) into a discrete-time Markov chain (DTMC), the potential of the DTMC is used to obtain the sensitivity formula and the optimality equation of the SMP.

Based on the performance potential theorem and the Bellman optimality equation, it is easy to establish an optimality equation, which we call the performance potential-based Bellman optimality equation, for both average-cost and discounted-cost performance criteria.
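The discounted-cost Bellman optimality equation referred to above, V(i) = min_a [c(i,a) + α Σ_j P_a(i,j) V(j)], can be sketched numerically by iterating the Bellman operator to its fixed point. The tiny two-state, two-action model below is invented purely for illustration; it is not taken from any of the cited papers.

```python
import numpy as np

# Value-iteration sketch for the discounted-cost Bellman optimality equation
#   V(i) = min_a [ c(i,a) + alpha * sum_j P_a(i,j) V(j) ].
# All numbers are made up for the example.
P = np.array([[[0.8, 0.2], [0.3, 0.7]],   # P[a, i, j]: transition matrix per action
              [[0.5, 0.5], [0.1, 0.9]]])
c = np.array([[1.0, 2.0],                 # c[i, a]: one-step cost of action a in state i
              [0.5, 3.0]])
alpha = 0.9                               # discount factor

V = np.zeros(2)
for _ in range(1000):                     # iterate the Bellman operator
    Q = c + alpha * np.einsum('aij,j->ia', P, V)   # Q[i, a]
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new
```

Since the operator is an α-contraction in the sup norm, the iterates converge geometrically to the unique solution of the optimality equation.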

Moreover, the relationship between the sensitivity formulas, as well as that between the optimality equations, under the discounted-cost and average-cost criteria is established by letting the discount factor vanish.

Based on the optimal Poisson equation and the optimality theorem with potentials, many algorithms, such as policy iteration and value iteration, can be obtained.
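A minimal sketch of such a potential-based algorithm, assuming a finite ergodic chain (this is an illustrative construction, not any cited paper's exact method): for a fixed policy π, the potential g solves the Poisson equation g + η·1 = c_π + P_π g (normalized by g[0] = 0), and the improvement step picks argmin_a [c(i,a) + Σ_j P_a(i,j) g(j)]. The two-state, two-action chain below is invented for the example.

```python
import numpy as np

# Average-cost policy iteration built on performance potentials.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # P[a, i, j]: transition matrix per action
              [[0.6, 0.4], [0.3, 0.7]]])
c = np.array([[2.0, 1.0],                 # c[i, a]: one-step costs
              [4.0, 3.0]])
n = 2

def potentials(pi):
    """Solve the Poisson equation for policy pi; return (g, eta)."""
    Ppi = P[pi, np.arange(n)]             # row i taken from action pi[i]
    cpi = c[np.arange(n), pi]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = np.eye(n) - Ppi           # (I - P_pi) g ...
    A[:n, n] = 1.0                        # ... + eta * 1 = c_pi
    A[n, 0] = 1.0                         # normalization g[0] = 0
    sol = np.linalg.solve(A, np.append(cpi, 0.0))
    return sol[:n], sol[n]

pi = np.zeros(n, dtype=int)
for _ in range(100):                      # policy iteration loop
    g, eta = potentials(pi)
    Q = c + np.einsum('aij,j->ia', P, g)  # Q[i, a]
    pi_new = Q.argmin(axis=1)
    keep = np.isclose(Q[np.arange(n), pi], Q.min(axis=1))
    pi_new[keep] = pi[keep]               # keep the current action on ties
    if np.array_equal(pi_new, pi):
        break
    pi = pi_new
```

Retaining the current action on ties is the standard device that guarantees finite termination of average-cost policy iteration on unichain models.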

In this stochastic stopping model, we prove that there exists an optimal deterministic and stationary policy and the optimality equation has a unique solution.

It is shown that both value functions satisfy the optimality equation, and upper and lower bounds, as well as conditions for equality of these functions, are presented.

Under a Lyapunov function condition, we show that stationary policies obtained from the average reward optimality equation are not only average reward optimal, but indeed sample path average reward optimal, for almost all sample paths.

For the case of switching arms, only one of which creates rewards, we solve the average optimality equation explicitly and prove that a myopic policy is average optimal.

We establish also a lexicographical policy improvement algorithm leading to Blackwell optimal policies and the relation between such policies and the Blackwell optimality equation.

An analogy between the optimality equations and the governing equations for a set of certain static beams permits obtaining numerical solutions to the optimal control problem with the help of standard 'structural' FEM software.

From the optimality equations which are provided in this paper, we translate the average variance criterion into a new average expected cost criterion.

Controlled Markov chains with risk-sensitive criteria: Average cost, optimality equations, and optimal solutions

The approach uses an analogy between the optimality equations for control in the time domain and the governing equations for a set of static beams in the spatial domain.

Moreover, necessary and sufficient conditions are given so that the optimality equations have a bounded solution with an additional property.

This paper deals with the performance optimization problem of a class of controlled closed queueing network systems (CQNS). We introduce two fundamental concepts, the discounted-cost α-performance potentials and the average-cost performance potentials, and consider a fundamental relation between the two potentials. Under a general assumption, we directly establish the optimality equation for the infinite-horizon average-cost model and prove the existence of an optimal solution in a compact action set by using properties of the performance potentials, suggest a policy optimization algorithm, and give a numerical example to illustrate the application of the proposed algorithm.

This paper deals with the average-cost optimization problem for a class of discrete-time Markov control processes. Under quite general assumptions, the optimality equation is directly established and the existence of an optimal solution is proved for the infinite-horizon average-cost model with a compact action set, by using basic properties of the Markov performance potentials. An iterative algorithm for computing an optimal stationary control strategy is suggested, and the convergence of this algorithm is discussed. Finally, a numerical example is analyzed to illustrate the application of the proposed algorithm.

Optimization algorithms are studied for a class of continuous-time Markov control processes (CTMCPs) with infinite-horizon average-cost criteria and a compact action set. By using the formula of performance potentials and an average-cost optimality equation for CTMCPs, a policy iteration algorithm and a value iteration algorithm are derived, which can lead to an optimal or suboptimal stationary policy in a finite number of iterations. The convergence of these algorithms is established without assuming that the corresponding iteration operator is an sp-contraction. A numerical example of queueing networks shows the advantages of the proposed value iteration method.
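One standard route from a continuous-time chain to such discrete-time iterations is uniformization: with generator Q_a per action a and a rate Λ dominating all exit rates, P_a = I + Q_a/Λ is a stochastic matrix, and average-cost relative value iteration applies to the uniformized chain. The sketch below uses this textbook reduction with invented two-state rates and costs; it is not the paper's specific algorithm.

```python
import numpy as np

# Uniformization + relative value iteration for an average-cost CTMCP sketch.
Qgen = np.array([[[-3.0, 3.0], [2.0, -2.0]],    # generator under action 0
                 [[-1.0, 1.0], [5.0, -5.0]]])   # generator under action 1
c = np.array([[1.0, 4.0],                       # c[i, a]: cost rate in state i
              [2.0, 0.5]])
Lam = 6.0                                       # uniformization rate >= max exit rate
P = np.eye(2) + Qgen / Lam                      # one stochastic matrix per action

h = np.zeros(2)                                 # relative value function
for _ in range(5000):
    T = (c / Lam + np.einsum('aij,j->ia', P, h)).min(axis=1)
    h_new = T - T[0]                            # pin h[0] = 0 to keep iterates bounded
    if np.max(np.abs(h_new - h)) < 1e-12:
        break
    h = h_new
eta = Lam * T[0]                                # average cost per unit time at the fixed point
```

At convergence, T(h) = h + (η/Λ)·1, so η/Λ is the average cost per uniformized step and η the average cost per unit of continuous time.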