Optimal adaptive policies for Markov decision processes


Burnetas, A.N. & Katehakis, M.N., 1997. Optimal adaptive policies for Markov decision processes. Mathematics of Operations Research, 22, pp.222-255.


In this paper we consider the problem of adaptive control for Markov decision processes. We give the explicit form of a class of adaptive policies that achieve the optimal rate of increase of the total expected finite-horizon reward, under the sufficient assumptions of finite state and action spaces and an irreducible transition law. A main feature of the proposed policies is that the choice of action, at each state and time period, is based on indices that are inflations of the right-hand side of the estimated average reward optimality equations.
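
For orientation, the average reward optimality equations mentioned in the abstract can be written as below. The notation is an assumption made here for illustration and does not come from the abstract: g is the optimal average reward, h the bias (relative value) function, r(x,a) the one-step reward, and p(y|x,a) the transition law. The second display is only a sketch of what an "inflation" of the estimated right-hand side could look like, using a Kullback-Leibler confidence region around the estimated transition vector, with N_t(x,a) the number of times action a has been taken in state x up to time t. The paper itself should be consulted for the exact definition of the indices.

```latex
% Average reward optimality equations (standard form; notation assumed)
g + h(x) \;=\; \max_{a \in A(x)} \Big[\, r(x,a) + \sum_{y} p(y \mid x,a)\, h(y) \Big],
\qquad x \in S.

% Illustrative inflated index: the estimated right-hand side is maximized
% over transition vectors q close to the estimate \hat{p}_t (hypothetical notation)
U_t(x,a) \;=\; \sup_{q} \Big\{\, r(x,a) + \sum_{y} q(y)\, \hat{h}_t(y) \;:\;
\mathrm{KL}\big(\hat{p}_t(\cdot \mid x,a) \,\big\|\, q\big) \le \frac{\log t}{N_t(x,a)} \Big\}.
```

Under this sketch, the confidence region shrinks as N_t(x,a) grows, so the index of a frequently sampled action approaches the estimated right-hand side, while rarely sampled actions keep a larger inflation and continue to be explored.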


