Abstract
: In this paper, we present a Markov Decision Process (MDP) based formulation for solving the stochastic root-finding problem with predefined stopping time. In the problem that we pose, we need to find only one root of a given finite valued function f (p,u,h). Here, p is a known Markov chain, u is the adjustable variable, and h is the unknown random variable with known distribution. Hence we cannot measure the true value of the function because h is unknown. We assume that we have a way of measuring whether or not f is within some bound ε from zero. We also present a formulation for the problem where f (p,u) is measurable but the adjustment of u is stochastic with known distribution. Another challenge in our problem is the introduction of finite stopping time. This means that the MDP policy has only a predefined finite number of actions available for adjusting u to find the root (or bring | f | < ε). We have included a price control example in the paper to demonstrate the behavior of the resulting MDP policy in response to the available time steps and the variable values of u and p. The results show reasonable trajectories for our simulated environment.

Ali Nasir, Huma Rehman Baig. (2020) Discrete Time Stochastic Root-finding with Forced Stopping Time, , PROC. OF THE PAS: A; 57, Issue 2.
  • Views 217
  • Downloads