Online Markov Decision Processes With Kullback–Leibler Control Cost

This paper considers an online (real-time) control problem that involves an agent performing a discrete-time random walk over a finite state space. The agent's action at each time step is to specify the probability distribution for the next state given the current state. Following the setup of Todorov, the state-action …
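
To make the title's "Kullback–Leibler control cost" concrete, here is a minimal sketch of a one-step cost. It assumes the form commonly associated with Todorov's linearly solvable MDPs: a state cost plus the KL divergence between the agent's chosen next-state distribution and a fixed passive (uncontrolled) transition distribution. The truncated abstract above does not spell out the cost, so this form, along with the names `kl_divergence`, `step_cost`, `passive`, and `chosen`, is an illustrative assumption rather than the paper's definition.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) in nats for two discrete distributions on the same states."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # the term 0 * log(0 / q_i) is taken to be 0
        if q_i == 0.0:
            return math.inf  # p puts mass on a state that q rules out
        total += p_i * math.log(p_i / q_i)
    return total


def step_cost(state_cost, chosen, passive):
    """Assumed one-step cost: state cost plus KL(chosen || passive)."""
    return state_cost + kl_divergence(chosen, passive)


# Hypothetical three-state example: next-state distributions from the current state.
passive = [0.5, 0.25, 0.25]  # assumed uncontrolled random-walk dynamics
chosen = [0.25, 0.5, 0.25]   # the agent's action: its own next-state distribution
cost = step_cost(1.0, chosen, passive)
print(cost)
```

Under this assumed form, an agent that simply follows the passive dynamics pays no control cost, and the cost is infinite if it assigns probability to a transition the passive dynamics forbids.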