Ask a Question

Prefer a chat interface with context about you and your work?

On the Linear Convergence of Natural Policy Gradient Algorithm

On the Linear Convergence of Natural Policy Gradient Algorithm

Markov Decision Processes are classically solved using Value Iteration and Policy Iteration algorithms. Recent interest in Reinforcement Learning has motivated the study of methods inspired by optimization, such as gradient ascent. Among these, a popular algorithm is the Natural Policy Gradient, which is a mirror descent variant for MDPs. This …