Distributional Reinforcement Learning With Quantile Regression
In reinforcement learning (RL), an agent interacts with the environment by taking actions and observing the next state and reward. When sampled probabilistically, these state transitions, rewards, and actions can all induce randomness in the observed long-term return. Traditionally, reinforcement learning algorithms average over this randomness to estimate the value …