Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction
Value function is the central notion of Reinforcement Learning (RL). Value estimation, especially with function approximation, can be challenging since it involves the stochasticity of environmental dynamics and reward signals that can be sparse and delayed in some cases. A typical model-free RL algorithm usually estimates the values of a …