Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning

 
 
 In many goal-reaching reinforcement learning (RL) tasks, it has been empirically verified that rewarding the agent on subgoals improves convergence speed and practical performance. We attempt to provide a theoretical framework to quantify the computational benefits of rewarding the completion of subgoals, in terms of the number of synchronous value …