Settling the Bias and Variance of Meta-Gradient Estimation for
Meta-Reinforcement Learning
In recent years, gradient-based Meta-RL (GMRL) methods have achieved remarkable success either in discovering effective online hyperparameters for a single task (Xu et al., 2018) or in learning a good initialisation for multi-task transfer learning (Finn et al., 2017). Despite these empirical successes, it is often neglected that computing meta gradients …