The surprising efficiency of temporal difference learning for rare event
prediction
We quantify the efficiency of temporal difference (TD) learning over the direct, or Monte Carlo (MC), estimator for policy evaluation in reinforcement learning, with an emphasis on estimation of quantities related to rare events. Policy evaluation is complicated in the rare event setting by the long timescale of the event …