Randomized Ensembled Double Q-Learning: Learning Fast Without a Model

Using a high Update-To-Data (UTD) ratio, model-based methods have recently achieved much higher sample efficiency than previous model-free methods on continuous-action deep reinforcement learning (DRL) benchmarks. In this paper, we introduce a simple model-free algorithm, Randomized Ensembled Double Q-Learning (REDQ), and show that its performance is just as good as, if not better …