Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift

Type: Preprint

Publication Date: 2019-01-27

Citations: 7

Locations

  • arXiv (Cornell University)

Similar Works

Action Title Year Authors
+ Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift 2019 Carles Gelada
Marc G. Bellemare
+ Adaptive Trade-Offs in Off-Policy Learning 2019 Mark Rowland
Will Dabney
Rémi Munos
+ Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning 2023 Melrose Roderick
Gaurav Manek
Felix Berkenkamp
J. Zico Kolter
+ Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning 2023 Akash Velu
Skanda Vaidyanath
Dilip Arumugam
+ CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning 2024 Zeyuan Liu
Kai Yang
Xiu Li
+ Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach 2022 Baturay Sağlam
Dogan C. Cicek
Furkan B. Mutlu
Süleyman S. Kozat
+ Handling Cost and Constraints with Off-Policy Deep Reinforcement Learning 2023 J. Markowitz
Jesse L. Silverberg
Gary S. Collins
+ Return-based Scaling: Yet Another Normalisation Trick for Deep RL 2021 Tom Schaul
Georg Ostrovski
Iurii Kemaev
Diana Borsa
+ Divergence-Augmented Policy Optimization 2025 Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
+ Value-aware Importance Weighting for Off-policy Reinforcement Learning 2023 Kristopher De Asis
Eric Graves
Richard S. Sutton
+ Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning 2017 Shixiang Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard E. Turner
Bernhard Schölkopf
Sergey Levine
+ Generalized Proximal Policy Optimization with Sample Reuse 2021 James Queeney
Ioannis Ch. Paschalidis
Christos G. Cassandras
+ Conservative Q-Learning for Offline Reinforcement Learning 2020 Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
+ Combining policy gradient and Q-learning 2016 Brendan O’Donoghue
Rémi Munos
Koray Kavukcuoglu
Volodymyr Mnih
+ Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks 2023 Ryan Sullivan
Akarsh Kumar
Sheng‐Yi Huang
John P. Dickerson
Joseph Suárez
+ DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction 2020 Aviral Kumar
Abhishek Gupta
Sergey Levine