Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift
Carles Gelada, Marc G. Bellemare
Type: Preprint
Publication Date: 2019-01-27
Citations: 7
Locations: arXiv (Cornell University)
Similar Works
Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift (2019). Carles Gelada, Marc G. Bellemare
Adaptive Trade-Offs in Off-Policy Learning (2019). Mark Rowland, Will Dabney, Rémi Munos
Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning (2023). Melrose Roderick, Gaurav Manek, Felix Berkenkamp, J. Zico Kolter
Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning (2023). Akash Velu, Skanda Vaidyanath, Dilip Arumugam
CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning (2024). Zeyuan Liu, Kai Yang, Xiu Li
Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach (2022). Baturay Sağlam, Dogan C. Cicek, Furkan B. Mutlu, Süleyman S. Kozat
Handling Cost and Constraints with Off-Policy Deep Reinforcement Learning (2023). J. Markowitz, Jesse L. Silverberg, Gary S. Collins
Return-based Scaling: Yet Another Normalisation Trick for Deep RL (2021). Tom Schaul, Georg Ostrovski, Iurii Kemaev, Diana Borsa
Divergence-Augmented Policy Optimization (2025). Qing Wang, Yingru Li, Jiechao Xiong, Tong Zhang
Value-aware Importance Weighting for Off-policy Reinforcement Learning (2023). Kristopher De Asis, Eric Graves, Richard S. Sutton
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning (2017). Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine
Generalized Proximal Policy Optimization with Sample Reuse (2021). James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras
Conservative Q-Learning for Offline Reinforcement Learning (2020). Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine
Combining policy gradient and Q-learning (2016). Brendan O'Donoghue, Rémi Munos, Koray Kavukcuoglu, Volodymyr Mnih
Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks (2023). Ryan Sullivan, Akarsh Kumar, Sheng-Yi Huang, John P. Dickerson, Joseph Suárez
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction (2020). Aviral Kumar, Abhishek Gupta, Sergey Levine
Cited by (6)
Overfitting and Optimization in Offline Policy Learning (2020). David Brandfonbrener, William F. Whitney, Rajesh Ranganath, Joan Bruna
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections (2019). Ofir Nachum, Yinlam Chow, Bo Dai, Lihong Li
Sublinear Optimal Policy Value Estimation in Contextual Bandits (2019). Weihao Kong, Gregory Valiant, Emma Brunskill
Human-centric Dialog Training via Offline Reinforcement Learning (2020). Natasha Jaques, Judy Hanwen Shen, Asma Ghandeharioun, Craig Ferguson, Àgata Lapedriza, N. S. Jones, Shixiang Gu, Rosalind W. Picard
Off-Policy Policy Gradient Algorithms by Constraining the State Distribution Shift (2019). Riashat Islam, Komal K. Teru, Deepak Kumar Sharma
Forward and Backward Bellman Equations Improve the Efficiency of the EM Algorithm for DEC-POMDP (2021). Takehiro Tottori, Tetsuya J. Kobayashi
Citing (4)
Deep reinforcement learning with double Q-Learning (2016). Hado van Hasselt, Arthur Guez, David Silver
Reinforcement Learning with Unsupervised Auxiliary Tasks (2016). Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu
Prioritized Experience Replay (2015). Tom Schaul, John Quan, Ioannis Antonoglou, David Silver
The Arcade Learning Environment: An Evaluation Platform for General Agents (2013). Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling