On Instrumental Variable Regression for Deep Offline Policy Evaluation

Type: Preprint

Publication Date: 2021-01-01

Citations: 2

DOI: https://doi.org/10.48550/arxiv.2105.10148

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

Action Title Year Authors
+ Conservative Q-Learning for Offline Reinforcement Learning 2020 Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
+ Q-Value Weighted Regression: Reinforcement Learning with Limited Data 2021 Piotr Kozakowski
Łukasz Kaiser
Henryk Michalewski
Afroz Mohiuddin
Katarzyna Kańska
+ IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies 2023 Philippe Hansen-Estruch
Ilya Kostrikov
Michael Jänner
Jakub Grudzien Kuba
Sergey Levine
+ Q-Value Weighted Regression: Reinforcement Learning with Limited Data 2022 Piotr Kozakowski
Łukasz Kaiser
Henryk Michalewski
Afroz Mohiuddin
Katarzyna Kańska
+ On Finite-Sample Analysis of Offline Reinforcement Learning with Deep ReLU Networks 2021 Thanh Nguyen-Tang
Sunil Gupta
Hung Tran-The
Svetha Venkatesh
+ Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning 2021 Luofeng Liao
Zuyue Fu
Zhuoran Yang
Yixin Wang
Mladen Kolar
Zhaoran Wang
+ Orthogonalized Estimation of Difference of $Q$-functions 2024 Angela Zhou
+ Revisiting Bellman Errors for Offline Model Selection 2023 Joshua P. Zitovsky
Daniel de Marchi
Rishabh Agarwal
Michael R. Kosorok
+ Offline Reinforcement Learning with Implicit Q-Learning 2021 Ilya Kostrikov
Ashvin Nair
Sergey Levine
+ Confidence-Conditioned Value Functions for Offline Reinforcement Learning 2022 Joey Hong
Aviral Kumar
Sergey Levine
+ Is Value Learning Really the Main Bottleneck in Offline RL? 2024 Seohong Park
Kevin Frans
Sergey Levine
Aviral Kumar
+ POPO: Pessimistic Offline Policy Optimization 2022 Qiang He
Xinwen Hou
Yu Liu
+ Learning Decision Policies with Instrumental Variables through Double Machine Learning 2024 Daqian Shao
Ashkan Soleymani
Francesco Quinzan
Marta Kwiatkowska
+ Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning 2017 Shixiang Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard E. Turner
Bernhard Schölkopf
Sergey Levine
+ Learning Bellman Complete Representations for Offline Policy Evaluation 2022 Jonathan Chang
Kaiwen Wang
Nathan Kallus
Wen Sun
+ An Instrumental Variable Approach to Confounded Off-Policy Evaluation 2022 Xu Yang
Zhu Jin
Chengchun Shi
Shikai Luo
Rui Song
+ On Multi-objective Policy Optimization as a Tool for Reinforcement Learning 2021 Abbas Abdolmaleki
Sandy H. Huang
Giulia Vezzani
Bobak Shahriari
Jost Tobias Springenberg
Shruti Mishra
Dhruva Tb
Arunkumar Byravan
Konstantinos Bousmalis
András György
+ Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL 2024 Yu Luo
Tianying Ji
Fuchun Sun
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
+ Imitation-Regularized Offline Learning 2019 Yifei Ma
Yu-Xiang Wang
Balakrishnan Narayanaswamy