Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback

Type: Preprint

Publication Date: 2022-01-01

Citations: 1

DOI: https://doi.org/10.48550/arxiv.2205.13451

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback (2023). Haolin Liu, Chen-Yu Wei, Julian Zimmert
  • Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback (2024). Haolin Liu, Zakaria Mhammedi, Chen-Yu Wei, Julian Zimmert
  • Online Markov Decision Processes with Aggregate Bandit Feedback (2021). Alon Cohen, Haim Kaplan, Tomer Koren, Yishay Mansour
  • Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition (2020). Liyu Chen, Haipeng Luo, Chen-Yu Wei
  • Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition (2021). Liyu Chen, Haipeng Luo, Chen-Yu Wei
  • Learning Adversarial MDPs with Bandit Feedback and Unknown Transition (2019). Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu
  • Learning Adversarial Markov Decision Processes with Delayed Feedback (2020). Tal Lancewicki, Aviv Rosenberg, Yishay Mansour
  • Learning Adversarial Markov Decision Processes with Delayed Feedback (2022). Tal Lancewicki, Aviv Rosenberg, Yishay Mansour
  • Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition (2020). Tiancheng Jin, Haipeng Luo
  • The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition (2021). Tiancheng Jin, Longbo Huang, Haipeng Luo
  • Best-of-Both-Worlds Policy Optimization for CMDPs with Bandit Feedback (2024). Francesco Emanuele Stradi, Anna Lunghi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti
  • Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback (2022). Tiancheng Jin, Tal Lancewicki, Haipeng Luo, Yishay Mansour, Aviv Rosenberg
  • Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation (2020). Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Rahul Jain
  • Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback (2023). Tal Lancewicki, Aviv Rosenberg, Dmitry Sotnikov
  • Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition (2024). Longfei Li, Peng Zhao, Zhihua Zhou
  • Refined Analysis of FPL for Adversarial Markov Decision Processes (2020). Yuanhao Wang, Kefan Dong
  • Finding the Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case (2021). Liyu Chen, Haipeng Luo

Works That Cite This (0)


Works Cited by This (0)
