EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning

Posterior Sampling for Reinforcement Learning (PSRL) is a well-known algorithm that augments model-based reinforcement learning (MBRL) algorithms with Thompson sampling. PSRL maintains posterior distributions over the environment transition dynamics and the reward function, but these posteriors are intractable for tasks with high-dimensional state and action spaces. Recent works show that dropout, used …
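
To make the PSRL loop described above concrete, here is a minimal tabular sketch in Python. It is not EVaDE and not the authors' implementation. It only illustrates the idea of keeping posteriors over the transition dynamics and the reward function, sampling one model per episode, and acting greedily with respect to that sample. The Dirichlet prior on transitions, the Gaussian prior on mean rewards with unit noise variance, and the names `sample_mdp`, `greedy_policy`, and `psrl` are all assumptions made for illustration.

```python
import numpy as np


def sample_mdp(trans_counts, reward_sum, reward_n, rng):
    """Draw one MDP from the posterior (assumed Dirichlet / Gaussian forms)."""
    S, A, _ = trans_counts.shape
    P = np.empty((S, A, S))
    for s in range(S):
        for a in range(A):
            # Dirichlet posterior over next-state probabilities.
            P[s, a] = rng.dirichlet(trans_counts[s, a])
    # Gaussian posterior on the mean reward, assuming an N(0, 1) prior
    # and unit-variance observation noise.
    post_var = 1.0 / (1.0 + reward_n)
    post_mean = reward_sum * post_var
    R = rng.normal(post_mean, np.sqrt(post_var))
    return P, R


def greedy_policy(P, R, gamma=0.95, iters=200):
    """Value iteration on the sampled MDP; returns one action per state."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)  # (S, A, S) @ (S,) -> (S, A)
        V = Q.max(axis=1)
    return Q.argmax(axis=1)


def psrl(true_P, true_R, episodes=50, horizon=20, seed=0):
    """Tabular PSRL: sample a model, act greedily on it, update the posterior."""
    rng = np.random.default_rng(seed)
    S, A, _ = true_P.shape
    trans_counts = np.ones((S, A, S))  # Dirichlet(1, ..., 1) prior
    reward_sum = np.zeros((S, A))
    reward_n = np.zeros((S, A))
    returns = []
    for _ in range(episodes):
        # Thompson sampling step: one posterior sample per episode.
        P, R = sample_mdp(trans_counts, reward_sum, reward_n, rng)
        policy = greedy_policy(P, R)
        s, total = 0, 0.0
        for _ in range(horizon):
            a = policy[s]
            s_next = rng.choice(S, p=true_P[s, a])
            r = true_R[s, a] + rng.normal()  # noisy reward observation
            trans_counts[s, a, s_next] += 1
            reward_sum[s, a] += r
            reward_n[s, a] += 1
            s, total = s_next, total + r
        returns.append(total)
    return trans_counts, returns
```

The explicit per-state-action tables are what stop scaling: with high-dimensional states and actions, these exact posteriors can no longer be stored or sampled, which is the intractability the abstract points to.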