Constrained Latent Action Policies for Model-Based Offline Reinforcement
Learning
Constrained Latent Action Policies for Model-Based Offline Reinforcement
Learning
In offline reinforcement learning, a policy is learned using a static dataset in the absence of costly feedback from the environment. In contrast to the online setting, only using static datasets poses additional challenges, such as policies generating out-of-distribution samples. Model-based offline reinforcement learning methods try to overcome these by …