MODULI: Unlocking Preference Generalization via Diffusion Models for
Offline Multi-Objective Reinforcement Learning
Multi-objective Reinforcement Learning (MORL) seeks to develop policies that simultaneously optimize multiple conflicting objectives, but it requires extensive online interactions. Offline MORL provides a promising solution by training on pre-collected datasets to generalize to any preference upon deployment. However, real-world offline datasets are often conservatively and narrowly distributed, failing to …