Observational Overfitting in Reinforcement Learning

A major component of overfitting in model-free reinforcement learning (RL) arises when the agent mistakenly correlates reward with spurious features of the observations generated by the Markov Decision Process (MDP). We provide a general framework for analyzing this scenario, which we use to design multiple synthetic …