Empirical Policy Optimization for <i>n</i>-Player Markov Games

In single-agent Markov decision processes, an agent can optimize its policy based on its interaction with the environment. In multiplayer Markov games (MGs), however, this interaction is nonstationary because of the other players' behavior, so the agent has no fixed optimization objective. The challenge becomes finding equilibrium policies for …