Related Paper

Empirical Policy Optimization for <i>n</i>-Player Markov Games

In single-agent Markov decision processes, an agent can optimize its policy through its interaction with the environment. In multiplayer Markov games (MGs), however, that interaction is nonstationary because of the other players' behavior, so the agent has no fixed optimization objective. The challenge becomes finding equilibrium policies for …
