Game of Thrones: Fully Distributed Learning for Multiplayer Bandits

We consider an N-player multi-armed bandit game in which each player chooses one of M arms in each of T turns. Each player has different expected rewards for the arms, and the instantaneous rewards are either independent and identically distributed (i.i.d.) or Markovian. When two or more players choose the same arm, they all …
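
As a rough illustration of the setup described above, the following Python sketch simulates a single turn of an N-player, M-arm game with player-specific expected rewards. The Bernoulli reward model, the function name `play_round`, and the example means are assumptions made for illustration and are not taken from the paper. The sketch only reports which players chose the same arm; it does not apply any collision rule.

```python
import random
from collections import Counter


def play_round(means, choices, rng):
    """Simulate one turn of the game (illustrative sketch, not the paper's code).

    means[n][m] : expected reward of arm m for player n (hypothetical values).
    choices[n]  : index of the arm chosen by player n this turn.
    rng         : a random.Random instance, passed in for reproducibility.

    Returns (rewards, collided):
      rewards[n]  is the i.i.d. sample drawn for player n's chosen arm,
                  assuming Bernoulli rewards (an assumption of this sketch).
      collided[n] is True when at least one other player chose the same arm.
    No collision rule is applied here; the caller decides how to use the flag.
    """
    counts = Counter(choices)
    rewards = [
        1.0 if rng.random() < means[n][arm] else 0.0
        for n, arm in enumerate(choices)
    ]
    collided = [counts[arm] > 1 for arm in choices]
    return rewards, collided


if __name__ == "__main__":
    rng = random.Random(0)
    # Two players, three arms; each row is one player's expected rewards.
    means = [[0.2, 0.9, 0.5],
             [0.8, 0.3, 0.6]]
    rewards, collided = play_round(means, choices=[1, 0], rng=rng)
    print(rewards, collided)
```

Returning the collision flags separately from the sampled rewards keeps the reward model and the collision handling independent, so either can be swapped out without touching the other.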