Related Paper

Game of Thrones: Fully Distributed Learning for Multiplayer Bandits

We consider an N-player multi-armed bandit game where each player chooses one out of M arms for T turns. Each player has different expected rewards for the arms, and the instantaneous rewards are independent and identically distributed or Markovian. When two or more players choose the same arm, they all …
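The setup described above can be sketched as a single round of play. This is a minimal illustration, not the paper's algorithm: the function name `play_round`, the Bernoulli reward model, and the `means` matrix are assumptions made for the example. Because the abstract is cut off before it states what happens on a collision, the sketch only flags which players collided and leaves the consequence to the caller.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def play_round(
    choices: Sequence[int],
    means: Sequence[Sequence[float]],
    rng: random.Random,
) -> List[Tuple[float, bool]]:
    """Simulate one turn of an N-player, M-armed bandit game.

    choices[n] is the arm picked by player n.
    means[n][m] is player n's expected reward for arm m (player-specific).
    Returns one (reward_draw, collided) pair per player.
    """
    # Count how many players picked each arm this turn.
    arm_counts = Counter(choices)
    results = []
    for player, arm in enumerate(choices):
        # Assumption: i.i.d. Bernoulli rewards with the player's own mean.
        draw = 1.0 if rng.random() < means[player][arm] else 0.0
        # A collision means two or more players chose the same arm.
        collided = arm_counts[arm] > 1
        results.append((draw, collided))
    return results


# Hypothetical 3-player, 3-arm instance with degenerate means (0 or 1)
# so that the draws are deterministic.
example_means = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0],
]
example_result = play_round([0, 0, 2], example_means, random.Random(0))
```

Here players 0 and 1 both pick arm 0 and are flagged as colliding, while player 2 plays arm 2 alone. Repeating `play_round` for T turns with a per-player arm-selection strategy would give the full game.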