A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits

This paper unifies the design and analysis of risk-averse Thompson sampling algorithms for the multi-armed bandit problem under a class of risk functionals ρ that are continuous and dominant. We prove generalised concentration bounds for these continuous and dominant risk functionals and show that a wide class of popular …
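
To make the setting concrete, here is a minimal sketch of what a risk-averse Thompson sampling loop can look like. It is not taken from the paper. It assumes arms with rewards on a known finite support, a Dirichlet posterior per arm, and conditional value-at-risk (CVaR) as a stand-in for the risk functional ρ; whether CVaR meets the paper's continuity and dominance conditions is not stated in the text above. The names `cvar` and `risk_averse_ts` are hypothetical.

```python
import numpy as np


def cvar(support, probs, alpha):
    """CVaR at level alpha of a discrete reward distribution.

    Defined here as the mean of the worst alpha-fraction of rewards
    (lower tail), so larger values are better for a risk-averse learner.
    """
    order = np.argsort(support)
    x, p = support[order], probs[order]
    # Probability mass each atom contributes to the lower alpha tail.
    cum_before = np.cumsum(p) - p
    tail_mass = np.clip(alpha - cum_before, 0.0, p)
    return float(np.dot(tail_mass, x) / alpha)


def risk_averse_ts(true_probs, support, alpha, horizon, seed=0):
    """Thompson sampling that ranks arms by CVaR instead of the mean.

    Assumption: each arm is a multinomial over `support`, with a
    Dirichlet(1, ..., 1) prior on its probability vector.
    """
    rng = np.random.default_rng(seed)
    n_arms, n_atoms = true_probs.shape
    counts = np.ones((n_arms, n_atoms))  # Dirichlet posterior parameters
    pulls = np.zeros(n_arms, dtype=int)
    for _ in range(horizon):
        # Sample a plausible distribution per arm, then score it with rho.
        scores = [cvar(support, rng.dirichlet(counts[k]), alpha)
                  for k in range(n_arms)]
        arm = int(np.argmax(scores))
        # Observe a reward atom from the true (unknown) arm distribution.
        atom = rng.choice(n_atoms, p=true_probs[arm])
        counts[arm, atom] += 1
        pulls[arm] += 1
    return pulls


# Two arms with the same mean (0.5) but different tail risk.
support = np.array([0.0, 0.5, 1.0])
true_probs = np.array([
    [0.5, 0.0, 0.5],  # risky arm: rewards 0 or 1
    [0.0, 1.0, 0.0],  # safe arm: always 0.5
])
pulls = risk_averse_ts(true_probs, support, alpha=0.5, horizon=500)
print(pulls)
```

A mean-based learner has no reason to prefer either arm here, since both have expected reward 0.5. Scoring sampled distributions by CVaR should steer most pulls toward the safe arm, which is the behaviour a risk-averse bandit algorithm is meant to produce.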