Related Paper

Almost Minimax Optimal Best Arm Identification in Piecewise Stationary Linear Bandits

We propose a novel piecewise stationary linear bandit (PSLB) model, where the environment randomly samples a context from an unknown probability distribution at each changepoint, and the quality of an arm is measured by its return averaged over all contexts. The contexts and their distribution, as well as the …
