Almost Optimal Variance-Constrained Best Arm Identification
We design and analyze Variance-Aware-Lower and Upper Confidence Bound (VA-LUCB), a parameter-free algorithm for identifying the best arm in the fixed-confidence setting under the stringent constraint that the variance of the chosen arm is strictly smaller than a given threshold. An upper bound on VA-LUCB's sample complexity is shown …