Author: Arnaud Autef
<aside> 💡 In this presentation, we review a bandit learning paper by Daniel Russo: Simple Bayesian Algorithms for Best Arm Identification.
</aside>
Sequential decision-making problem.
$k$ possible designs.
Each design $i \in \{1, \ldots, k\}$ has an unknown parameter $\theta_i^*$.
At each time-step $n \in \mathbb{N}$
<aside> 💡 Identify which design $I^*$ leads to the best outcomes, in as few steps as possible.
</aside>
Experimenter = Decision-makers at a pharmaceutical company
$k$ designs = $k$ variants of a treatment to cure a disease
Optimal design $I^*$ = treatment variant with highest efficacy
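
To make the setup concrete, here is a minimal Python sketch of the treatment example. It rests on several assumptions that are not in the notes above: outcomes are binary (cured or not cured) and drawn from a Bernoulli distribution with mean $\theta_i^*$, the efficacy values are made up for illustration, and treatments are allocated in a naive round-robin order. Round-robin is only a baseline for seeing the problem in action; it is not the algorithm proposed in the paper. The function name `simulate_round_robin` is hypothetical.

```python
import random


def simulate_round_robin(true_efficacies, n_steps, seed=0):
    """Measure k designs in round-robin order and guess the best one.

    Assumption: measuring design i yields a Bernoulli outcome with
    success probability true_efficacies[i] (the unknown theta_i^*).
    """
    rng = random.Random(seed)
    k = len(true_efficacies)
    successes = [0] * k
    pulls = [0] * k

    for n in range(n_steps):
        i = n % k  # naive allocation: cycle through the designs
        outcome = 1 if rng.random() < true_efficacies[i] else 0
        pulls[i] += 1
        successes[i] += outcome

    # Empirical efficacy of each design (0.0 if it was never measured).
    estimates = [s / p if p > 0 else 0.0 for s, p in zip(successes, pulls)]
    # Guess for I^*: the design with the highest empirical efficacy.
    guess = max(range(k), key=lambda i: estimates[i])
    return guess, estimates


# Hypothetical efficacies for k = 3 treatment variants.
true_efficacies = [0.30, 0.50, 0.45]
true_best = max(range(len(true_efficacies)), key=lambda i: true_efficacies[i])

guess, estimates = simulate_round_robin(true_efficacies, n_steps=3000)
print("true best design:", true_best)
print("guessed best design:", guess)
print("empirical efficacies:", estimates)
```

When two variants have close efficacies, as with the second and third here, uniform allocation spends many measurements on clearly inferior designs. This is the inefficiency that motivates adaptive allocation rules.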