Efficient selection of multiple bandit arms: Theory and practice

2010

Abstract

We consider the general, widely applicable problem of selecting, from n real-valued random variables, a subset of size m containing those with the highest means, based on as few samples as possible. This problem, which we denote Explore-m, is a core component of several stochastic optimization algorithms and arises in applications in simulation and industrial engineering.
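
To make the problem statement concrete, the following is a minimal Python sketch of a naive uniform-sampling baseline for Explore-m. It is not the method studied in the paper: it simply draws the same number of samples from every arm and keeps the m arms with the highest empirical means. The names `explore_m_uniform` and `bernoulli_arm`, the parameter `samples_per_arm`, and the Bernoulli reward model are all assumptions introduced here for illustration.

```python
import random
from typing import Callable, List, Sequence


def explore_m_uniform(
    arms: Sequence[Callable[[], float]],
    m: int,
    samples_per_arm: int,
) -> List[int]:
    """Naive baseline (an assumption, not the paper's algorithm):
    sample every arm equally often and return the indices of the
    m arms with the highest empirical means, in ascending index order."""
    if not 0 < m <= len(arms):
        raise ValueError("m must be between 1 and the number of arms")
    if samples_per_arm < 1:
        raise ValueError("samples_per_arm must be at least 1")

    # Estimate each arm's mean from the same fixed number of samples.
    empirical_means = [
        sum(arm() for _ in range(samples_per_arm)) / samples_per_arm
        for arm in arms
    ]

    # Rank arm indices by empirical mean, highest first, and keep the top m.
    ranked = sorted(
        range(len(arms)), key=lambda i: empirical_means[i], reverse=True
    )
    return sorted(ranked[:m])


def bernoulli_arm(p: float, rng: random.Random) -> Callable[[], float]:
    """Hypothetical helper: an arm that returns 1.0 with probability p, else 0.0."""
    return lambda: 1.0 if rng.random() < p else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.2, 0.8, 0.5, 0.7, 0.3]
    arms = [bernoulli_arm(p, rng) for p in true_means]
    # With enough samples per arm, the selection should usually be arms 1 and 3.
    print(explore_m_uniform(arms, m=2, samples_per_arm=500))
```

This baseline spends n × `samples_per_arm` samples regardless of how easy the arms are to tell apart, which is the cost that the "as few samples as possible" objective aims to reduce.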