Overblog Follow this blog
Edit post Administration Create my blog
June 3 2011 6 03 /06 /June /2011 13:28

This page is dedicated to start discussions about the article "A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences". Feel free to post any comment, sugggestion, question, correction, extension... I will enjoy discussing this with you.

  • Abstract:

"We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem in the case of distributions with finite supports (not necessarily known beforehand), whose asymptotic regret matches the lower bound of \cite{Burnetas96}. Our contribution is to provide a finite-time analysis of this algorithm; we get bounds whose main terms are smaller than the ones of previously known algorithms with finite-time analyses (like UCB-type algorithms). "

  • Discussion:

We provided a finite-time analysis of the (asymptotically optimal) Kinf–strategy in the case
of finitely supported distributions. One could think that the extension to the case of general distributions
is straightforward. However this extension appears somewhat difficult (at least when using the current
definition of Kinf ) for the following reasons: (1) adapting the proof would require some extension of Sanov’s non-asymptotic Theorem. (2) some needed properties in the proof, do not seem to be always satisfied for general distributions.

  • Future work:

Extend this setting to general exponential families; for instance, histogram-based approximations of general distributions could be considered.

Share this post

Repost 0
Published by Odalric-Ambrym Maillard - in Discussing articles
write a comment





Senior Researcher



The Technion
Faculty of Electrical Engineering,
Fishbach Bldg, Rm. 462
32000 Haifa, ISRAEL
Tel: +972 (0)48293638