Authors: Thomas Norheim 1 ; Terje Brådland 1 ; Ole-Christoffer Granmo 1 and B. John Oommen 2

Affiliations: 1 University of Agder, Norway ; 2 Carleton University, Canada

ISBN: 978-989-674-021-4

Keyword(s): Bandit problems, Conjugate priors, Sampling, Bayesian learning.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Computational Intelligence ; Evolutionary Computing ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Soft Computing ; Symbolic Systems ; Uncertainty in AI

Abstract: The Multi-Armed Bernoulli Bandit (MABB) problem is a classical optimization problem where an agent sequentially pulls one of multiple arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus one must balance exploiting existing knowledge about the arms against obtaining new information. Although posed in an abstract framework, the applications of the MABB are numerous (Gelly and Wang, 2006; Kocsis and Szepesvari, 2006; Granmo et al., 2007; Granmo and Bouhmala, 2007). On the other hand, while Bayesian methods are generally computationally intractable, they have been shown to provide a standard for optimal decision making. This paper proposes a novel MABB solution scheme that is inherently Bayesian in nature, yet avoids the computational intractability by relying simply on updating the hyper-parameters of the sibling conjugate distributions, and on simultaneously sampling randomly from the respective posteriors. Although, in principle, our solution is generic, to be concise, we present here the strategy for Bernoulli distributed rewards. Extensive experiments demonstrate that our scheme outperforms recently proposed bandit playing algorithms. We thus believe that our methodology opens avenues for obtaining improved novel solutions.
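
The abstract does not give the algorithm in detail, so the following is only a minimal sketch of one common reading of "updating the hyper-parameters of the conjugate distributions and sampling from the posteriors" for Bernoulli rewards. It assumes a uniform Beta(1, 1) prior per arm and assumes that the arm with the largest posterior sample is the one pulled; the function name `play_bernoulli_bandit`, its parameters, and the simulated reward probabilities are all hypothetical and are not taken from the paper.

```python
import random


def play_bernoulli_bandit(true_probs, n_rounds, seed=0):
    """Simulate a Bernoulli bandit played by sampling from Beta posteriors.

    true_probs: hidden success probability of each arm (simulation only).
    Returns (pulls per arm, total reward, alpha list, beta list).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)

    # Assumed prior: Beta(1, 1), i.e. uniform, for every arm.
    alpha = [1.0] * n_arms  # 1 + number of observed rewards
    beta = [1.0] * n_arms   # 1 + number of observed penalties
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one random sample from each arm's current posterior.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]

        # Assumed selection rule: pull the arm with the largest sample.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulated environment: Bernoulli reward from the hidden probability.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Conjugate update: only the two hyper-parameters change.
        if reward:
            alpha[arm] += 1
        else:
            beta[arm] += 1

        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward, alpha, beta


if __name__ == "__main__":
    pulls, total, alpha, beta = play_bernoulli_bandit([0.2, 0.8], 2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

Because the Beta distribution is conjugate to the Bernoulli likelihood, each update is a single increment of one counter, which is what keeps a Bayesian scheme of this kind computationally cheap.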

CC BY-NC-ND 4.0

Paper citation in several formats:
Norheim T.; Brådland T.; Granmo O. and John Oommen B. (2010). A GENERIC SOLUTION TO MULTI-ARMED BERNOULLI BANDIT PROBLEMS BASED ON RANDOM SAMPLING FROM SIBLING CONJUGATE PRIORS. In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-674-021-4, pages 36-44. DOI: 10.5220/0002712500360044

@conference{icaart10,
author={Thomas Norheim and Terje Brådland and Ole{-}Christoffer Granmo and B. {John Oommen}},
title={A GENERIC SOLUTION TO MULTI-ARMED BERNOULLI BANDIT PROBLEMS BASED ON RANDOM SAMPLING FROM SIBLING CONJUGATE PRIORS},
booktitle={Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2010},
pages={36-44},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002712500360044},
isbn={978-989-674-021-4},
}

TY - CONF
JO - Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - A GENERIC SOLUTION TO MULTI-ARMED BERNOULLI BANDIT PROBLEMS BASED ON RANDOM SAMPLING FROM SIBLING CONJUGATE PRIORS
SN - 978-989-674-021-4
AU - Norheim, T.
AU - Brådland, T.
AU - Granmo, O.
AU - John Oommen, B.
PY - 2010
SP - 36
EP - 44
DO - 10.5220/0002712500360044
