Authors: Thomas Norheim 1 ; Terje Brådland 1 ; Ole-Christoffer Granmo 1 and B. John Oommen 2

Affiliations: 1 University of Agder, Norway ; 2 Carleton University, Canada

Keyword(s): Bandit problems, Conjugate priors, Sampling, Bayesian learning.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Computational Intelligence ; Evolutionary Computing ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Soft Computing ; Symbolic Systems ; Uncertainty in AI

Abstract: The Multi-Armed Bernoulli Bandit (MABB) problem is a classical optimization problem where an agent sequentially pulls one of multiple arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus, one must balance between exploiting existing knowledge about the arms and obtaining new information. Although posed in an abstract framework, the applications of the MABB are numerous (Gelly and Wang, 2006; Kocsis and Szepesvari, 2006; Granmo et al., 2007; Granmo and Bouhmala, 2007). On the other hand, while Bayesian methods are generally computationally intractable, they have been shown to provide a standard for optimal decision making. This paper proposes a novel MABB solution scheme that is inherently Bayesian in nature, and which yet avoids the computational intractability by relying simply on updating the hyper-parameters of the sibling conjugate distributions, and on simultaneously sampling randomly from the respective posteriors. Although, in principle, our solution is generic, to be concise, we present here the strategy for Bernoulli distributed rewards. Extensive experiments demonstrate that our scheme outperforms recently proposed bandit playing algorithms. We thus believe that our methodology opens avenues for obtaining improved novel solutions.
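
The scheme outlined in the abstract (keep one conjugate posterior per arm, update its hyper-parameters after each pull, and draw a random sample from every posterior) can be illustrated with a minimal Python sketch. This is not the authors' code, and several details are assumptions not stated on this page: each arm starts from a uniform Beta(1, 1) prior, the arm with the largest sampled value is the one pulled, and the names `sample_and_select` and `run_bandit` are hypothetical.

```python
import random


def sample_and_select(alpha, beta, rng):
    """Draw one sample from each arm's Beta posterior and return the index
    of the arm with the largest sample (assumed selection rule)."""
    samples = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_probs, n_pulls, seed=0):
    """Simulate a Bernoulli bandit played by posterior sampling.

    true_probs holds the reward probabilities of the simulated arms;
    the agent never reads them directly, only the resulting rewards.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    alpha = [1.0] * k  # assumed Beta(1, 1) prior: 1 + number of rewards
    beta = [1.0] * k   # assumed Beta(1, 1) prior: 1 + number of non-rewards
    pulls = [0] * k
    total_reward = 0
    for _ in range(n_pulls):
        arm = sample_and_select(alpha, beta, rng)
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate update: a Bernoulli outcome only increments one
        # hyper-parameter of the pulled arm's Beta distribution.
        if reward:
            alpha[arm] += 1.0
        else:
            beta[arm] += 1.0
        pulls[arm] += 1
        total_reward += reward
    return alpha, beta, pulls, total_reward


if __name__ == "__main__":
    alpha, beta, pulls, total = run_bandit([0.1, 0.9], n_pulls=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

Because the Beta distribution is conjugate to the Bernoulli likelihood, the update costs two counters per arm, which is the kind of hyper-parameter bookkeeping the abstract contrasts with computationally intractable Bayesian methods.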

CC BY-NC-ND 4.0

Paper citation in several formats:
Norheim, T.; Brådland, T.; Granmo, O. and John Oommen, B. (2010). A GENERIC SOLUTION TO MULTI-ARMED BERNOULLI BANDIT PROBLEMS BASED ON RANDOM SAMPLING FROM SIBLING CONJUGATE PRIORS. In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART; ISBN 978-989-674-021-4; ISSN 2184-433X, SciTePress, pages 36-44. DOI: 10.5220/0002712500360044

@conference{icaart10,
author={Thomas Norheim and Terje Brådland and Ole{-}Christoffer Granmo and B. {John Oommen}},
title={A GENERIC SOLUTION TO MULTI-ARMED BERNOULLI BANDIT PROBLEMS BASED ON RANDOM SAMPLING FROM SIBLING CONJUGATE PRIORS},
booktitle={Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2010},
pages={36-44},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002712500360044},
isbn={978-989-674-021-4},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - A GENERIC SOLUTION TO MULTI-ARMED BERNOULLI BANDIT PROBLEMS BASED ON RANDOM SAMPLING FROM SIBLING CONJUGATE PRIORS
SN - 978-989-674-021-4
IS - 2184-433X
AU - Norheim, T.
AU - Brådland, T.
AU - Granmo, O.
AU - John Oommen, B.
PY - 2010
SP - 36
EP - 44
DO - 10.5220/0002712500360044
PB - SciTePress