Paper

Authors: Miguel Martin; Antonio Jiménez-Martín and Alfonso Mateos

Affiliation: Universidad Politécnica de Madrid, Spain

ISBN: 978-989-758-218-9

Keyword(s): Multi-armed Bandit Problem, Possibilistic Reward, Numerical Study.

Related Ontology Subjects/Areas/Topics: Decision Analysis ; Methodologies and Technologies ; Operational Research ; Stochastic Processes

Abstract: Different allocation strategies can be found in the literature to deal with the multi-armed bandit problem, either under a frequentist view or from a Bayesian perspective. In this paper, we propose a novel allocation strategy, the possibilistic reward method. First, possibilistic reward distributions are used to model the uncertainty about the arms' expected rewards; these are then converted into probability distributions using a pignistic probability transformation. Finally, a simulation experiment is carried out to find the arm with the highest expected reward, which is then pulled. A parametric probability transformation of the proposed method is then introduced together with a dynamic optimization, which implies that neither previous knowledge nor a simulation of the arm distributions is required. A numerical study shows that the proposed method outperforms other policies in the literature in five scenarios: a Bernoulli distribution with very low success probabilities, with success probabilities close to 0.5, and with success probabilities close to 0.5 and Gaussian rewards; and Poisson and exponential distributions truncated in [0,10].
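
The pipeline outlined in the abstract (possibility distribution per arm, pignistic transformation, simulation, pull) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The transformation shown is the standard discrete form for consonant belief functions (Dubois-Prade), and the abstract does not state whether the paper uses exactly this form. The function `hypothetical_possibility`, its exponential shape, the grid, and the single-draw selection rule in `choose_arm` are all assumptions made only for illustration.

```python
import numpy as np


def pignistic_transform(possibility):
    """Turn a discrete possibility distribution into a probability distribution.

    With values sorted as pi_(1) >= ... >= pi_(n) and pi_(n+1) = 0:
        p_(i) = sum over j >= i of (pi_(j) - pi_(j+1)) / j
    """
    pi = np.asarray(possibility, dtype=float)
    if not np.isclose(pi.max(), 1.0):
        raise ValueError("possibility distribution must be normalized (max = 1)")
    order = np.argsort(-pi, kind="stable")             # most to least possible
    sorted_pi = pi[order]
    diffs = sorted_pi - np.append(sorted_pi[1:], 0.0)  # pi_(j) - pi_(j+1)
    ranks = np.arange(1, len(pi) + 1)
    # Reverse cumulative sum yields the sum over j >= i of diffs_j / j.
    sorted_p = np.cumsum((diffs / ranks)[::-1])[::-1]
    p = np.empty_like(sorted_p)
    p[order] = sorted_p                                # undo the sort
    return p


def hypothetical_possibility(mean, pulls, grid):
    """ASSUMPTION: an illustrative shape only, not taken from the paper.

    It peaks at the observed sample mean and narrows as the arm is pulled more.
    """
    shape = np.exp(-2.0 * pulls * (grid - mean) ** 2)
    return shape / shape.max()                         # enforce max = 1


def choose_arm(means, pulls, grid, rng):
    """Draw one simulated expected reward per arm and pick the largest.

    ASSUMPTION: a single draw per arm stands in for the simulation step.
    """
    samples = []
    for mean, n in zip(means, pulls):
        p = pignistic_transform(hypothetical_possibility(mean, n, grid))
        samples.append(rng.choice(grid, p=p))
    return int(np.argmax(samples))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = np.linspace(0.0, 1.0, 101)                  # candidate expected rewards
    # Two arms with observed means 0.2 and 0.8 after 50 pulls each.
    print(choose_arm([0.2, 0.8], [50, 50], grid, rng))
```

An arm that has never been pulled gets a flat possibility distribution here, so its simulated reward is uniform over the grid, which keeps such arms in play. This is a property of the illustrative shape, not a claim about the paper's method.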

Paper citation in several formats:
Martin, M.; Jiménez-Martín, A. and Mateos, A. (2017). The Possibilistic Reward Method and a Dynamic Extension for the Multi-armed Bandit Problem: A Numerical Study. In Proceedings of the 6th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES, ISBN 978-989-758-218-9, pages 75-84. DOI: 10.5220/0006186400750084

@conference{icores17,
author={Miguel Martin and Antonio Jiménez{-}Martín and Alfonso Mateos},
title={The Possibilistic Reward Method and a Dynamic Extension for the Multi-armed Bandit Problem: A Numerical Study},
booktitle={Proceedings of the 6th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES},
year={2017},
pages={75--84},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006186400750084},
isbn={978-989-758-218-9},
}

TY - CONF

JO - Proceedings of the 6th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES
TI - The Possibilistic Reward Method and a Dynamic Extension for the Multi-armed Bandit Problem: A Numerical Study
SN - 978-989-758-218-9
AU - Martin, M.
AU - Jiménez-Martín, A.
AU - Mateos, A.
PY - 2017
SP - 75
EP - 84
DO - 10.5220/0006186400750084
