Paper: The Possibilistic Reward Method and a Dynamic Extension for the Multi-armed Bandit Problem: A Numerical Study

Authors: Miguel Martin, Antonio Jiménez-Martín and Alfonso Mateos

Affiliation: Universidad Politécnica de Madrid, Spain

ISBN: 978-989-758-218-9

Keyword(s): Multi-armed Bandit Problem, Possibilistic Reward, Numerical Study.

Related Ontology Subjects/Areas/Topics: Decision Analysis; Methodologies and Technologies; Operational Research; Stochastic Processes

Abstract: Different allocation strategies can be found in the literature to deal with the multi-armed bandit problem under a frequentist view or from a Bayesian perspective. In this paper, we propose a novel allocation strategy, the possibilistic reward method. First, possibilistic reward distributions are used to model the uncertainty about the arm expected rewards, which are then converted into probability distributions using a pignistic probability transformation. Finally, a simulation experiment is carried out to identify the arm with the highest expected reward, which is then pulled. A parametric probability transformation of the proposed method is then introduced, together with a dynamic optimization, so that neither previous knowledge nor a simulation of the arm distributions is required. A numerical study shows that the proposed method outperforms other policies in the literature in five scenarios: a Bernoulli distribution with very low success probabilities, a Bernoulli distribution with success probabilities close to 0.5, Gaussian rewards, and Poisson and exponential distributions truncated in [0, 10].
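To give a concrete picture of the pipeline described in the abstract, the following Python sketch illustrates the general idea under stated assumptions: each arm's expected reward is modelled on a discretized grid by a possibility distribution, the pignistic transformation converts it into a probability distribution, and the arm whose sampled expected reward is highest is pulled. The triangular possibility shapes and all function names are illustrative assumptions, not the authors' exact formulation.

import numpy as np

def pignistic(possibility):
    # Sort values by decreasing possibility; the nested focal set containing
    # the (j+1) most possible values carries mass pi_j - pi_{j+1}.
    order = np.argsort(possibility)[::-1]
    pi = possibility[order]
    masses = pi - np.append(pi[1:], 0.0)
    bet = np.zeros_like(pi)
    for j, m in enumerate(masses):
        bet[: j + 1] += m / (j + 1)   # share each mass uniformly over its focal set
    prob = np.empty_like(bet)
    prob[order] = bet                 # map back to the original grid order
    return prob / prob.sum()

def choose_arm(grid, possibilities, rng):
    # Sample a candidate expected reward per arm from its pignistic
    # probability distribution and pull the arm with the highest sample.
    samples = [rng.choice(grid, p=pignistic(p)) for p in possibilities]
    return int(np.argmax(samples))

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 101)
# Toy triangular possibility distributions centred on each arm's current estimate.
possibilities = [np.clip(1.0 - 5.0 * np.abs(grid - c), 0.0, 1.0) for c in (0.3, 0.5, 0.7)]
print(choose_arm(grid, possibilities, rng))

This mirrors a Thompson-sampling-style selection loop; the dynamic extension discussed in the paper additionally removes the need to simulate the arm distributions, which this toy sketch does not reproduce.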

Paper citation in several formats:
Martin M., Jiménez-Martín A. and Mateos A. (2017). The Possibilistic Reward Method and a Dynamic Extension for the Multi-armed Bandit Problem: A Numerical Study. In Proceedings of the 6th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES, ISBN 978-989-758-218-9, pages 75-84. DOI: 10.5220/0006186400750084

@conference{icores17,
author={Miguel Martin and Antonio Jiménez-Martín and Alfonso Mateos},
title={The Possibilistic Reward Method and a Dynamic Extension for the Multi-armed Bandit Problem: A Numerical Study},
booktitle={Proceedings of the 6th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES},
year={2017},
pages={75-84},
publisher={ScitePress},
organization={INSTICC},
doi={10.5220/0006186400750084},
isbn={978-989-758-218-9},
}

TY - CONF

JO - Proceedings of the 6th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES
TI - The Possibilistic Reward Method and a Dynamic Extension for the Multi-armed Bandit Problem: A Numerical Study
SN - 978-989-758-218-9
AU - Martin M.
AU - Jiménez-Martín A.
AU - Mateos A.
PY - 2017
SP - 75
EP - 84
DO - 10.5220/0006186400750084
ER -
