Fuzzy Rewards on the Multi-Armed Bandits Model
Ciria R. Briones-García, Raúl Montes-de-Oca, Víctor H. Vázquez-Guevara, Hugo Cruz-Suárez
2025
Abstract
This paper considers an extension of the Multi-Armed Bandits problem in which the reward functions take trapezoidal fuzzy values as the result of a fuzzy affine transformation. This can be interpreted as receiving “approximately” a reward located in some interval instead of the reward itself. The main objective is to find an optimal selection strategy that maximizes the fuzzy total expected discounted reward with respect to two orders: the partial order based on α-cuts and the order provided by the average ranking. It is shown that the Gittins strategy, which is optimal in the crisp setting, remains optimal in the fuzzy setting. In addition, the optimal stopping time associated with the crisp Gittins index is shown to be the same for its fuzzy counterpart. This follows from a link between the fuzzy and crisp versions of the Gittins index, which in turn shows that the fuzzy value function is connected to its crisp analog via a fuzzy affine transformation. Consequently, the value function can be said to lie approximately in a certain interval related to the fuzzy transformation.
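The objects named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' construction: the class `Trapezoid`, the function `fuzzy_affine`, the form r·B + C of the transformation (with crisp r ≥ 0 and trapezoidal B, C), and the average ranking taken as the mean of the four defining points are all assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number with a <= b <= c <= d (hypothetical helper)."""
    a: float
    b: float
    c: float
    d: float

    def alpha_cut(self, alpha: float) -> tuple:
        # Closed interval of points whose membership is at least alpha.
        return (self.a + alpha * (self.b - self.a),
                self.d - alpha * (self.d - self.c))

    def average_rank(self) -> float:
        # Assumed ranking: mean of the alpha-cut midpoints over alpha in [0, 1],
        # which for a trapezoid equals (a + b + c + d) / 4.
        return (self.a + self.b + self.c + self.d) / 4.0


def fuzzy_affine(r: float, B: Trapezoid, C: Trapezoid) -> Trapezoid:
    """Assumed form of the transformation: r * B + C for a crisp reward r >= 0."""
    if r < 0:
        raise ValueError("this sketch only handles non-negative crisp rewards")
    return Trapezoid(r * B.a + C.a, r * B.b + C.b, r * B.c + C.c, r * B.d + C.d)


def alpha_cut_leq(x: Trapezoid, y: Trapezoid) -> bool:
    # Partial order: x <= y when both endpoints of every alpha-cut are ordered.
    # The endpoints are linear in alpha, so checking alpha = 0 and 1 suffices.
    return x.a <= y.a and x.b <= y.b and x.c <= y.c and x.d <= y.d


B = Trapezoid(0.5, 1.0, 1.0, 1.5)   # "approximately 1"
C = Trapezoid(-1.0, 0.0, 0.0, 1.0)  # "approximately 0"
low = fuzzy_affine(1.0, B, C)
high = fuzzy_affine(2.0, B, C)
```

Under these assumptions a larger crisp reward yields a fuzzy reward that is larger in both orders (`alpha_cut_leq(low, high)` holds and `low.average_rank() <= high.average_rank()`), which gives some intuition for why a strategy that is optimal for crisp rewards could remain optimal after the transformation.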
Paper Citation
in Harvard Style
Briones-García C., Montes-de-Oca R., Vázquez-Guevara V. and Cruz-Suárez H. (2025). Fuzzy Rewards on the Multi-Armed Bandits Model. In Proceedings of the 14th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES; ISBN 978-989-758-732-0, SciTePress, pages 271-277. DOI: 10.5220/0013160600003893
in Bibtex Style
@conference{icores25,
author={Ciria Briones-García and Raúl Montes-de-Oca and Víctor Vázquez-Guevara and Hugo Cruz-Suárez},
title={Fuzzy Rewards on the Multi-Armed Bandits Model},
booktitle={Proceedings of the 14th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES},
year={2025},
pages={271-277},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013160600003893},
isbn={978-989-758-732-0},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 14th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES
TI - Fuzzy Rewards on the Multi-Armed Bandits Model
SN - 978-989-758-732-0
AU - Briones-García C.
AU - Montes-de-Oca R.
AU - Vázquez-Guevara V.
AU - Cruz-Suárez H.
PY - 2025
SP - 271
EP - 277
DO - 10.5220/0013160600003893
PB - SciTePress