Fuzzy Rewards on the Multi-Armed Bandits Model

Ciria R. Briones-García; Raúl Montes-de-Oca; Víctor H. Vázquez-Guevara; Hugo Cruz-Suárez

doi:10.5220/0013160600003893

2025

Abstract

This paper considers an extension of the Multi-Armed Bandits problem in which reward functions take trapezoidal fuzzy values as the result of a fuzzy affine transformation (which can be interpreted as receiving “approximately” a reward located in some interval instead of the reward itself). The main objective is to find an optimal selection strategy that maximizes the fuzzy total expected discounted reward with respect to two orderings: the partial order based on α-cuts and the one provided by the average ranking. It is shown that the Gittins strategy, which is optimal in the crisp setting, remains optimal in the fuzzy setting. In addition, the optimal stopping time associated with the crisp Gittins index is shown to be the same for its fuzzy counterpart. This follows from a link between the fuzzy and crisp versions of the Gittins index, which in turn shows that the fuzzy value function is connected to its crisp analog via a fuzzy affine transformation. Consequently, the value function can be said to lie approximately in a certain interval related to the fuzzy transformation.
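The objects named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' construction: the class `Trapezoid`, the helper `fuzzify`, the form "fuzzy slope times crisp reward plus fuzzy shift" for the affine transformation, the restriction to non-negative crisp rewards, and the choice of average ranking as the mean of the α-cut midpoints (which equals (a + b + c + d) / 4 for a trapezoid) are all assumptions made for illustration, and the paper's definitions may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number (a, b, c, d) with a <= b <= c <= d (hypothetical helper)."""
    a: float
    b: float
    c: float
    d: float

    def alpha_cut(self, alpha: float) -> tuple[float, float]:
        """Interval of points whose membership is at least alpha, for 0 <= alpha <= 1."""
        return (self.a + alpha * (self.b - self.a),
                self.d - alpha * (self.d - self.c))

    def scale(self, r: float) -> Trapezoid:
        """Multiply by a crisp scalar; only r >= 0 is handled in this sketch."""
        if r < 0:
            raise ValueError("this sketch assumes non-negative crisp rewards")
        return Trapezoid(r * self.a, r * self.b, r * self.c, r * self.d)

    def __add__(self, other: Trapezoid) -> Trapezoid:
        """Add two trapezoids vertex by vertex."""
        return Trapezoid(self.a + other.a, self.b + other.b,
                         self.c + other.c, self.d + other.d)

    def average_rank(self) -> float:
        """Assumed ranking: mean over alpha of the alpha-cut midpoints."""
        return (self.a + self.b + self.c + self.d) / 4


def fuzzify(r: float, slope: Trapezoid, shift: Trapezoid) -> Trapezoid:
    """Assumed fuzzy affine transformation: slope * r + shift for a crisp reward r."""
    return slope.scale(r) + shift


def leq_alpha_cuts(x: Trapezoid, y: Trapezoid) -> bool:
    """Partial order: every alpha-cut endpoint of x is <= the matching endpoint of y.

    The endpoints are linear in alpha, so comparing the four vertices is enough.
    """
    return x.a <= y.a and x.b <= y.b and x.c <= y.c and x.d <= y.d


slope = Trapezoid(0.5, 0.75, 1.25, 1.5)   # "approximately 1"
shift = Trapezoid(-1.0, 0.0, 0.0, 1.0)    # "approximately 0"
low = fuzzify(4.0, slope, shift)
high = fuzzify(8.0, slope, shift)
print(low, low.alpha_cut(0.5), low.average_rank())
print(leq_alpha_cuts(low, high))
```

In this toy setting a larger crisp reward yields a larger fuzzy reward under both orderings, which gives some intuition for why a strategy that is optimal for crisp rewards could remain optimal after the transformation; the actual result is established in the paper, not by this sketch.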

Paper Citation

in Harvard Style

Briones-García C., Montes-de-Oca R., Vázquez-Guevara V. and Cruz-Suárez H. (2025). Fuzzy Rewards on the Multi-Armed Bandits Model. In Proceedings of the 14th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES; ISBN 978-989-758-732-0, SciTePress, pages 271-277. DOI: 10.5220/0013160600003893

in Bibtex Style

@conference{icores25,
author={Ciria Briones-García and Raúl Montes-de-Oca and Víctor Vázquez-Guevara and Hugo Cruz-Suárez},
title={Fuzzy Rewards on the Multi-Armed Bandits Model},
booktitle={Proceedings of the 14th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES},
year={2025},
pages={271-277},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013160600003893},
isbn={978-989-758-732-0},
}

in EndNote Style

TY - CONF

JO - Proceedings of the 14th International Conference on Operations Research and Enterprise Systems - Volume 1: ICORES
TI - Fuzzy Rewards on the Multi-Armed Bandits Model
SN - 978-989-758-732-0
AU - Briones-García C.
AU - Montes-de-Oca R.
AU - Vázquez-Guevara V.
AU - Cruz-Suárez H.
PY - 2025
SP - 271
EP - 277
DO - 10.5220/0013160600003893
PB - SciTePress