Authors:
Francis Maes, Louis Wehenkel and Damien Ernst
Affiliation:
University of Liège, Belgium
Keyword(s):
Multi-armed bandit problems, Reinforcement learning, Exploration-exploitation dilemma.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Symbolic Systems; Uncertainty in AI
Abstract:
We propose a learning approach to pre-compute K-armed bandit playing policies by exploiting prior information
describing the class of problems targeted by the player. Our algorithm first samples a set of K-armed
bandit problems from the given prior, and then chooses, from a space of candidate policies, the one that gives the
best average performance over these problems. The candidate policies use an index for ranking the arms and
pick at each play the arm with the highest index; the index for each arm is computed in the form of a linear
combination of features describing the history of plays (e.g., number of draws, average reward, variance of
rewards and higher order moments), and an estimation of distribution algorithm is used to determine its optimal
parameters in the form of feature weights. We carry out simulations in the case where the prior assumes a
fixed number of Bernoulli arms, a fixed horizon, and uniformly distributed parameters of the Bernoulli arms.
These simulations show that the learned strategies perform very well compared with several strategies previously
proposed in the literature (UCB1, UCB2, UCB-V, KL-UCB and ε_n-GREEDY); they also highlight
the robustness of the learned strategies with respect to wrong prior information.
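The approach described in the abstract can be sketched in a few lines: an index policy scores each arm by a linear combination of history features and pulls the arm with the highest index, and a simple estimation-of-distribution loop (here a cross-entropy-style Gaussian refit, a common EDA variant) searches for good feature weights over Bernoulli problems sampled from a uniform prior. The feature set (draw count, empirical mean, empirical variance), the specific EDA variant, and all parameter values are illustrative assumptions, not the paper's exact configuration.

```python
import random

def make_index_policy(weights):
    """Index policy: rank arms by a linear combination of history
    features (draw count, empirical mean, empirical variance) and
    pull the arm with the highest index. Feature set is an assumption."""
    def play(probs, horizon, rng):
        K = len(probs)
        counts = [0] * K
        sums = [0.0] * K
        sq_sums = [0.0] * K
        total = 0.0
        for t in range(horizon):
            if t < K:
                arm = t  # initialization: play each arm once
            else:
                def index(k):
                    mean = sums[k] / counts[k]
                    var = sq_sums[k] / counts[k] - mean * mean
                    return (weights[0] * counts[k]
                            + weights[1] * mean
                            + weights[2] * var)
                arm = max(range(K), key=index)
            r = 1.0 if rng.random() < probs[arm] else 0.0  # Bernoulli reward
            counts[arm] += 1
            sums[arm] += r
            sq_sums[arm] += r * r
            total += r
        return total
    return play

def learn_weights(n_iters=10, pop=20, elite=5, K=2, horizon=100,
                  n_problems=10, seed=0):
    """Cross-entropy-style EDA: sample weight vectors from a Gaussian,
    score each by its mean reward over bandit problems drawn from a
    uniform prior on Bernoulli parameters, then refit the Gaussian on
    the elite candidates."""
    rng = random.Random(seed)
    mu = [0.0, 0.0, 0.0]
    sigma = [1.0, 1.0, 1.0]
    # Sample training problems from the prior (uniform Bernoulli params).
    problems = [[rng.random() for _ in range(K)] for _ in range(n_problems)]
    for _ in range(n_iters):
        cands = [[rng.gauss(mu[i], sigma[i]) for i in range(3)]
                 for _ in range(pop)]
        scored = []
        for w in cands:
            policy = make_index_policy(w)
            score = sum(policy(p, horizon, random.Random(rng.randrange(10**9)))
                        for p in problems) / n_problems
            scored.append((score, w))
        scored.sort(key=lambda s: -s[0])
        top = [w for _, w in scored[:elite]]
        # Refit the sampling distribution on the elite set.
        for i in range(3):
            vals = [w[i] for w in top]
            mu[i] = sum(vals) / len(vals)
            var = sum((v - mu[i]) ** 2 for v in vals) / len(vals)
            sigma[i] = max(var ** 0.5, 0.05)  # floor to keep exploring
    return mu
```

Once learned, the weight vector is fixed and the resulting policy runs with no further optimization at play time, which is the point of pre-computing policies offline from the prior.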