Learning Optimal Behavior in Environments with Non-stationary Observations

Ilio Boone; Gavin Rens

Topics: Machine Learning; Planning and Scheduling; Uncertainty in AI

In Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART, 729-736, 2022

Authors: Ilio Boone and Gavin Rens

Affiliation: DTAI group, KU Leuven, Belgium

Keyword(s): Markov Decision Process, Non-Markovian Reward Models, Mealy Reward Model (MRM), Learning MRMs, Non-stationary.

Abstract: In sequential decision-theoretic systems, the dynamics might be Markovian (behavior in the next step is independent of the past, given the present), or non-Markovian (behavior in the next step depends on the past). One approach to representing non-Markovian behavior has been to employ deterministic finite automata (DFA) with inputs and outputs (e.g., Mealy machines). Moreover, some researchers have proposed frameworks for learning DFA-based models. There are at least two reasons for a system to be non-Markovian: (i) rewards are gained from temporally-dependent tasks, (ii) observations are non-stationary. Rens et al. (2021) tackle learning the applicable DFA for the first case with their ARM algorithm. ARM cannot deal with the second case. Toro Icarte et al. (2019) tackle the problem for the second case with their LRM algorithm. In this paper, we extend ARM to deal with the second case too. The advantage of ARM for learning and acting in non-Markovian systems is that it is based on well-understood formal methods with many available tools.
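
To make the idea of a Mealy machine as a non-Markovian reward model concrete, here is a minimal Python sketch. It is not the ARM or LRM algorithm and is not taken from the paper; the class name `MealyRewardMachine`, the node names, and the "key then goal" task are hypothetical, chosen only to show how the same observation can yield different rewards depending on history.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MealyRewardMachine:
    """Deterministic Mealy machine: (node, observation) -> (next node, reward).

    Hypothetical illustration only; not the authors' implementation.
    """
    start: str
    # transitions[(node, observation)] = (next_node, reward)
    transitions: Dict[Tuple[str, str], Tuple[str, float]]
    default_reward: float = 0.0

    def run(self, observations: List[str]) -> List[float]:
        """Return the reward emitted at each step of an observation sequence."""
        node = self.start
        rewards = []
        for obs in observations:
            # Unlisted (node, observation) pairs self-loop with the default reward.
            node, reward = self.transitions.get(
                (node, obs), (node, self.default_reward)
            )
            rewards.append(reward)
        return rewards


# Assumed toy task: reaching "goal" pays 1.0 only after "key" has been observed.
mrm = MealyRewardMachine(
    start="no_key",
    transitions={
        ("no_key", "key"): ("has_key", 0.0),
        ("has_key", "goal"): ("has_key", 1.0),
    },
)

print(mrm.run(["goal", "key", "goal"]))  # [0.0, 0.0, 1.0]
```

The first and third observations are both "goal", yet only the third is rewarded, because the machine's current node summarizes the relevant part of the past. This corresponds to case (i) in the abstract, rewards from temporally-dependent tasks.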

CC BY-NC-ND 4.0

Paper citation in several formats:

Boone, I. and Rens, G. (2022). Learning Optimal Behavior in Environments with Non-stationary Observations. In Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-547-0; ISSN 2184-433X, SciTePress, pages 729-736. DOI: 10.5220/0010898200003116

@conference{icaart22,
author={Ilio Boone and Gavin Rens},
title={Learning Optimal Behavior in Environments with Non-stationary Observations},
booktitle={Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2022},
pages={729-736},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010898200003116},
isbn={978-989-758-547-0},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Learning Optimal Behavior in Environments with Non-stationary Observations
SN - 978-989-758-547-0
IS - 2184-433X
AU - Boone, I.
AU - Rens, G.
PY - 2022
SP - 729
EP - 736
DO - 10.5220/0010898200003116
PB - SciTePress