Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning

Alexander Hill, Marc Groefsema, Matthia Sabatelli, Raffaella Carloni, Marco Grzegorczyk

2024

Abstract

This paper proposes Contextual Online Imitation Learning (COIL), a novel method of utilising guide policies in Reinforcement Learning problems. It demonstrates that COIL can offer improved performance over both offline Imitation Learning methods, such as Behavioral Cloning, and Reinforcement Learning algorithms, such as Proximal Policy Optimisation, which do not take advantage of existing guide policies. An important characteristic of COIL is that it can effectively utilise guide policies that exhibit expert behavior in only a strict subset of the state space, making it more flexible than classical Imitation Learning methods. The paper further shows that, with COIL, guide policies that achieve good performance on sub-tasks can also be used to help Reinforcement Learning agents solve more complex tasks. After introducing the theory and motivation behind COIL, this paper tests its effectiveness on the task of mobile-robot navigation in both simulation and real-life lab experiments. In both settings, COIL gives stronger results than offline Imitation Learning, Reinforcement Learning, and the guide policy itself.


Paper Citation


in Harvard Style

Hill A., Groefsema M., Sabatelli M., Carloni R. and Grzegorczyk M. (2024). Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 178-185. DOI: 10.5220/0012312700003636


in Bibtex Style

@conference{icaart24,
author={Alexander Hill and Marc Groefsema and Matthia Sabatelli and Raffaella Carloni and Marco Grzegorczyk},
title={Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={178-185},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012312700003636},
isbn={978-989-758-680-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning
SN - 978-989-758-680-4
AU - Hill A.
AU - Groefsema M.
AU - Sabatelli M.
AU - Carloni R.
AU - Grzegorczyk M.
PY - 2024
SP - 178
EP - 185
DO - 10.5220/0012312700003636
PB - SciTePress