First Go, then Post-Explore: The Benefits of Post-Exploration in Intrinsic Motivation

Zhao Yang, Thomas Moerland, Mike Preuss, Aske Plaat

2023

Abstract

Go-Explore achieved breakthrough performance on challenging reinforcement learning (RL) tasks with sparse rewards. The key insight of Go-Explore was that successful exploration requires an agent to first return to an interesting state (‘Go’), and only then explore into unknown terrain (‘Explore’). We refer to such exploration after a goal is reached as ‘post-exploration’. In this paper, we present a clear ablation study of post-exploration in a general intrinsically motivated goal exploration process (IMGEP) framework, which the Go-Explore paper did not provide. We study the isolated potential of post-exploration by turning it on and off within the same algorithm, under both tabular and deep RL settings, on both discrete navigation and continuous control tasks. Experiments on a range of MiniGrid and MuJoCo environments show that post-exploration indeed helps IMGEP agents reach more diverse states and boosts their performance. In short, our work suggests that RL researchers should consider using post-exploration in IMGEP when possible, since it is effective, method-agnostic, and easy to implement.
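
The on/off ablation described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation: the 1-D chain environment, the function name `run_imgep`, and all parameters are hypothetical, and the 'Go' phase is assumed to be a perfect, noise-free walk to the goal so that any newly visited state can only come from post-exploration. A real agent would typically learn a goal-conditioned policy and would also have some exploration noise while travelling to the goal.

```python
import random


def run_imgep(post_explore_steps, episodes=50, size=20, seed=0):
    """Toy goal-exploration loop on a 1-D chain of `size` cells (hypothetical setup).

    Each episode starts in cell 0, samples a goal among the cells visited so far,
    walks to it ('Go'), then takes `post_explore_steps` random steps ('post-explore').
    Returns the set of all visited cells.
    """
    rng = random.Random(seed)
    visited = {0}
    for _ in range(episodes):
        # Goal sampling: uniform over already-visited cells (an assumption of this sketch).
        goal = rng.choice(sorted(visited))
        state = 0
        # 'Go' phase: deterministic walk to the goal, standing in for a learned policy.
        while state != goal:
            state += 1 if goal > state else -1
            visited.add(state)
        # Post-exploration: random steps taken after the goal has been reached.
        for _ in range(post_explore_steps):
            state = min(size - 1, max(0, state + rng.choice((-1, 1))))
            visited.add(state)
    return visited


if __name__ == "__main__":
    without_post = run_imgep(post_explore_steps=0)
    with_post = run_imgep(post_explore_steps=5)
    print("cells visited without post-exploration:", len(without_post))
    print("cells visited with post-exploration:", len(with_post))
```

With `post_explore_steps=0` this toy agent never leaves the start cell, because goals are drawn only from states it has already seen and the 'Go' walk adds no randomness. That is an artifact of the simplifying assumptions, not a result from the paper, but it shows why the step taken after reaching the goal is where new states can enter the goal set.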


Paper Citation


in Harvard Style

Yang Z., Moerland T., Preuss M. and Plaat A. (2023). First Go, then Post-Explore: The Benefits of Post-Exploration in Intrinsic Motivation. In Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-623-1, pages 27-34. DOI: 10.5220/0011612800003393


in Bibtex Style

@conference{icaart23,
author={Zhao Yang and Thomas Moerland and Mike Preuss and Aske Plaat},
title={First Go, then Post-Explore: The Benefits of Post-Exploration in Intrinsic Motivation},
booktitle={Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2023},
pages={27-34},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011612800003393},
isbn={978-989-758-623-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - First Go, then Post-Explore: The Benefits of Post-Exploration in Intrinsic Motivation
SN - 978-989-758-623-1
AU - Yang Z.
AU - Moerland T.
AU - Preuss M.
AU - Plaat A.
PY - 2023
SP - 27
EP - 34
DO - 10.5220/0011612800003393