A History-based Framework for Online Continuous Action Ensembles in Deep Reinforcement Learning

Renata Oliveira, Wouter Caarls

Abstract

This work investigates action-ensemble techniques for deep reinforcement learning that reduce hyperparameter tuning effort and improve performance and robustness, while avoiding parallel environments so that the system remains applicable to real-world robotics. The approach is a history-based framework in which different DDPG policies are trained online. The framework’s contributions lie in maintaining a temporal moving average of policy scores and in selecting the actions of the best-scoring policies using a single environment. To measure the sensitivity of the ensemble algorithm to hyperparameter settings, groups were created that mix different numbers of good and bad DDPG parameterizations. In the bipedal robot half cheetah environment, the framework’s best strategy surpassed the baseline by 45%, even when not all hyperparameters were good. It also showed lower overall variance and superior results when most parameterizations were bad.
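
As a rough illustration of the history-based selection described in the abstract, the following Python sketch keeps a moving average of per-policy scores and returns the action of the policy whose average is currently highest. It is a minimal sketch under stated assumptions, not the authors' implementation: the class `HistoryEnsemble`, the helper `closeness_to_mean`, the smoothing factor `alpha`, and the scoring rule itself (negative distance to the ensemble mean action) are all hypothetical, since the abstract does not specify how scores are computed. It also picks a single best policy, whereas the paper may combine several.

```python
import math


class HistoryEnsemble:
    """Keeps an exponential moving average of per-policy scores."""

    def __init__(self, n_policies, alpha=0.1):
        self.alpha = alpha  # smoothing factor (assumed value)
        self.scores = [0.0] * n_policies

    def update(self, instant_scores):
        # Blend each new score into that policy's running average.
        for i, s in enumerate(instant_scores):
            self.scores[i] = (1 - self.alpha) * self.scores[i] + self.alpha * s

    def select(self, actions):
        # Return the index and action of the policy with the best average.
        best = max(range(len(actions)), key=lambda i: self.scores[i])
        return best, actions[best]


def closeness_to_mean(actions):
    # Hypothetical instantaneous score: negative Euclidean distance between
    # a policy's proposed action and the ensemble mean action.
    dim = len(actions[0])
    mean = [sum(a[d] for a in actions) / len(actions) for d in range(dim)]
    return [-math.dist(a, mean) for a in actions]


# One control step: every policy proposes an action for the same state,
# the history is updated, and a single action is sent to the environment.
ensemble = HistoryEnsemble(n_policies=3)
proposed = [[0.1, 0.2], [0.1, 0.3], [0.9, -0.8]]
ensemble.update(closeness_to_mean(proposed))
best_index, action = ensemble.select(proposed)
```

Because only the selected action is executed, a loop built this way needs just one environment instance, which matches the single-environment constraint stated above.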


Paper Citation


in Harvard Style

Oliveira R. and Caarls W. (2021). A History-based Framework for Online Continuous Action Ensembles in Deep Reinforcement Learning. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-484-8, pages 580-588. DOI: 10.5220/0010199005800588


in Bibtex Style

@conference{icaart21,
author={Renata Oliveira and Wouter Caarls},
title={A History-based Framework for Online Continuous Action Ensembles in Deep Reinforcement Learning},
booktitle={Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2021},
pages={580-588},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010199005800588},
isbn={978-989-758-484-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - A History-based Framework for Online Continuous Action Ensembles in Deep Reinforcement Learning
SN - 978-989-758-484-8
AU - Oliveira R.
AU - Caarls W.
PY - 2021
SP - 580
EP - 588
DO - 10.5220/0010199005800588