LSTM-based Abstraction of Hetero Observation and Transition in Non-Communicative Multi-Agent Reinforcement Learning

Fumito Uwano

2022

Abstract

This study focuses on noncommunicative multiagent learning with hetero-information, where agents observe each other at different resolutions of information. A new method is proposed for adapting to the time dimension of the hetero-information in the observations by extending the Asynchronous Advantage Actor–Critic (A3C) algorithm. Profit minimizing reinforcement learning with oblivion of memory was the previously used noncommunicative and cooperative learning method in multiagent reinforcement learning. The proposed method inserts a long short-term memory (LSTM) module into the A3C neural network to adapt to the time-dimension influence of the hetero-information. The experiments investigate the performance of the proposed method in the hetero-information environment in terms of the effectiveness of LSTM. The experimental results show that: (1) the proposed method performs better than A3C. Without the LSTM module, the proposed method enabled the agents’ learning to converge. (2) LSTM can adapt the time dimension of the input information.
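As a rough illustration of placing an LSTM module between the observation input and the actor and critic outputs, the sketch below implements one forward step of a toy recurrent actor-critic network in plain Python. It is not the paper's architecture: the class name `LSTMActorCritic`, the layer sizes, the random initialization, and the single-layer heads are all assumptions made for this example, and no A3C training loop or asynchronous workers are shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMActorCritic:
    """Hypothetical toy network: observation -> LSTM cell -> policy and value heads."""

    def __init__(self, obs_size, hidden_size, n_actions, seed=0):
        rng = random.Random(seed)
        in_size = obs_size + hidden_size
        self.hidden_size = hidden_size
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.W = {k: rand_matrix(hidden_size, in_size, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_size for k in "ifog"}
        self.W_pi = rand_matrix(n_actions, hidden_size, rng)  # actor head
        self.W_v = rand_matrix(1, hidden_size, rng)           # critic head

    def initial_state(self):
        # Hidden state h and cell state c both start at zero.
        return [0.0] * self.hidden_size, [0.0] * self.hidden_size

    def step(self, obs, state):
        h, c = state
        x = list(obs) + h  # concatenate current observation with previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], x), self.b[k])]
               for k in "ifog"}
        i = [sigmoid(a) for a in pre["i"]]
        f = [sigmoid(a) for a in pre["f"]]
        o = [sigmoid(a) for a in pre["o"]]
        g = [math.tanh(a) for a in pre["g"]]
        # Standard LSTM update: the cell state carries information across time steps.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        # Actor head: softmax over action logits (shifted by the max for stability).
        logits = matvec(self.W_pi, h_new)
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        policy = [e / total for e in exps]
        # Critic head: scalar state-value estimate.
        value = matvec(self.W_v, h_new)[0]
        return policy, value, (h_new, c_new)


# Usage: feed a short observation sequence, carrying the recurrent state forward.
net = LSTMActorCritic(obs_size=4, hidden_size=8, n_actions=5)
state = net.initial_state()
for obs in ([1, 0, 0, 0], [0, 1, 0, 0]):
    policy, value, state = net.step(obs, state)
```

Because the recurrent state is passed from step to step, the policy and value at a given time depend on the history of observations rather than on the current observation alone, which is the property the abstract attributes to the LSTM module.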


Paper Citation


in Harvard Style

Uwano F. (2022). LSTM-based Abstraction of Hetero Observation and Transition in Non-Communicative Multi-Agent Reinforcement Learning. In Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-547-0, pages 172-179. DOI: 10.5220/0010795700003116


in Bibtex Style

@conference{icaart22,
author={Fumito Uwano},
title={LSTM-based Abstraction of Hetero Observation and Transition in Non-Communicative Multi-Agent Reinforcement Learning},
booktitle={Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2022},
pages={172--179},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010795700003116},
isbn={978-989-758-547-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - LSTM-based Abstraction of Hetero Observation and Transition in Non-Communicative Multi-Agent Reinforcement Learning
SN - 978-989-758-547-0
AU - Uwano F.
PY - 2022
SP - 172
EP - 179
DO - 10.5220/0010795700003116