A Study on Cooperative Action Selection Considering Unfairness in Decentralized Multiagent Reinforcement Learning

Toshihiro Matsui, Hiroshi Matsuo

2017

Abstract

Reinforcement learning has been studied as a basis for cooperative learning and optimization methods in multiagent systems. In several frameworks of multiagent reinforcement learning, the whole problem of the system is decomposed into local problems for the individual agents. To choose an appropriate cooperative action, the agents run an optimization method that can be executed in a distributed manner. While the conventional goal of the learning is to maximize the total reward among the agents, unfairness among agents is also critical in practical resource allocation problems. Several recent studies of decentralized optimization methods have therefore considered unfairness as a criterion. We address an action selection method for decentralized reinforcement learning that is based on the leximin criterion and reduces the unfairness among agents. We experimentally evaluated the effects and influence of the proposed approach on classes of sensor network problems.
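
As a concrete illustration of the leximin criterion mentioned in the abstract, the following Python sketch compares vectors of per-agent values by sorting each vector in ascending order and comparing the sorted vectors lexicographically, so that improving the worst-off agent takes priority. This is only a minimal, centralized sketch under our own assumptions; the function names (leximin_key, leximin_better, select_action) and the toy candidate actions are hypothetical and do not reproduce the paper's distributed action selection over the agents' local problems.

# Minimal sketch (not the authors' implementation) of leximin comparison
# of per-agent value vectors and a greedy action choice based on it.

from typing import List, Sequence


def leximin_key(values: Sequence[float]) -> List[float]:
    """Sorted copy of the per-agent values; larger keys are leximin-better."""
    return sorted(values)


def leximin_better(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if vector `a` is strictly preferred to `b` under leximin."""
    return leximin_key(a) > leximin_key(b)


def select_action(candidates):
    """Pick the candidate whose per-agent value vector is leximin-maximal.

    `candidates` maps a (joint) action to the vector of estimated values
    (e.g., learned Q-values), one entry per agent.  This centralized loop
    only illustrates the ordering.
    """
    best_action, best_values = None, None
    for action, values in candidates.items():
        if best_values is None or leximin_better(values, best_values):
            best_action, best_values = action, values
    return best_action


if __name__ == "__main__":
    # Hypothetical example: maximizing the total reward alone would prefer
    # "a1" (sum 12 vs. 11), but leximin prefers "a2" because its worst-off
    # agent receives 3 instead of 1.
    candidates = {
        "a1": [1.0, 5.0, 6.0],   # sum = 12, min = 1
        "a2": [3.0, 4.0, 4.0],   # sum = 11, min = 3
    }
    print(select_action(candidates))  # -> "a2"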



Paper Citation


in Harvard Style

Matsui T. and Matsuo H. (2017). A Study on Cooperative Action Selection Considering Unfairness in Decentralized Multiagent Reinforcement Learning. In Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-219-6, pages 88-95. DOI: 10.5220/0006203800880095


in Bibtex Style

@conference{icaart17,
author={Toshihiro Matsui and Hiroshi Matsuo},
title={A Study on Cooperative Action Selection Considering Unfairness in Decentralized Multiagent Reinforcement Learning},
booktitle={Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2017},
pages={88-95},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006203800880095},
isbn={978-989-758-219-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - A Study on Cooperative Action Selection Considering Unfairness in Decentralized Multiagent Reinforcement Learning
SN - 978-989-758-219-6
AU - Matsui T.
AU - Matsuo H.
PY - 2017
SP - 88
EP - 95
DO - 10.5220/0006203800880095