A Mutual Influence-based Learning Algorithm

Stefan Rudolph, Sven Tomforde, Jörg Hähner

Abstract

Robust and optimized agent behavior can be achieved by embedding learning mechanisms in the underlying adaptive control strategies. To this end, a classic feedback-loop concept is used that chooses the best action for an observed situation and learns its success by analyzing the achieved performance. This typically reflects only the local scope of an agent and neglects the existence of other agents whose behavior affects the reward calculation. However, there are significant mutual influences within the agent population. For instance, the success of a smart camera's control strategy (in terms of person detection or 3D reconstruction) depends largely on the current strategy performed by its spatial neighbors. In this paper, we compare two concepts for considering such influences within the adaptive control strategy: Distributed W-Learning and Q-Learning combined with mutual influence detection. We demonstrate that performance can be improved significantly when detected influences are taken into account.
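The second concept named above, Q-Learning combined with mutual influence detection, can be illustrated with a minimal sketch: a standard tabular Q-learner whose state is augmented with the configuration of an influencing neighbor, so the value function can condition on it. This is an assumption-laden illustration, not the paper's implementation; all names (`InfluenceAwareQLearner`, `ALPHA`, etc.) and the exact state encoding are hypothetical.

```python
import random
from collections import defaultdict

# Illustrative constants, not taken from the paper.
ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.2  # exploration rate

class InfluenceAwareQLearner:
    """Tabular Q-learning over a state augmented with a neighbor's
    configuration, sketching the idea of conditioning the local
    learner on a detected mutual influence."""

    def __init__(self, actions):
        self.actions = actions
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def augmented_state(self, local_state, neighbor_config):
        # Fold the influencing neighbor's current configuration into
        # the state so that learned values depend on it.
        return (local_state, neighbor_config)

    def choose(self, state):
        # Epsilon-greedy action selection.
        if random.random() < EPSILON:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update rule (Watkins and Dayan, 1992).
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + GAMMA * best_next
        self.q[(state, action)] += ALPHA * (td_target - self.q[(state, action)])
```

In a smart-camera setting, `local_state` might encode the camera's own alignment and `neighbor_config` the alignment of a spatially neighboring camera found to be influential; without the augmentation, the learner would attribute reward changes caused by the neighbor to its own actions.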



Paper Citation


in Harvard Style

Rudolph S., Tomforde S. and Hähner J. (2016). A Mutual Influence-based Learning Algorithm. In Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-172-4, pages 181-189. DOI: 10.5220/0005697001810189


in Bibtex Style

@conference{icaart16,
author={Stefan Rudolph and Sven Tomforde and Jörg Hähner},
title={A Mutual Influence-based Learning Algorithm},
booktitle={Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2016},
pages={181-189},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005697001810189},
isbn={978-989-758-172-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - A Mutual Influence-based Learning Algorithm
SN - 978-989-758-172-4
AU - Rudolph S.
AU - Tomforde S.
AU - Hähner J.
PY - 2016
SP - 181
EP - 189
DO - 10.5220/0005697001810189