AUTOMATIC STATE SPACE AGGREGATION USING A DENSITY BASED TECHNIQUE

Steven Loscalzo, Robert Wright

Abstract

Applying reinforcement learning techniques in continuous environments is challenging because there are infinitely many states to visit in order to learn an optimal policy. To make this situation tractable, abstractions are often used to reduce the infinite state space down to a small and finite one. Some of the more powerful and commonplace abstractions, tiling abstractions such as CMAC, work by aggregating many base states into a single abstract state. Unfortunately, significant manual effort is often necessary in order to apply them to nontrivial control problems. Here we develop an automatic state space aggregation algorithm, Maximum Density Separation, which can produce a meaningful abstraction with minimal manual effort. This method leverages the density of observations in the space to construct a partition and aggregate states in a dense region to the same abstract state. We show that the abstractions produced by this method on two benchmark reinforcement learning problems can outperform fixed tiling methods in terms of both the convergence rate of a learning algorithm and the number of abstract states needed.
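The idea of aggregating states that fall in the same dense region can be illustrated with a small sketch. This is not the Maximum Density Separation algorithm itself, which the abstract does not specify. It is a simplified one-dimensional stand-in that assumes a wide gap between consecutive sorted observations marks a low-density region, and places a partition boundary in the middle of each such gap. The function names `density_cuts` and `abstract_state`, the `min_gap` parameter, and the sample observations are all hypothetical.

```python
from bisect import bisect_right


def density_cuts(observations, min_gap):
    """Return partition boundaries placed in the middle of sparse gaps.

    Assumption for illustration only: a gap wider than `min_gap` between
    consecutive sorted observations is treated as a low-density region.
    """
    xs = sorted(observations)
    cuts = []
    for left, right in zip(xs, xs[1:]):
        if right - left > min_gap:
            cuts.append((left + right) / 2.0)
    return cuts


def abstract_state(x, cuts):
    """Map a continuous value to the index of the abstract state containing it."""
    return bisect_right(cuts, x)


# Hypothetical observations forming three dense clusters on one state variable.
observations = [0.10, 0.12, 0.15, 0.90, 0.95, 1.00, 2.50, 2.55]
cuts = density_cuts(observations, min_gap=0.5)
states = [abstract_state(x, cuts) for x in observations]
print(cuts)
print(states)
```

In this sketch every observation in a dense cluster maps to the same abstract state index, so the number of abstract states follows from the data rather than from a fixed tile width chosen by hand.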



Paper Citation


in Harvard Style

Loscalzo S. and Wright R. (2011). AUTOMATIC STATE SPACE AGGREGATION USING A DENSITY BASED TECHNIQUE. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-40-9, pages 249-256. DOI: 10.5220/0003150202490256


in Bibtex Style

@conference{icaart11,
author={Steven Loscalzo and Robert Wright},
title={AUTOMATIC STATE SPACE AGGREGATION USING A DENSITY BASED TECHNIQUE},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2011},
pages={249-256},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003150202490256},
isbn={978-989-8425-40-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - AUTOMATIC STATE SPACE AGGREGATION USING A DENSITY BASED TECHNIQUE
SN - 978-989-8425-40-9
AU - Loscalzo S.
AU - Wright R.
PY - 2011
SP - 249
EP - 256
DO - 10.5220/0003150202490256