# A NEW REINFORCEMENT SCHEME FOR STOCHASTIC LEARNING AUTOMATA - Application to Automatic Control

### Florin Stoica, Emil M. Popa, Iulian Pah

#### 2008

#### Abstract

A learning automaton is an entity that learns the optimal action from its set of possible actions. It does so by performing actions on an environment and analyzing the resulting response. The response, which may be favourable or unfavourable, changes the behaviour of the automaton (the automaton learns from this response). The rule governing this behaviour change is often called a reinforcement algorithm. The term stochastic emphasizes the adaptive nature of the automaton: the environment output is stochastically related to the automaton action. The reinforcement scheme presented in this paper is shown to satisfy all necessary and sufficient conditions for absolute expediency in a stationary environment. An automaton using this scheme is guaranteed to “do better” at every time step than at the previous step. Simulation results are presented, which show that our algorithm converges to a solution faster than the one previously defined in (Ünsal, 1999). Using stochastic learning automata techniques, we introduce a decision/control method for intelligent vehicles in an infrastructure-managed architecture. The aim is to design an automata system that can learn the best possible action based on the data received from on-board sensors or from the localization system of the highway infrastructure. A multi-agent approach is used for effective implementation. Each vehicle has an associated “driver” agent, hosted on a JADE platform.
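The abstract does not state the update rule of the new scheme, so the following is only a minimal sketch of the general idea, assuming the classic linear reward-inaction scheme (L_R-I), a standard scheme known to be absolutely expedient (see Narendra and Thathachar, 1989). It is not the scheme proposed in this paper. The function names, the learning rate, and the reward probabilities of the simulated stationary environment are all hypothetical and chosen for illustration.

```python
import random


def choose_action(probs, rng):
    """Sample an action index according to the action probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reward_inaction_update(probs, action, favourable, rate=0.1):
    """Linear reward-inaction (L_R-I) update (assumed here for illustration).

    On a favourable response, probability mass moves toward the chosen
    action; on an unfavourable response the vector is left unchanged.
    """
    if not favourable:
        return list(probs)
    # Shrink every probability by (1 - rate), then give the freed mass
    # to the chosen action, so the vector still sums to 1.
    updated = [(1.0 - rate) * p for p in probs]
    updated[action] += rate
    return updated


def simulate(reward_probs, steps=5000, rate=0.05, seed=0):
    """Run one automaton against a stationary random environment.

    reward_probs[i] is the (hypothetical) probability that action i
    receives a favourable response.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(reward_probs)] * len(reward_probs)
    for _ in range(steps):
        action = choose_action(probs, rng)
        favourable = rng.random() < reward_probs[action]
        probs = reward_inaction_update(probs, action, favourable, rate)
    return probs


if __name__ == "__main__":
    final = simulate([0.2, 0.8, 0.5])
    print([round(p, 3) for p in final])
```

With a small learning rate the probability vector typically concentrates on the action with the highest reward probability, although L_R-I can also lock onto a suboptimal action in some runs.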

#### References

- Baba, N., 1984. New Topics in Learning Automata: Theory and Applications, Lecture Notes in Control and Information Sciences, Berlin, Germany: Springer-Verlag.
- Barto, A., Mahadevan, S., 2003. Recent advances in hierarchical reinforcement learning, Discrete-Event Systems journal, Special issue on Reinforcement Learning.
- Bigus, J. P., Bigus, J., 2001. Constructing Intelligent Agents using Java, 2nd ed., John Wiley & Sons, Inc.
- Buffet, O., Dutech, A., Charpillet, F., 2001. Incremental reinforcement learning for designing multi-agent systems, In J. P. Müller, E. Andre, S. Sen, and C. Frasson, editors, Proceedings of the Fifth International Conference on Autonomous Agents, pp. 31-32, Montreal, Canada, ACM Press.
- Lakshmivarahan, S., Thathachar, M.A.L., 1973. Absolutely Expedient Learning Algorithms for Stochastic Automata, IEEE Transactions on Systems, Man and Cybernetics, vol. SMC-6, pp. 281-286.
- Moody, J., Liu, Y., Saffell, M., Youn, K., 2004. Stochastic direct reinforcement: Application to simple games with recurrence, In Proceedings of Artificial Multiagent Learning. Papers from the 2004 AAAI Fall Symposium, Technical Report FS-04-02.
- Narendra, K. S., Thathachar, M. A. L., 1989. Learning Automata: an introduction, Prentice-Hall.
- Rivero, C., 2003. Characterization of the absolutely expedient learning algorithms for stochastic automata in a non-discrete space of actions, ESANN'2003 proceedings - European Symposium on Artificial Neural Networks, Bruges (Belgium), ISBN 2-930307-03-X, pp. 307-312.
- Stoica, F., Popa, E. M., 2007. An Absolutely Expedient Learning Algorithm for Stochastic Automata, WSEAS Transactions on Computers, Issue 2, Volume 6, ISSN 1109-2750, pp. 229-235.
- Sutton, R., Barto, A., 1998. Reinforcement learning: An introduction, MIT Press, Cambridge, MA.
- Ünsal, C., Kachroo, P., Bay, J. S., 1999. Multiple Stochastic Learning Automata for Vehicle Path Control in an Automated Highway System, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 29, no. 1, January 1999.

Figure 3: The class diagram of the simulator.

#### Paper Citation

#### in Harvard Style

Stoica F., Popa E. M. and Pah I. (2008). **A NEW REINFORCEMENT SCHEME FOR STOCHASTIC LEARNING AUTOMATA - Application to Automatic Control**. In *Proceedings of the International Conference on e-Business - Volume 1: ICE-B, (ICETE 2008)*, ISBN 978-989-8111-58-6, pages 45-50. DOI: 10.5220/0001909800450050

#### in Bibtex Style

@conference{ice-b08,

author={Florin Stoica and Emil M. Popa and Iulian Pah},

title={A NEW REINFORCEMENT SCHEME FOR STOCHASTIC LEARNING AUTOMATA - Application to Automatic Control},

booktitle={Proceedings of the International Conference on e-Business - Volume 1: ICE-B, (ICETE 2008)},

year={2008},

pages={45-50},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0001909800450050},

isbn={978-989-8111-58-6},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the International Conference on e-Business - Volume 1: ICE-B, (ICETE 2008)

TI - A NEW REINFORCEMENT SCHEME FOR STOCHASTIC LEARNING AUTOMATA - Application to Automatic Control

SN - 978-989-8111-58-6

AU - Stoica F.

AU - Popa E. M.

AU - Pah I.

PY - 2008

SP - 45

EP - 50

DO - 10.5220/0001909800450050