# STOCHASTIC CONTROL STRATEGIES AND ADAPTIVE CRITIC METHODS

### Randa Herzallah, David Lowe

#### 2008

#### Abstract

Adaptive critic methods have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Because they approximate the dynamic programming solutions, they are potentially suitable for learning in noisy, nonlinear and nonstationary environments. In this study, a novel probabilistic dual heuristic programming (DHP) based adaptive critic controller is proposed. Unlike current approaches, the proposed probabilistic DHP adaptive critic method takes the uncertainties of the forward model and the inverse controller into consideration. It is therefore suitable for deterministic and stochastic control problems characterized by functional uncertainty. The theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function that satisfies the Bellman equation in a linear quadratic control problem. The target value of the critic network is then calculated and shown to be equal to the analytically derived correct value.
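The validation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' probabilistic method: it assumes the standard deterministic, undiscounted linear quadratic problem, with dynamics x_next = A x + B u and stage cost x'Qx + u'Ru. The matrices `A`, `B`, `Q`, `R` and the helper names `solve_riccati` and `dhp_critic_target` are illustrative assumptions that do not appear in the paper. Under these assumptions the cost-to-go is x'Px, where P solves the discrete Riccati equation, so the exact DHP critic output (the gradient of the cost-to-go with respect to the state) is 2Px. The sketch checks that the usual DHP target, built from the Bellman recursion, reproduces that value.

```python
import numpy as np


def solve_riccati(A, B, Q, R, iters=2000):
    """Fixed-point iteration on the discrete-time Riccati equation.

    Returns P (cost-to-go matrix) and the feedback gain K with u = -K x.
    """
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Equivalent to Q + A'PA - A'PB (R + B'PB)^{-1} B'PA
        P = Q + A.T @ P @ (A - B @ K)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


def dhp_critic_target(x, A, B, Q, R, K, critic):
    """Deterministic DHP target: total derivative of (stage cost + next
    cost-to-go) with respect to the current state, under u = -K x."""
    u = -K @ x
    x_next = A @ x + B @ u
    lam_next = critic(x_next)        # critic estimate of dJ/dx at x_next
    dU_dx = 2.0 * Q @ x              # direct state term of the stage cost
    dU_du = 2.0 * R @ u              # control term, chained through u = -K x
    closed_loop = A - B @ K          # total derivative of x_next w.r.t. x
    return dU_dx + (-K).T @ dU_du + closed_loop.T @ lam_next


# Hypothetical example system (a discretized double integrator).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])

P, K = solve_riccati(A, B, Q, R)
x = np.array([1.0, -0.5])

# Use the analytic gradient 2Px as the critic and form the DHP target.
target = dhp_critic_target(x, A, B, Q, R, K, critic=lambda s: 2.0 * P @ s)
analytic = 2.0 * P @ x
print("target matches analytic gradient:", np.allclose(target, analytic))
```

The agreement follows from the Lyapunov form of the Riccati equation, P = Q + K'RK + (A - BK)'P(A - BK), which is exactly what the three terms of the target add up to once the factor 2x is pulled out.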

#### Paper Citation

#### in Harvard Style

Herzallah R. and Lowe D. (2008). **STOCHASTIC CONTROL STRATEGIES AND ADAPTIVE CRITIC METHODS**. In *Proceedings of the Fifth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO*, ISBN 978-989-8111-30-2, pages 281-288. DOI: 10.5220/0001481902810288

#### in Bibtex Style

@conference{icinco08,

author={Randa Herzallah and David Lowe},

title={STOCHASTIC CONTROL STRATEGIES AND ADAPTIVE CRITIC METHODS},

booktitle={Proceedings of the Fifth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},

year={2008},

pages={281-288},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0001481902810288},

isbn={978-989-8111-30-2},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the Fifth International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO

TI - STOCHASTIC CONTROL STRATEGIES AND ADAPTIVE CRITIC METHODS

SN - 978-989-8111-30-2

AU - Herzallah R.

AU - Lowe D.

PY - 2008

SP - 281

EP - 288

DO - 10.5220/0001481902810288