ACKNOWLEDGEMENTS
This research was funded by Fundação para a Ciência
e a Tecnologia, grant number
SFRH/BD/145723/2019 - UID/CEC/00127/2019.
REFERENCES
Ba, L. J., Kiros, J. R., & Hinton, G. E. (2016). Layer
Normalization. CoRR, abs/1607.06450.
Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural
Machine Translation by Jointly Learning to Align and
Translate. In 3rd International Conference on Learning
Representations, ICLR 2015.
Bengio, Y. (2012). Practical recommendations for gradient-
based training of deep architectures. CoRR,
abs/1206.5533.
Bengio, Y., LeCun, Y., & Hinton, G. E. (2021). Deep
learning for AI. Commun. ACM, 64(7), 58–65.
Brockman, G., Cheung, V., Pettersson, L., Schneider, J.,
Schulman, J., Tang, J., & Zaremba, W. (2016). OpenAI
Gym. CoRR, abs/1606.01540.
Graves, A., Mohamed, A., & Hinton, G. E. (2013). Speech
recognition with deep recurrent neural networks. In
IEEE International Conference on Acoustics, Speech
and Signal Processing, ICASSP 2013.
Ha, D., & Schmidhuber, J. (2018). World Models. CoRR,
abs/1803.10122.
Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term
Memory. Neural Computation, 9(8), 1735–1780.
Humphrey, E. J., Bello, J. P., & LeCun, Y. (2012). Moving
Beyond Feature Design: Deep Architectures and
Automatic Feature Learning in Music Informatics. In
Proceedings of the 13th International Society for Music
Information Retrieval Conference, ISMIR 2012. (pp.
403–408).
Ioffe, S., & Szegedy, C. (2015). Batch Normalization:
Accelerating Deep Network Training by Reducing
Internal Covariate Shift. In Proceedings of the 32nd
International Conference on Machine Learning, ICML
2015. (Vol. 37, pp. 448–456). JMLR.org.
Kingma, D. P., & Ba, J. (2015). Adam: A Method for
Stochastic Optimization. In 3rd International
Conference on Learning Representations, ICLR 2015.
LeCun, Y., Bengio, Y., & Hinton, G. E. (2015). Deep
learning. Nat., 521(7553), 436–444.
Machado, M. C., Bellemare, M. G., Talvitie, E., Veness, J.,
Hausknecht, M. J., & Bowling, M. (2018). Revisiting
the Arcade Learning Environment: Evaluation
Protocols and Open Problems for General Agents. J.
Artif. Intell. Res., 61, 523–562.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T.
P., Harley, T., Silver, D., & Kavukcuoglu, K. (2016).
Asynchronous Methods for Deep Reinforcement
Learning. In Proceedings of the 33rd International
Conference on Machine Learning, ICML 2016. (Vol.
48, pp. 1928–1937). JMLR.org.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A.,
Veness, J., Bellemare, M. G., Graves, A., Riedmiller,
M. A., Fidjeland, A., Ostrovski, G., Petersen, S.,
Beattie, C., Sadik, A., Antonoglou, I., King, H.,
Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D.
(2015). Human-level control through deep
reinforcement learning. Nat., 518(7540), 529–533.
Mott, A., Zoran, D., Chrzanowski, M., Wierstra, D., &
Rezende, D. J. (2019). Towards Interpretable
Reinforcement Learning Using Attention Augmented
Agents. In Advances in Neural Information Processing
Systems 32: Annual Conference on Neural Information
Processing Systems, NeurIPS 2019. (pp.
12329–12338).
Schulman, J., Moritz, P., Levine, S., Jordan, M. I., &
Abbeel, P. (2016). High-Dimensional Continuous
Control Using Generalized Advantage Estimation. In
4th International Conference on Learning
Representations, ICLR 2016.
Sermanet, P., Chintala, S., & LeCun, Y. (2012).
Convolutional neural networks applied to house
numbers digit classification. In Proceedings of the 21st
International Conference on Pattern Recognition,
ICPR 2012. (pp. 3288–3291).
Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W.-K.,
& Woo, W. (2015). Convolutional LSTM Network: A
Machine Learning Approach for Precipitation
Nowcasting. In Advances in Neural Information
Processing Systems 28: Annual Conference on Neural
Information Processing Systems 2015. (pp. 802–810).
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L.,
van den Driessche, G., Schrittwieser, J., Antonoglou, I.,
Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe,
D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap,
T. P., Leach, M., Kavukcuoglu, K., Graepel, T., &
Hassabis, D. (2016). Mastering the game of Go with
deep neural networks and tree search. Nat., 529(7587),
484–489.
Sorokin, I., Seleznev, A., Pavlov, M., Fedorov, A., &
Ignateva, A. (2015). Deep Attention Recurrent Q-
Network. CoRR, abs/1512.01693.
Srivastava, N., Mansimov, E., & Salakhutdinov, R. (2015).
Unsupervised Learning of Video Representations using
LSTMs. In Proceedings of the 32nd International
Conference on Machine Learning, ICML 2015. (Vol.
37, pp. 843–852). JMLR.org.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017).
Attention is All you Need. In Advances in Neural
Information Processing Systems 30: Annual
Conference on Neural Information Processing Systems
2017. (pp. 5998–6008).
Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A. C.,
Salakhutdinov, R., Zemel, R. S., & Bengio, Y. (2015).
Show, Attend and Tell: Neural Image Caption
Generation with Visual Attention. In Proceedings of the
32nd International Conference on Machine Learning,
ICML 2015. (Vol. 37, pp. 2048–2057). JMLR.org.
Zambaldi, V. F., Raposo, D., Santoro, A., Bapst, V., Li, Y.,
Babuschkin, I., Tuyls, K., Reichert, D. P., Lillicrap, T.