
and revealed several defects. Although the scenario-based
testing approach has so far been evaluated only on OCE,
using OCE-specific tooling, we believe that its general
principles, which are independent of both OCE and the
application domain, are transferable to other online
learning systems.
ICSOFT 2025 - 20th International Conference on Software Technologies