collaborative development of ML-based systems. We
propose a set of good practices for defining
communication interfaces and technical designs for
creating autonomous pipelines that can help streamline
the development process and improve the scalability
and maintainability of ML-based systems.
To address the difficulties encountered during the
collaborative development of our ML-based system,
we grouped them into a series of challenges. We then
designed solutions to these challenges based on
state-of-the-art experiments and practices reported in
the literature. Our aim was to combine the knowledge
gained from a survey of proposed practices with our
own experience in order to design more general and
standardized practices that can be applied in future
projects.
The solutions we proposed take into account the
specific needs and constraints of our project, as well
as the broader context of industrial ML development.
We believe that our approach can help improve the
efficiency and effectiveness of collaborative ML
development efforts, while also contributing to the
development of more robust and scalable ML-based
systems.
ACKNOWLEDGEMENTS
This research was partially supported by the DataSEER
project, financed through POC 2014-2020, Action
1.2.1, by the European Commission and the National
Government of Romania (Project ID: 121004).
The authors express their gratitude to the industrial
partner, OPTIMA GROUP SRL, for their collaboration
and valuable information exchange.