the principles of systems engineering, software
engineering, computer science and human-centred
design to create AI systems according to human needs
for mission outcomes’.
The use of software engineering techniques for
the construction of AI systems builds on the
principles established by Horneman et al.
(Horneman, et al., 2019), who can be considered
pioneers of this new area. Given that AI systems are
software-intensive systems, it follows that:
‘Established principles for designing and
deploying quality software systems that meet
their mission objectives on time should be
applied to engineering AI systems.’
‘Teams should strive to deliver functionality on
time and with quality, design for important
architectural requirements (such as security,
usability, reliability, performance and
scalability), and plan for system maintenance
throughout the life of the system.’
As van Oort et al. point out, ‘Artificial
intelligence (AI) is pervasive in today's landscape,
there is still a lack of software engineering expertise
and best practices in this field’ (van Oort, et al.,
2021). Although developers of AI-intensive systems
are aware of the need to employ these techniques,
they also recognise that traditional software
engineering methods and tools are neither adequate
nor sufficient on their own and need to be adapted
and extended (Lavazza & Morasca, 2021).
For all these reasons, quality assurance methods
must evolve to cover ‘software 2.0’, i.e., what can be
called ‘MLware’ (Borg, 2021).
Although techniques and practices for the quality
assurance of ML models and ML-based systems are
already beginning to emerge (Hamada, et al., 2020),
such as white-box and black-box testing (Gao, et al.,
2019), it is necessary to ensure that the tests
performed on these AI systems follow established
quality standards and criteria: performance,
reliability, scalability, security, etc. Quality assurance
of AI-based systems is therefore an area that has not
yet been well explored and requires collaboration
between the SE and AI research communities;
currently, there is a lack of (standardised) approaches
for the quality assurance of AI-based systems, even
though such approaches are essential for their
practical use (Felderer & Ramler, 2021).
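To illustrate, a black-box test treats the AI system as
an opaque function and checks its outputs against a
predefined quality criterion. The following minimal
sketch (the model, dataset and 0.90 accuracy
threshold are illustrative assumptions of ours, not
taken from any cited work) shows such a test in
Python with scikit-learn:

    # Minimal sketch of a black-box acceptance test for an ML classifier.
    # Model, dataset and threshold are illustrative assumptions.
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )
    model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

    def test_functional_correctness():
        """Black-box check: only inputs and outputs are observed."""
        accuracy = accuracy_score(y_test, model.predict(X_test))
        assert accuracy >= 0.90, f"Accuracy {accuracy:.2f} below threshold"

A white-box test would, by contrast, additionally
inspect internal elements of the model, such as
learned weights or neuron coverage.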
In the new area of Artificial Intelligence
Engineering, challenges have thus arisen related to
defining and guaranteeing the behavioural and
quality attributes of these systems and applications.
These challenges have led us to work on the
development of an environment that allows the
functional suitability of AI systems to be evaluated.
The main objective of this article is therefore to
present the results obtained in the evaluation of the
functional suitability of an artificial intelligence
system using the environment we have built (Oviedo,
et al., 2024). In addition, the differences and
difficulties encountered in evaluating an AI system,
compared to evaluating a traditional software system,
are also presented. The rest of the article is structured
as follows: Section 2 summarises the importance of
quality in AI systems and the existing proposals for
evaluating the quality of these systems. Section 3
describes the evaluation environment, based on the
ISO/IEC 25000 family of standards, used to carry out
the evaluation. Section 4 presents the AI system
evaluated and the assessment performed. Section 5
presents the conclusions drawn from this work and
future lines of work and research.
2 QUALITY IN AI SYSTEMS
For the quality assurance of artificial intelligence
systems, as in Software Engineering, it is necessary
to consider several dimensions (Borg, 2021), such as
the quality of the data, the quality of the AI system
itself, the quality of the development processes, etc.
However, as mentioned above, existing work on
software quality needs to be adapted to the
particularities of artificial intelligence. This is why
new standards have emerged in recent years that
address the quality of AI systems while accounting
for those particularities (Oviedo, et al., 2024;
Piattini, 2024).
At the level of AI system development processes,
the new standard ISO/IEC 5338 ‘Life cycle processes
for AI systems’ (ISO/IEC, 2023) has emerged. Based
on ISO/IEC 12207, it lays the foundations for the
processes that organisations should follow to ensure
the proper development of AI systems. The standard
defines 33 processes, organised into four groups,
which adapt the software life cycle processes to the
specific aspects of AI. To this end, 23 of the 33
processes have been directly adapted from ISO/IEC
12207, and three new processes have been defined:
the Knowledge Acquisition Process, the AI Data
Engineering Process and the Continuous Validation
Process (Márquez, et al., 2024).
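As an illustration of the intent of the Continuous
Validation Process, a deployed model can be re-
checked against predefined quality thresholds as new
labelled data arrives. The following Python sketch is
purely illustrative; the function names, metric and
threshold are our assumptions and are not prescribed
by the standard:

    # Hypothetical sketch in the spirit of continuous validation;
    # metric, threshold and data source are illustrative assumptions.
    import logging
    from sklearn.metrics import accuracy_score

    ACCURACY_THRESHOLD = 0.90  # assumed acceptance criterion

    def continuously_validate(model, fetch_labelled_batch):
        """Re-validate a deployed model on each new labelled batch."""
        X_new, y_new = fetch_labelled_batch()  # e.g. recent production data
        accuracy = accuracy_score(y_new, model.predict(X_new))
        if accuracy < ACCURACY_THRESHOLD:
            # In practice this could trigger an alert or retraining
            logging.warning("Accuracy %.2f below threshold", accuracy)
        return accuracy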
On the other hand, at the level of AI system
product quality, several models, techniques and, in
some cases, tools have been proposed to assess and
ensure the quality of AI systems. However, these
works are based on software engineering standards