AI Engineering for Trust by Design
André Meyer-Vitali
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI),
Stuhlsatzenhausweg 3, Saarland Informatics Campus D32, 66123 Saarbrücken, Germany
https://orcid.org/0000-0002-5242-1443
Keywords:
Software Engineering, Artificial Intelligence, Causality, Trust, Robustness, Explainability.
Abstract:
The engineering of reliable and trustworthy AI systems needs to mature. While facing unprecedented chal-
lenges, there is much to be learned from other engineering disciplines. We focus on the four pillars of (i)
Models & Explanations, (ii) Causality & Grounding, (iii) Modularity & Compositionality, and (iv) Human
Agency & Oversight. Based on these pillars, a new AI engineering discipline could emerge, which we aim to
support using corresponding methods and tools for “Trust by Design”.
1 INTRODUCTION
The current wave of Artificial Intelligence (AI) has
emerged as a leading technology in the digital trans-
formation, changing the economy, society, and our
lives, while attracting massive investment world-
wide. The past decade has been characterised by
Deep Learning (LeCun et al., 2015; Deng and Yu,
2014), Transformers (Vaswani et al., 2017; Vaswani
et al., 2023) and Large “Foundation” Models. Machine learning methods have transformed AI from a niche science to a socially relevant “mega-technology”, especially in the fields of image and video analysis,
as well as in text and language processing. This
new technology is made possible primarily by the
latest graphics processors and the availability of
vast amounts of data from social media and similar
sources.
However, we are reaching the limits of control
over these large, highly interconnected, AI-based sys-
tems. The complexity of existing AI models is often
beyond our understanding, and the methods and pro-
cesses to ensure safety, reliability, and transparency
are lacking. We must overcome these novel and seri-
ous limitations or face an inevitable dwindling of public and consumer acceptance of AI and dramatic losses in business opportunities and markets. This is already clearly visible in the automotive sector's broad re-
treat from highly automated driving. AI-based tech-
nology is also a key enabler in other economic sectors
including healthcare, mobility, energy, and the dig-
ital industry itself. All of these markets depend on
complex and highly connected AI systems designed
to support people in decision making and situational
analysis.
Despite all the successes, many are not aware that
deep learning does not support a real understanding of
the problem, but only reflects complex statistical rela-
tionships. Great disillusionment set in as problems
such as insufficient internal representation of mean-
ing (interpretability and transparency), susceptibility
to changes in the input signal (robustness), lack of
transferability to cases not covered by the data (gener-
alisation) and, last but not least, the thirst for big data
itself (efficiency, adequacy, sustainability) became ap-
parent.
Recently, however, a new overall approach to
solving these problems is being advanced under the term “Trusted AI”. Trusted AI aims to create a new gener-
ation of AI systems that guarantee functionality, al-
lowing use even in critical applications. Develop-
ers, domain experts, users, and regulators can rely on
performance and reliability even for complex socio-
technical systems. Trusted AI is characterised by
a high degree of robustness, transparency, fairness,
and verifiability, where the functionality of existing
systems is in no way compromised, but actually en-
hanced.
2 MOTIVATION
Current machine learning systems perform quite well
and reliably in the context of their training data sets.
To be useful, however, they also need to predict, clas-
sify, decide and act in situations that they were not
explicitly trained for. Therefore, they are evaluated
with test data sets that should not overlap with the
training data set. The measured level of “generalisation” is an indication of how well they can perform in general (at least with respect to the test data, which is indeed a limiting factor). Even a good level of such
generalisation is not sufficient, however, because the
systems are not able to distinguish between ”normal”
and ”abnormal” situations.
Robustness is the ability of software systems to
react appropriately to abnormal conditions (Meyer,
1997). For this purpose, it is necessary to recognise situations or contexts in which (implicit) assumptions no longer hold. Without making those assumptions
explicit, it is impossible to detect the edge of compe-
tence and to adapt accordingly. Explicit assumptions
or (world) models include rules, norms or laws. These
models include physical and natural laws (thermo-
dynamics, electromagnetism, gravity, quantum me-
chanics, etc.), legal rules, socio-cultural norms, med-
ical models (anatomy, mechanistic models of dis-
ease transmission, etc.), and others that are always
true, independent of the training data. Adaptation to
changes in or of the context may include changes of
rules. The use of model-based software engineering
allows us to exclude impossible options, to invalidate highly improbable ones, and to enforce or guide learning
and adaptation towards the most plausible and realis-
tic outcomes. Many combinations of methods using
knowledge-based reasoning models and data-driven
learning components are possible (van Bekkum et al.,
2021) and contribute to mutual system-level enhance-
ments.
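To make this combination concrete, consider the following minimal Python sketch (the learned model, the physical constraint and the tolerance are illustrative assumptions, not part of any specific system): a data-driven prediction is checked against an explicit world model, and inputs for which the implicit assumptions of the learned model no longer hold are flagged.

```python
import math

G = 9.81  # m/s^2; explicit world model: free fall without air resistance

def learned_fall_time(height_m: float) -> float:
    """Stand-in for a trained regressor: a linear fit that only holds
    for the small heights seen during training (hypothetical)."""
    return 0.32 + 0.11 * height_m

def physical_fall_time(height_m: float) -> float:
    """Explicit model: t = sqrt(2h / g), valid independently of training data."""
    return math.sqrt(2.0 * height_m / G)

def validated_prediction(height_m: float, tolerance: float = 0.2) -> dict:
    """Accept the learned prediction only if it is consistent with the explicit model."""
    learned = learned_fall_time(height_m)
    expected = physical_fall_time(height_m)
    if abs(learned - expected) > tolerance * expected:
        # The implicit assumptions of the learned model no longer hold here:
        # report the edge of competence instead of a silently wrong answer.
        return {"status": "out-of-competence", "fallback": expected}
    return {"status": "ok", "value": learned}

print(validated_prediction(10.0))    # within the training regime: accepted
print(validated_prediction(1000.0))  # far outside it: flagged and corrected
```

The same pattern generalises from physical laws to legal rules or medical models: the explicit model brackets what is possible, while the learned component refines the answer within those bounds.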
3 BUILDING TRUSTWORTHY AI
SYSTEMS FOR THE FUTURE
Some of the current problems related to a lack of
trust in AI systems are a direct result of the mas-
sive use of black-box methods that depend solely on
data (Morocho-Cayamcela et al., 2019). Instead, the
new AI generation has its foundation built on hybrid
AI systems (also known as neuro-symbolic or neuro-
explicit). These hybrids do not rely solely on data-
driven approaches but on the full range of AI tech-
nologies (“All of AI”), which includes symbolic AI
methods, search, reasoning, planning, and other op-
erations. “Trust by Design” is achieved through the
combination of Machine Learning with symbolic con-
clusions and the explicit representation of knowledge
in hybrid AI systems. Knowledge no longer needs to
be machine learned when it is represented by seman-
tic and other explicit models, which can also guide
the learning process in a direction that improves gen-
eralisation, robustness, and interpretability. This hy-
brid approach is also known as the third wave of AI
(Garcez et al., 2009; Garcez and Lamb, 2023). The
requirements are particularly strict when it comes to
applications with significant physical, economic, or
social risk. The AI systems used in such applications
are required, for example by the European AI Act, to be validated and certified.
With respect to the recent excitement about gener-
ative AI, a few critical considerations need to be high-
lighted. Generative AI is based on so-called ”Foun-
dation Models”, which can appear as Large Language
Models (LLM) or as similar models of still images
or videos. The transformer architectures that generate
these models convert huge amounts of text or other
media content into statistical models of co-occurrence
of tokens (parts of words or other features). The re-
sulting models can then be used to generate text, im-
ages and video as predictions of probabilistic patterns
of adjacency in the model’s huge space. For text, it
is also possible to extract summaries and to conduct
dialogues in natural language. At first glance, these
models for generative AI seem to understand human
language and creative expression. However, as they
are based solely on producing probabilistic assemblies of tokens, they do not even model language itself. There is no grammar involved, nor any form of semantics. Foundation Models are not trustworthy,
because they lack any kind of understanding of truth,
facts, time, space, concepts, reasons, causes and ef-
fects. As they are not consistent, transparent, robust
and reliable, it is very risky to trust them in critical
applications. Even when they seem to give reasonable
answers from time to time, it is impossible to predict
when they will fail and start to hallucinate.
3.1 Trusted AI Engineering
There is a dilemma to overcome in building trustwor-
thy AI systems (Thiebes et al., 2021; Ramchurn et al.,
2021): on the one hand, we expect AI systems to
decide autonomously and intelligently on our behalf,
which requires agency and delegation; on the other
hand, we require them to be predictable, verifiable,
safe and accountable. Of course, there are limits to
achieving all these goals and to guaranteeing correctness
under all circumstances and domains. Instead, there
is a trade-off to be made between entirely predictable
and correct versus plausible and adaptive behaviour.
What matters most is that expectations are managed
to create validated trust through experience.
When designing trustworthy AI systems, there are
several important aspects that should be considered
to guarantee the characteristics of trustworthy AI. In
principle, these aspects apply to all software systems.
However, they are of the greatest relevance for com-
plex, intelligent systems for critical applications. AI
engineering should make use of the lessons learned
from software engineering and apply its engineering
principles, such as design patterns and architectures.
A fundamental difference between traditional soft-
ware and AI systems is that the outcomes are not
necessarily deterministic, but probabilistic, and that
there may be more than one ”correct answer”. Hence,
the goal is shifting from guarantees of correctness to-
wards verifying plausibility.
Very importantly, an autonomous AI system should be aware of its level of competence and of its self-confidence in its results. The area of competence is also known as the operational design domain (ODD), but a system may also perform well beyond its designed or trained expertise by generalising out of domain (OOD).
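As a hedged sketch of such competence awareness (the ODD bounds, confidence threshold and action names are hypothetical), each decision can be gated by an explicit check against the declared operational design domain and by the system's self-reported confidence, abstaining or escalating to a human otherwise:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ODD:
    """Declared operational design domain (bounds are illustrative)."""
    max_speed_kmh: float = 60.0
    allowed_weather: tuple = ("clear", "rain")

@dataclass
class Decision:
    action: Optional[str]
    status: str  # "act", "abstain" or "escalate"

def decide(speed_kmh: float, weather: str, confidence: float,
           odd: ODD = ODD(), min_confidence: float = 0.8) -> Decision:
    # Outside the declared ODD: do not act autonomously, escalate to a human.
    if speed_kmh > odd.max_speed_kmh or weather not in odd.allowed_weather:
        return Decision(action=None, status="escalate")
    # Inside the ODD but with low self-confidence: abstain rather than guess.
    if confidence < min_confidence:
        return Decision(action=None, status="abstain")
    return Decision(action="proceed", status="act")

print(decide(speed_kmh=50, weather="clear", confidence=0.93))  # act
print(decide(speed_kmh=50, weather="snow", confidence=0.93))   # escalate: outside the ODD
```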
The following four pillars of AI Engineering are
proposed as a framework for creating Trusted AI by
Design. Each of these pillars will be described in
more detail in the sections below, with a special fo-
cus on causality.
Models & Explanations. Reliable predictions about
system behaviour for insightful and plausible ex-
planations and simulations with generalised mod-
els from knowledge and training.
Causality & Grounding. Identification and predic-
tions of cause-effect relationships for informed
predictions and anchoring of meaning in real-
world context and phenomena.
Modularity & Compositionality. Design of com-
plex systems broken down into comprehensible
and manageable parts (functions and features), re-
liably composed in system architectures.
Human Agency & Oversight. Overview, final deci-
sion and responsibility by humans for actions of
AI systems, also when delegating tasks to au-
tonomous agents in collaborative teams.
3.2 Models and Explanations
Explicit models of the world or of a suitable context in question enable reliable predictions of the behaviour of AI systems, both within the scope of the training data and outside of it, because they generalise knowledge beyond the limited and biased scope of the training data. (The term “model” is used extensively in the ML community. It is necessary, however, to distinguish between the statistical models of ML and the semantic models of knowledge engineering; here, we refer to the latter. See also (van Bekkum et al., 2021) for a unified taxonomy of AI.)
Given a certain context, which can be very narrow or
broad, explicit models represent concepts, relation-
ships and rules that are always true in that context.
For example, the laws of gravity are applicable to the
whole universe. Models can be created by experts or
learned from experience and data. Combinations of
different types of models are particularly useful and
insightful. For example, neuro-symbolic approaches
are used to achieve this (Garcez et al., 2002b; Garcez
et al., 2002a; Bader and Hitzler, 2005; Lake et al.,
2017; Yu et al., 2021). In this way, models pro-
mote transparency and explainability and, thus, make
it possible to render the behaviour of the AI systems
understandable and plausible. In simulations, models
enable understanding, through experiments, of situations that are difficult or impossible to access otherwise; privacy is thus maintained and dangerous conditions are avoided.
Because models depend on a given context or do-
main, it is essential that agents using those models are
aware of their competence in the given situation and
are able to apply suitable models or adapt to situations
gracefully when changing or leaving their scope of
competence. Each context includes a corresponding
bias. Often, attempts are made to remove bias. However, agents need to be aware of their bias and to apply it thoughtfully, because bias is a measure of information when there is awareness of it.
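One way to operationalise such context and competence awareness is a simple registry of explicit models, each valid only in a declared context, with a graceful fallback when no model applies; the following sketch is purely illustrative (contexts, models and values are hypothetical):

```python
from typing import Callable, Dict, Optional

# Registry of explicit models, each valid only within a declared context (hypothetical).
MODEL_REGISTRY: Dict[str, Callable[[float], float]] = {
    "terrestrial": lambda mass_kg: mass_kg * 9.81,  # weight on Earth
    "lunar": lambda mass_kg: mass_kg * 1.62,        # weight on the Moon
}

def weight_in_context(mass_kg: float, context: str) -> Optional[float]:
    """Apply the model that matches the context; degrade gracefully otherwise."""
    model = MODEL_REGISTRY.get(context)
    if model is None:
        # Leaving the scope of competence: no silent guess, signal the gap instead.
        print(f"No validated model for context '{context}'; deferring to an expert.")
        return None
    return model(mass_kg)

print(weight_in_context(70.0, "terrestrial"))  # 686.7
print(weight_in_context(70.0, "martian"))      # None: outside the scope of competence
```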
3.3 Causality and Grounding
The need to move from correlation to causation is be-
coming more and more evident. If we want to ex-
plain why certain predictions are made or decisions
are taken, it is essential to know their causes (Pearl
et al., 2016; Pearl and Mackenzie, 2018). Causality
refers to the ability to identify and predict cause-and-
effect relationships, i.e. which effects are the results
of which causes and why. An AI system that can un-
derstand causal relationships is able to make informed
predictions and solve complex problems. Counterfac-
tual inference can be performed in a wider scope of
domain than given by the training data alone, because
answers can be found to questions that involve hy-
potheses about changes in the past (”what would have
happened if...?”), which give reasons for alternative
outcomes in the future.
Causal Models. Structural causal models (SCM)
and Structural Equation Models (SEM) (Pearl,
2010; Pearl et al., 2016) provide a concise method
for modelling and analysing causal relationships
as graphs (SCM) and sets of equations (SEM),
as shown in figure 1 (see also the sketch after this list). SCMs support human understanding and explanations, while SEMs are more suitable for representing causality in combination with logical expressions that define the specific functions (F) relating the variables. Vari-
ables represent events, processes, states or ob-
jects.
Causal Inference. Causal inference is typically con-
cerned with the resulting effect when a corre-
sponding event (cause) occurs, according to a
given causal model, such that the respective de-
pendency can be verified. Causal inference asks
whether an event indeed causes a certain effect
by determining the likelihood that one event was
the cause of another. In contrast to statistical
correlations, causal relationships are asymmetri-
cal (Price, 1992; Kutach, 2013; Ismael, 2023), i.e.
that there is a directed relationship from a cause
to an effect, rather than a spurious co-occurrence
of events.
Counterfactuals refer to alternative choices that
could have been made in the past and the cor-
responding effects that they might have caused.
Therefore, they allow for exploring possibilities that exist only in imagined worlds – in contrast to what actually happened – by intervening on the values of specific variables and, hence, finding alternative outcomes according to the same model.
Counterfactuals can represent situations that may
not be practical for observation due to practical or
ethical reasons, which enables the causal explo-
ration of a wider scope of domain.
Causal Discovery. Even when causal models are not
known in advance, causal discovery allows for de-
termining whether a change in one variable (rep-
resenting a state, action or event) indeed causes
a change in another, in order to distinguish be-
tween correlated and causal relationships in data
and to derive corresponding models. Approaches
to make the distinction are interventions, random
control trials and counterfactual reasoning (Eber-
hardt, 2017; Zhu et al., 2020; Schölkopf et al., 2021; Schölkopf and von Kügelgen, 2022). The
use of known causal models can improve (lan-
guage) understanding and causal discovery can
bring understanding from data to a higher level,
i.e., formulate new hypotheses and insights that
transcend the previous body of knowledge (for ex-
ample, in a similar way as discovering the laws of
thermodynamics or electromagnetism).
Causal Machine Learning. Causally-informed Ma-
chine Learning (CML) uses causal models to in-
fluence and direct ML methods for improved predictability through reuse of domain knowledge, as well as explainability and robustness through interventions and counterfactuals (Vlontzos et al., 2023; Kyono and van der Schaar, 2019; Zhang et al., 2020; Rawal et al., 2023).

Figure 1: Causal Models. (a) Structural Causal Model (SCM): a graph in which the variables X and Y are the direct causes of Z. (b) Structural Equation Model (SEM): U = {X, Y}, V = {Z}, F = { f_Z : Z = 2X + 3Y }.
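To make figure 1 concrete, the following hedged Python sketch implements the SEM Z = 2X + 3Y (the noise distributions and the concrete values are illustrative assumptions) and contrasts observation, intervention with the do-operator, and a simple counterfactual query:

```python
import random

def sample_exogenous():
    """Exogenous variables U = {X, Y}; their distributions are illustrative assumptions."""
    return random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)

def f_Z(x: float, y: float) -> float:
    """Structural equation from figure 1: Z = 2X + 3Y."""
    return 2.0 * x + 3.0 * y

def observe(n: int = 5):
    """Observational regime: X and Y vary freely, Z follows the structural equation."""
    return [(x, y, f_Z(x, y)) for x, y in (sample_exogenous() for _ in range(n))]

def intervene_do_x(x_fixed: float, n: int = 5):
    """Interventional regime do(X = x_fixed): X is set by force, Y is left untouched."""
    return [(x_fixed, y, f_Z(x_fixed, y)) for _, y in (sample_exogenous() for _ in range(n))]

def counterfactual_z(x_obs: float, y_obs: float, x_alt: float) -> float:
    """Counterfactual: given the observed (X, Y), what would Z have been if X had been x_alt?"""
    return f_Z(x_alt, y_obs)

if __name__ == "__main__":
    print("observations:", observe(3))
    print("do(X = 1):   ", intervene_do_x(1.0, 3))
    print("counterfactual Z had X been 0, given (X=1, Y=2):", counterfactual_z(1.0, 2.0, 0.0))
```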
Shared causal models increase trust among team
members, because they help to explain to each other
why certain actions are to be taken (Janssen et al.,
2022). Delegation without reason or motivation is not
trustworthy (unless the authority or reputation of the
delegator is very high). This enables users to better
understand the rationale and have greater confidence
in others making a fair decision. Causality can also
be seen as an enabler (or even a requirement) for ex-
plainable artificial intelligence (Carloni et al., 2023).
There are several important aspects by which
causality can improve the trustworthiness of AI
systems. Besides precision and accuracy, which
are fundamental to trustworthiness in AI, they are
(Greifeneder, 2021; Ganguly et al., 2023; Bartling
et al., 2018; Yap and Tomlinson, 2016):
Transparency & Interpretability. The reasoning
behind decisions is explainable and easily un-
derstood by humans. Causal models provide the
reasons for predictions and causal explanations
help to build a correct mental model of the
problem.
Reproducibility. The ability to repeat experiments
and get the same results increases the trustwor-
thiness and accuracy of scientific output.
Fairness. Causal AI can help to remove bias, because it understands how variables are interconnected and dependent on each other. Understanding causal relationships between sensitive input variables (such as gender or race) and predicted outcomes is important for assessing biased behaviour. Counterfactual fairness is achieved when the output remains identical under counterfactual changes of the sensitive input variables (see the sketch after this list).
Robustness. Causal models can avoid the brittleness
of most machine learning systems caused by spurious
correlations. They can handle data that is not in-
dependent and identically distributed (IID) or out
of distribution (OOD), because they can discern
between relevant and irrelevant data and variables
(Sherman and Shpitser, 2019; Zhang et al., 2023).
Privacy. The robustness of causal models helps in
preventing privacy attacks, because weaknesses of
trained models cannot easily be exploited, for ex-
ample in federated learning.
Safety & Accountability (Auditing). Regulations
for safe-guarding AI systems for use in critical
applications and domains demand impact as-
sessment (IA) to prevent algorithmic and
data-driven harm by finding potential negative
effects before (large-scale) deployment. Causal
models that represent dependencies between sys-
tem design and impact can be used to assess and
mitigate corresponding risks by identifying which
system elements are responsible for undesired
effects.
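As a hedged illustration of the counterfactual fairness check mentioned in the list above (the structural model, the sensitive attribute and the decision threshold are hypothetical), one can compare the factual decision with the decision obtained after counterfactually flipping the sensitive attribute:

```python
def structural_model(sensitive: int, qualification: float) -> float:
    """Hypothetical causal model of a score; a fair model would not use `sensitive`."""
    return 0.8 * qualification + 0.5 * sensitive  # the sensitive term makes it unfair

def decision(score: float, threshold: float = 5.0) -> bool:
    return score >= threshold

def counterfactually_fair(qualification: float) -> bool:
    """Check whether the decision is invariant under flipping the sensitive attribute."""
    factual = decision(structural_model(sensitive=1, qualification=qualification))
    counterfactual = decision(structural_model(sensitive=0, qualification=qualification))
    return factual == counterfactual

print(counterfactually_fair(qualification=5.8))  # False: the flip changes the outcome
print(counterfactually_fair(qualification=7.0))  # True: the outcome is unaffected
```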
Closely related to causality is understanding the an-
choring (grounding) of meanings in the real context.
A deep understanding of context and meaning re-
quires not only processing data, but also capturing the
real-world phenomena that the data represents, such
that predictions, decisions and actions are based on
them. This applies also to large language models, so
that statements are not only made based on statistical
probabilities, but in the knowledge of the concepts,
contexts, phenomena, and semantic and causal rela-
tionships grounded in reality (Searle, 1980; Harnad,
1990). Whether this knowledge necessarily requires
physical interaction of the agent with its environment
remains a subject of debate (Gärdenfors, 2019). Harnad argues for the need for sensations to induce and stimulate representations via distal objects – things that exist in the environment and emit signals that
can be perceived by means of a medium and means
of perception. Not all concepts are physical though,
which is a strength of abstract thinking, namely that
more abstract concepts can be formed, represented
and communicated from less abstract ones. Also, per-
ception is guided by intellect and constrained by the
available means of perception.
Layers of abstractions are fundamental for build-
ing rich architectures in software engineering and AI
systems are no exception. Semantic models, such
as ontologies (Fensel et al., 2001; Antoniou and
Van Harmelen, 2004), are representations of con-
cepts, their attributes and relationships, and, there-
fore, contribute to trustworthy AI systems by explain-
ing and constraining the meaning of those concepts.
The difference and close interaction between percep-
tion and reasoning on various levels of abstraction is
documented as System 1 and System 2 in (Kahneman,
2011).
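As a minimal sketch of how a semantic model can constrain the meaning of an ML output (the concept hierarchy and labels are hypothetical, a tiny stand-in for a proper ontology), a prediction is accepted only if the ontology confirms that it fits the expected category:

```python
# Tiny is-a hierarchy standing in for a semantic model (hypothetical concepts).
IS_A = {
    "poodle": "dog",
    "dog": "mammal",
    "mammal": "animal",
    "car": "vehicle",
    "vehicle": "artifact",
}

def is_subconcept(concept: str, ancestor: str) -> bool:
    """Walk the is-a chain of the ontology."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False

def constrained_label(ml_label: str, expected_category: str) -> str:
    """Accept an ML prediction only if the ontology says it fits the expected category."""
    if is_subconcept(ml_label, expected_category):
        return ml_label
    return f"rejected: '{ml_label}' is not a kind of '{expected_category}'"

print(constrained_label("poodle", "animal"))  # poodle
print(constrained_label("car", "animal"))     # rejected
```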
3.4 Modularity and Compositionality
One of the fundamental design principles of (soft-
ware) engineering is modularity. Modularity guar-
antees that complex systems are broken down into
understandable and manageable parts (functions and
features) and reliably assembled into system architec-
tures. This increases the reliability of the individ-
ual components and their assemblies as systems of
systems. It is much easier to verify smaller compo-
nents than big monolithic artefacts. The evolution in
software engineering from structured to modular and
object-oriented programming enabled the design and
construction of complex systems. In well-designed
systems the transitions between successive compo-
nents can be controlled and protected, making them
explainable such that errors can be detected effec-
tively. The pre- and post-conditions of each compo-
nent can be validated and orchestrated in increasingly
complex systems of systems.
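A hedged sketch of this principle in Python (component names and conditions are illustrative), in the spirit of design by contract: each component declares explicit pre- and post-conditions that are checked at every transition when components are composed into a system:

```python
from typing import Callable, List

class Component:
    """A module with an explicit contract: precondition and postcondition."""

    def __init__(self, name: str, run: Callable[[float], float],
                 pre: Callable[[float], bool], post: Callable[[float], bool]):
        self.name, self.run, self.pre, self.post = name, run, pre, post

    def __call__(self, x: float) -> float:
        assert self.pre(x), f"{self.name}: precondition violated for input {x}"
        y = self.run(x)
        assert self.post(y), f"{self.name}: postcondition violated for output {y}"
        return y

def compose(components: List[Component]) -> Callable[[float], float]:
    """Reliable composition: each transition between components is checked."""
    def pipeline(x: float) -> float:
        for c in components:
            x = c(x)
        return x
    return pipeline

normalise = Component("normalise", lambda x: x / 100.0,
                      pre=lambda x: 0.0 <= x <= 100.0,
                      post=lambda y: 0.0 <= y <= 1.0)
score = Component("score", lambda x: 1.0 - x,
                  pre=lambda x: 0.0 <= x <= 1.0,
                  post=lambda y: 0.0 <= y <= 1.0)

system = compose([normalise, score])
print(system(25.0))   # 0.75
# system(250.0) would fail fast at the 'normalise' precondition, localising the error.
```

Because each contract is explicit, a violation is detected at the boundary of the responsible component rather than surfacing as an unexplained error somewhere downstream.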
An important advantage of modular systems is
that compositional patterns of subsystems can be
identified and defined, which increases their reliabil-
ity and documentation through reuse (Gamma et al.,
1994; van Bekkum et al., 2021).
It is important to stress that software architectures
are not merely static artefacts, but they rely on the
interplay between structures and events – the organis-
ing principles and the dynamic evolution of complex
systems (Lévi-Strauss, 1962). Neither structure nor
events are meaningful on their own, but require and
depend on each other. In an extrapolated view, this re-
lationship may be applied to the combination of learn-
ing and reasoning. Meaning emerges from a system’s
structure and its components, when it is operated in a
dynamic context of perceiving and acting.
The principle of compositionality also applies
to knowledge models and languages (Tiddi et al.,
2023): larger constructs are created by joining to-
gether smaller units with specific, understandable,
and verifiable tasks. Abstract relationships can thus
be traced back to their components. These aspects are
applied when designing complex systems and should
also become a matter of course for AI systems.
3.5 Human Agency and Oversight
Human agency and oversight mean that in any case
a human should have the overview, final decision,
and responsibility for the actions of an AI system
(human empowerment). Even if many tasks are in-
creasingly being transferred to autonomous AI sys-
tems (agents), the principle that humans supervise,
assess, and approve actions still applies. Keeping in
mind the above-mentioned dilemma in building trust-
worthy AI systems, delegation of tasks needs to be
interpretable by both humans and (software) agents –
in particular, when humans and agents collaborate as
hybrid teams in a symbiotic partnership. It is neces-
sary that suitable task descriptions are handed over to
the agents and that they understand and execute them
in the relevant context, considering the models, expla-
nations and causal relationships explained above.
For collaborative decision-making (CDM), it is
essential that each human and agent is aware of each other's point of view and understands that others possess mental states that might differ from one's own – which is known as a Theory of Mind (ToM). ToM is
defined as the human cognitive ability to perceive and
interpret others in terms of their mental states, such as
beliefs, desires, goals, intentions and emotions, and
it is considered an indispensable requirement of hu-
man social life (Premack and Woodruff, 1978; Baron-
Cohen et al., 1985; Frith and Frith, 2005; Verbrugge
and Mol, 2008; Byom and Mutlu, 2013; Buehler and
Weisswange, 2020). Rather than reasoning only with
one’s own beliefs, desires, intentions, emotions, and
thoughts, a person or agent with the awareness of oth-
ers’ states of mind can consider different and mindful
acts, depending on a perceived context. This ability
allows them to more easily understand, predict, and
even manipulate the behaviour of others (Verbrugge,
2020).
When considering collaboration and competition in hybrid teams of humans and autonomous agents, we address many-to-many situations where
multiple humans and multiple agents form hybrid
teams. The purpose of the agents is to empower hu-
mans by providing their complementary capabili-
ties, such as fast and precise information exchange
and analysis of large data sets. Agents can play many
different roles, but the responsibility for decisions re-
mains, in principle, with humans, for example by ver-
ifying, validating and approving proposals for deci-
sions. An essential aspect of meaningful collabora-
tion is to make mutual assumptions and expectations
explicit, such that they can be used in deliberation and
communication. This is a prerequisite for appropriate
delegation of tasks and the accurate and concise de-
scriptions of their underlying intentions.
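As a hedged sketch of such explicit delegation with human oversight (the task structure, roles and fields are hypothetical), a task description with explicit assumptions is handed to an agent, and the agent's proposal only takes effect after a human reviews and approves it:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TaskDelegation:
    """Explicit task description handed over to an agent."""
    goal: str
    assumptions: List[str] = field(default_factory=list)  # made explicit for deliberation

@dataclass
class Proposal:
    action: str
    rationale: str      # the reason or motivation, without which delegation is not trustworthy
    approved: bool = False

def agent_propose(task: TaskDelegation) -> Proposal:
    """Stand-in for an autonomous agent analysing data and proposing a decision."""
    return Proposal(action=f"recommend option A for '{task.goal}'",
                    rationale="highest expected benefit under the stated assumptions")

def human_review(proposal: Proposal, accept: bool) -> Proposal:
    """The final decision and responsibility remain with the human."""
    proposal.approved = accept
    return proposal

task = TaskDelegation(goal="select a treatment plan",
                      assumptions=["patient data is current", "guidelines of 2024 apply"])
proposal = human_review(agent_propose(task), accept=True)
print(proposal)
```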
Instead of relying on AI systems to take over hu-
man activities, as some have predicted, it is better to
focus on how humans and machines can complement
each other’s strengths (Marcus, 2022). For example,
radiologists are still needed to interpret MRI images
(Chan and Siegel, 2019), but they will have to col-
laborate with AI systems and those systems need to
support the human collaborators by providing insight
into their decision-making process. Therefore, a new
approach of hybrid or neuro-symbolic AI is neces-
sary for creating trustworthiness (Marcus and Davis,
2019).
Trustworthiness in interacting with artificially in-
telligent systems emerges from experience and as a
combination of various properties, such as fairness,
robustness, transparency, verification, and accuracy
(Harbers et al., 2008). AI systems are trusted when
we have confidence in the decisions that they take,
i.e. when we understand why they are made (Rudin,
2019), even when we disagree.
In a community with trustworthy interactions, it is
crucial to establish and enforce social norms (Haynes et al., 2017; Emelin et al., 2020; Jiang et al., 2021; Savarimuthu et al., 2008). Such
norms can be of a generic nature or valid only within certain communities or teams; they specify transparently what behaviour is expected, what is allowed or forbidden, and what the consequences are in case of
violations. In addition, knowledge and intentions, but
also norms, can change and need to be adapted in due
course. Otherwise, such systems and interactions can-
not be trusted any longer.
4 CONCLUSIONS
As the field of Artificial Intelligence is still, and again,
facing tremendous and overwhelming changes and
progress, there is a strong and quickly growing need
for trust in AI systems. The goal of Trust by De-
sign is proposed to be based on the four engineering
principles of (i) Models & Explanations, (ii) Causal-
ity & Grounding, (iii) Modularity & Compositional-
ity, and (iv) Human Agency & Oversight. Our in-
tention is to develop the insights above further into
practical methods and tools to benefit the AI com-
munity and its users. The Boxology in (van Bekkum
et al., 2021) provides a stepping stone to further de-
velop trustworthy AI engineering methods, based on
neuro-symbolic and causal AI.
ACKNOWLEDGEMENTS
This research was partially supported by TAILOR, a
project funded by EU Horizon 2020 research and in-
novation programme under GA No 952215.
REFERENCES
Antoniou, G. and Van Harmelen, F. (2004). A semantic web
primer. MIT press.
Bader, S. and Hitzler, P. (2005). Dimensions of Neural-
symbolic Integration - A Structured Survey.
Baron-Cohen, S., Leslie, A. M., and Frith, U. (1985). Does
the autistic child have a “theory of mind”? Cognition,
21(1):37–46.
Bartling, B., Fehr, E., Huffman, D., and Netzer, N. (2018).
The Causal Effect of Trust.
Buehler, M. C. and Weisswange, T. H. (2020). Theory of
Mind based Communication for Human Agent Coop-
eration. In 2020 IEEE International Conference on
Human-Machine Systems (ICHMS), pages 1–6.
Byom, L. and Mutlu, B. (2013). Theory of mind: mecha-
nisms, methods, and new directions. Frontiers in Hu-
man Neuroscience, 7.
Carloni, G., Berti, A., and Colantonio, S. (2023). The
role of causality in explainable artificial intelligence.
arXiv:2309.09901 [cs].
Chan, S. and Siegel, E. L. (2019). Will machine learn-
ing end the viability of radiology as a thriving med-
ical specialty? The British Journal of Radiology,
92(1094):20180416.
Deng, L. and Yu, D. (2014). Deep Learning: Methods
and Applications. Foundations and Trends® in Signal
Processing, 7(3–4):197–387. Publisher: Now Pub-
lishers, Inc.
Eberhardt, F. (2017). Introduction to the foundations of
causal discovery. International Journal of Data Sci-
ence and Analytics, 3(2):81–91.
Emelin, D., Bras, R. L., Hwang, J. D., Forbes, M., and
Choi, Y. (2020). Moral Stories: Situated Reason-
ing about Norms, Intents, Actions, and their Conse-
quences. arXiv:2012.15738 [cs]. arXiv: 2012.15738.
Fensel, D., van Harmelen, F., Horrocks, I., McGuinness, D.,
and Patel-Schneider, P. (2001). OIL: an ontology in-
frastructure for the Semantic Web. IEEE Intelligent
Systems, 16(2):38–45. Conference Name: IEEE Intel-
ligent Systems.
Frith, C. and Frith, U. (2005). Theory of mind. Current
Biology, 15(17):R644–R645. Publisher: Elsevier.
Gamma, E., Helm, R., Johnson, R., Vlissides, J., and
Booch, G. (1994). Design Patterns: Elements of
Reusable Object-Oriented Software. Addison-Wesley
Professional, Reading, Mass, 1st edition edition.
Ganguly, N., Fazlija, D., Badar, M., Fisichella, M., Sikdar,
S., Schrader, J., Wallat, J., Rudra, K., Koubarakis, M.,
Patro, G. K., Amri, W. Z. E., and Nejdl, W. (2023). A
Review of the Role of Causality in Developing Trust-
worthy AI Systems. arXiv:2302.06975 [cs].
Garcez, A. d., Broda, K. B., and Gabbay, D. M. (2002a).
Neural-Symbolic Integration: The Road Ahead. In
Garcez, A. d., Broda, K. B., and Gabbay, D. M.,
editors, Neural-Symbolic Learning Systems: Founda-
tions and Applications, Perspectives in Neural Com-
puting, pages 235–252. Springer, London.
Garcez, A. d., Lamb, L., and Gabbay, D. (2009). Neural-
Symbolic Cognitive Reasoning. Springer, Berlin, Hei-
delberg.
Garcez, A. d. and Lamb, L. C. (2023). Neurosymbolic
AI: the 3rd wave. Artificial Intelligence Review,
56(11):12387–12406.
Garcez, A. S. d., Gabbay, D. M., and Broda, K. B. (2002b).
Neural-Symbolic Learning Systems: Foundations and
Applications. Springer-Verlag, Berlin, Heidelberg.
Gärdenfors, P. (2019). From Sensations to Concepts: a Proposal for Two Learning Processes. Review of Philosophy and Psychology, 10(3):441–464.
Greifeneder, B. (2021). Three Ways A Causal Approach
Can Improve Trust In AI. Section: Innovation.
Harbers, M., Verbrugge, R., Sierra, C., and Debenham, J.
(2008). The Examination of an Information-Based
Approach to Trust. In Sichman, J. S., Padget, J., Os-
sowski, S., and Noriega, P., editors, Coordination, Or-
ganizations, Institutions, and Norms in Agent Systems
III, Lecture Notes in Computer Science, pages 71–82,
Berlin, Heidelberg. Springer.
Harnad, S. (1990). The symbol grounding problem. Physica
D: Nonlinear Phenomena, 42(1):335–346.
Haynes, C., Luck, M., McBurney, P., Mahmoud, S., Vítek,
T., and Miles, S. (2017). Engineering the emergence
of norms: a review. The Knowledge Engineering Re-
view, 32. Publisher: Cambridge University Press.
Ismael, J. (2023). Reflections on the asymmetry of cau-
sation. Interface Focus, 13(3):20220081. Publisher:
Royal Society.
Janssen, S., Sharpanskykh, A., and Mohammadi Ziabari,
S. S. (2022). Using Causal Discovery to Design
Agent-Based Models. In Van Dam, K. H. and Ver-
staevel, N., editors, Multi-Agent-Based Simulation
XXII, Lecture Notes in Computer Science, pages 15–
28, Cham. Springer International Publishing.
Jiang, L., Hwang, J. D., Bhagavatula, C., Bras, R. L.,
Forbes, M., Borchardt, J., Liang, J., Etzioni, O., Sap,
M., and Choi, Y. (2021). Delphi: Towards Machine
Ethics and Norms. arXiv:2110.07574 [cs]. arXiv:
2110.07574.
Kahneman, D. (2011). Thinking, Fast and Slow. Farrar,
Straus and Giroux, New York, 1st edition edition.
Kutach, D. (2013). Causal Asymmetry. In Kutach, D., ed-
itor, Causation and its Basis in Fundamental Physics,
page 0. Oxford University Press.
Kyono, T. and van der Schaar, M. (2019). Improv-
ing Model Robustness Using Causal Knowledge.
arXiv:1911.12441 [cs, stat].
Lake, B. M., Ullman, T. D., Tenenbaum, J. B., and Ger-
shman, S. J. (2017). Building machines that learn
and think like people. Behavioral and Brain Sciences,
40:e253. Publisher: Cambridge University Press.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learn-
ing. Nature, 521(7553):436–444. Number: 7553 Pub-
lisher: Nature Publishing Group.
Lévi-Strauss, C. (1962). La pensée sauvage. Plon. Google-Books-ID: OoEeAAAAIAAJ.
Marcus, G. (2022). Deep Learning Is Hitting a Wall.
Marcus, G. and Davis, E. (2019). Rebooting AI: Building
Artificial Intelligence We Can Trust. Vintage.
Meyer, B. (1997). Object-Oriented Software Construction.
Prentice Hall, Upper Saddle River, NJ, 2 edition.
Morocho-Cayamcela, M. E., Lee, H., and Lim, W. (2019).
Machine Learning for 5G/B5G Mobile and Wireless
Communications: Potential, Limitations, and Future
Directions. IEEE Access, 7:137184–137206. Confer-
ence Name: IEEE Access.
Pearl, J. (2010). An Introduction to Causal Inference. The
International Journal of Biostatistics, 6(2). Publisher:
De Gruyter.
Pearl, J., Glymour, M., and Jewell, N. P. (2016). Causal
Inference in Statistics: A Primer. John Wiley & Sons.
Google-Books-ID: I0V2CwAAQBAJ.
Pearl, J. and Mackenzie, D. (2018). The Book of Why: The
New Science of Cause and Effect. Basic Books, New
York, 1st edition edition.
Premack, D. and Woodruff, G. (1978). Does the chim-
panzee have a theory of mind? Behavioral and Brain
Sciences, 1(4):515–526. Publisher: Cambridge Uni-
versity Press.
Price, H. (1992). Agency and Causal Asymmetry. Mind,
101(403):501–520. Publisher: [Oxford University
Press, Mind Association].
Ramchurn, S. D., Stein, S., and Jennings, N. R.
(2021). Trustworthy human-AI partnerships. iScience,
24(8):102891.
Rawal, A., Raglin, A., Sadler, B. M., and Rawat, D. B.
(2023). Explainability and causality for robust, fair,
and trustworthy artificial reasoning. In Artificial Intel-
ligence and Machine Learning for Multi-Domain Op-
erations Applications V, volume 12538, pages 493–
500. SPIE.
Rudin, C. (2019). Stop Explaining Black Box Machine
Learning Models for High Stakes Decisions and Use
Interpretable Models Instead. arXiv:1811.10154 [cs,
stat]. arXiv: 1811.10154.
Savarimuthu, B. T. R., Cranefield, S., Purvis, M., and
Purvis, M. (2008). Role Model Based Mechanism
for Norm Emergence in Artificial Agent Societies. In
Sichman, J. S., Padget, J., Ossowski, S., and Nor-
iega, P., editors, Coordination, Organizations, Institu-
tions, and Norms in Agent Systems III, Lecture Notes
in Computer Science, pages 203–217, Berlin, Heidel-
berg. Springer.
Schölkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., and Bengio, Y. (2021). Toward Causal Representation Learning. Proceedings of the IEEE, 109(5):612–634. Conference Name: Proceedings of the IEEE.
Schölkopf, B. and von Kügelgen, J. (2022). From Statistical to Causal Learning. arXiv:2204.00607 [cs, stat].
Searle, J. R. (1980). Minds, brains, and programs. Behav-
ioral and Brain Sciences, 3(3):417–424. Publisher:
Cambridge University Press.
Sherman, E. and Shpitser, I. (2019). Identification and
Estimation of Causal Effects from Dependent Data.
arXiv:1902.01443 [stat].
Thiebes, S., Lins, S., and Sunyaev, A. (2021). Trustworthy
artificial intelligence. Electronic Markets, 31(2):447–
464.
Tiddi, I., De Boer, V., Schlobach, S., and Meyer-Vitali, A.
(2023). Knowledge Engineering for Hybrid Intelli-
gence. In Proceedings of the 12th Knowledge Cap-
ture Conference 2023, K-CAP ’23, pages 75–82, New
York, NY, USA. Association for Computing Machin-
ery.
van Bekkum, M., de Boer, M., van Harmelen, F., Meyer-
Vitali, A., and Teije, A. t. (2021). Modular design pat-
terns for hybrid learning and reasoning systems. Ap-
plied Intelligence, 51(9):6528–6546.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, L., and Polosukhin, I.
(2017). Attention is All you Need. In Advances in
Neural Information Processing Systems, volume 30.
Curran Associates, Inc.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, L., and Polosukhin, I.
(2023). Attention Is All You Need. arXiv:1706.03762
[cs].
Verbrugge, R. (2020). Testing and Training Theory of
Mind for Hybrid Human-agent Environments. In
Rocha, A. P., Steels, L., and Herik, H. J. v. d., edi-
tors, Proceedings of the 12th International Conference
on Agents and Artificial Intelligence, ICAART 2020,
Volume 1, Valletta, Malta, February 22-24, 2020,
page 11. SCITEPRESS.
Verbrugge, R. and Mol, L. (2008). Learning to Apply The-
ory of Mind. Journal of Logic, Language and Infor-
mation, 17(4):489–511.
Vlontzos, A., Kainz, B., and Gilligan-Lee, C. M. (2023).
Estimating categorical counterfactuals via deep twin
networks. Nature Machine Intelligence, 5(2):159–
168.
Yap, J. Y. and Tomlinson, A. (2016). A Causality-Based
Model for Describing the Trustworthiness of a Com-
puting Device. In Yung, M., Zhang, J., and Yang,
Z., editors, Trusted Systems, Lecture Notes in Com-
puter Science, pages 130–149, Cham. Springer Inter-
national Publishing.
Yu, D., Yang, B., Liu, D., Wang, H., and Pan, S. (2021). A
Survey on Neural-symbolic Learning Systems.
Zhang, C., Mohan, K., and Pearl, J. (2023). Causal In-
ference under Interference and Model Uncertainty.
In Proceedings of the Second Conference on Causal
Learning and Reasoning, pages 371–385. PMLR.
ISSN: 2640-3498.
Zhang, C., Zhang, K., and Li, Y. (2020). A Causal View
on Robustness of Neural Networks. In Advances in
Neural Information Processing Systems, volume 33,
pages 289–301. Curran Associates, Inc.
Zhu, S., Ng, I., and Chen, Z. (2020). Causal Discovery with
Reinforcement Learning. In International Conference
on Learning Representations, Online.