Towards Modelling and Verification of Social Explainable AI
Damian Kurpiewski 1,2,a, Wojciech Jamroga 1,3,b and Teofil Sidoruk 1,4,c
1 Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
2 Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
3 Interdisciplinary Centre for Security, Reliability, and Trust, SnT, University of Luxembourg, Luxembourg
4 Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
a https://orcid.org/0000-0002-9427-2909
b https://orcid.org/0000-0001-6340-8845
c https://orcid.org/0000-0002-4393-3447
Keywords:
Multi-Agent Systems, Formal Verification, Social Explainable AI, Strategic Ability, Model Checking.
Abstract:
Social Explainable AI (SAI) is a new direction in artificial intelligence that emphasises decentralisation, trans-
parency, social context, and focus on the human users. SAI research is still at an early stage. Consequently,
it concentrates on delivering the intended functionalities, but largely ignores the possibility of unwelcome be-
haviours due to malicious or erroneous activity. We propose that, in order to capture the breadth of relevant
aspects, one can use models and logics of strategic ability that have been developed in multi-agent systems.
Using the STV model checker, we take the first step towards the formal modelling and verification of SAI
environments, in particular of their resistance to various types of attacks by compromised AI modules.
1 INTRODUCTION
Elements of artificial intelligence have become ubiq-
uitous in daily life, being involved in social media,
car navigation, recommender algorithms for music
and films, and so on. They also provide back-end
solutions to many business processes, resulting in a
huge societal and economic impact. The idea of
Social Explainable AI (SAI) represents an interesting
new direction in artificial intelligence, which empha-
sises decentralisation, human-centricity, and explain-
ability (Social Explainable AI, CHIST-ERA, 2021–24; Con-
tucci et al., 2022). This is in line with the trend to
move away from classical, centralised machine learn-
ing, not only for purely technical reasons such as scal-
ability constraints, but also to meet the growing ethi-
cal expectations regarding transparency and trustwor-
thiness of data storage and computation (Drainakis
et al., 2020; Ottun et al., 2022). The aim is also to
put the human back in the spotlight, rather than con-
centrate on the technological infrastructure (Conti and
Passarella, 2018; Toprak et al., 2021; Fuchs et al.,
2022).
SAI is a new concept, and a subject of ongoing
research. It still remains to be seen if it delivers AI
solutions that are effective, transparent, and mindful
of the user. To this end, it should be extensively
studied not only in the context of its intended properties,
but also with regard to the possible side effects of interaction
that involves AI components and human users in a
complex environment. In particular, we should care-
fully analyse the possibilities of adversarial misuse
and abuse of the interaction, e.g., by means of im-
personation or man-in-the-middle attacks (Dolev and
Yao, 1983; Gollmann, 2011). In those scenarios, one
or more nodes of the interaction network are taken
over by a malicious party that tries to disrupt com-
munication, corrupt data, and/or spread false infor-
mation. Clearly, the design of Social AI must be re-
sistant to such abuse; otherwise it will be sooner or
later exploited. While the topic of adversarial attacks
on machine learning algorithms has recently become
popular (Goodfellow et al., 2018; Kianpour and Wen,
2019; Kumar et al., 2020), the research on SAI has so
far focused only on its expected functionalities. This
is probably because SAI communities are bound to be
conceptually, computationally, and socially complex.
A comprehensive study of their possible unintended
behaviors is a highly challenging task.
Here, we propose that formal methods for multi-
agent systems (Weiss, 1999; Shoham and Leyton-
Brown, 2009) provide a good framework for multi-
faceted analysis of Social Explainable AI. Moreover,
we put forward a new methodology for such studies,
based on the following hypotheses:
1. It is essential to formalise and evaluate multi-
agent properties of SAI environments. In particu-
lar, we must look at the properties of interaction
between SAI components that go beyond joint,
fully orchestrated action towards a common pre-
defined goal. This may include various relevant
functionality and safety requirements. In partic-
ular, we should assess the impact of adversarial
play on these requirements.
2. Many of those properties are underpinned by
strategic ability of agents and their groups to
achieve their goals (Pauly, 2002; Alur et al., 2002;
Bulling et al., 2015). In particular, many func-
tionality properties refer to the ability of legiti-
mate users to complete their selected tasks. Con-
versely, safety and security requirements can often
be phrased in terms of the inability of the “bad
guys” to disrupt the behavior of the system.
3. Model checking (Clarke et al., 2018) provides
a well-defined formal framework for the analy-
sis. Moreover, existing model checking tools for
multi-agent systems, such as MCMAS (Lomuscio
et al., 2017) and STV (Kurpiewski et al., 2021)
can be used to formally model, visualise, and
analyse SAI designs with respect to the relevant
properties.
4. Conversely, SAI can be used as a testbed for
cutting-edge methods of model checking and their
implementations.
In the rest of this paper, we make the first step
towards formal modelling, specification, and verifi-
cation of SAI. We model SAI by means of asyn-
chronous multi-agent systems (AMAS) (Jamroga et al.,
2020), and formalise their properties using formulas
of temporal-strategic logic ATL* (Alur et al., 2002;
Schobbens, 2004). For instance, one can specify that
a malicious AI component can ensure that the remain-
ing components will never be able to build a global
model of desired quality, even if they all work to-
gether against the rogue component. Alternatively,
strategies of the “good” modules can be considered,
in order to check whether a certain threshold of non-
compromised agents is sufficient to prevent a spe-
cific type of attack. Finally, we use the STV model
checker (Kurpiewski et al., 2021) to verify the for-
malised properties against the constructed models.
The verification is done by means of the technique of
fixpoint approximation, proposed and studied in (Jam-
roga et al., 2019).
Note that this study does not aim at focused in-
depth verification of a specific machine learning pro-
cedure, like in (Wu et al., 2020; Batten et al., 2021;
Kouvaros and Lomuscio, 2021; Akintunde et al.,
2022). Our goal is to represent and analyse a broad
spectrum of interactions, possibly at the price of ab-
straction that leaves many details out of the formal
model.
The ideas reported here are still work in progress,
and the results should be treated as preliminary.
2 SOCIAL EXPLAINABLE AI
The framework of Social Explainable AI or SAI (Social
Explainable AI, CHIST-ERA, 2021–24; Contucci et al.,
2022; Fuchs et al., 2022) aims to address several
drawbacks inherent to the currently dominant AI
paradigm. In particular, state of the art machine learn-
ing (ML)-based AI systems are typically centralised.
The sheer scale of Big Data collections, as well as
the complexity of deep neural networks that process
them, means that these AI systems effectively act as
opaque black boxes, non-interpretable even for ex-
perts. This naturally raises issues of privacy and trust-
worthiness, further exacerbated by the fact that stor-
ing an ever-increasing amount of sensitive data in a
single, central location might eventually become un-
feasible, also for non-technical reasons such as local
regulations regarding data ownership.
In contrast, SAI envisions novel ML-based AI sys-
tems with a focus on the following aspects:
Individuation: a “Personal AI Valet” (PAIV) asso-
ciated with each individual, acting as their proxy
in a complex ecosystem of interacting PAIVs;
Personalisation: processing data by PAIVs via ex-
plainable AI models tailored to the specific char-
acteristics of individuals;
Purposeful interaction: PAIVs build global AI
models or make collective decisions starting from
the local models by interacting with one another;
Human-centricity: AI algorithms and PAIV inter-
actions driven by quantifiable models of the indi-
vidual and social behaviour of their human users;
Explainability by design: extending ML tech-
niques through quantifiable human behavioural
models and network science analysis.
The current attempts at building SAI use gossip
learning as the ML regime for PAIVs (Social AI
gossiping. Micro-project in Humane-AI-Net, 2022;
Hegedüs et al., 2019; Hegedüs et al., 2021). An exper-
imental simulation tool to assess the effectiveness of
the process and functionality of the resulting AI com-
ponents is available in (Lorenzo et al., 2022). In this
paper, we take a different path, and focus on the multi-
agent interaction in the learning process. We model
the network of PAIVs as an asynchronous multi-agent
system (AMAS), and formalise its properties as formu-
las of alternating-time temporal logic (ATL*). The
formal framework is introduced in Section 3. In Sec-
tion 4, we present preliminary multi-agent models of
SAI, and show several attacks that can be modelled
that way. In Section 5, we formalise several proper-
ties and conduct model checking experiments.
3 WHAT AGENTS CAN ACHIEVE
In this section, we introduce the formalism of Asyn-
chronous Multi-agent Systems (AMAS) (Jamroga
et al., 2020; Jamroga et al., 2021), as well as the
syntax and semantics of Alternating-time Temporal
Logic ATL* (Alur et al., 2002; Schobbens, 2004),
which allows for specifying relevant properties of SAI
models, in particular the strategic ability of agents to
enforce a goal.
3.1 Asynchronous MAS
AMAS can be thought of as networks of automata,
where each component corresponds to a single agent.
Definition 1 (AMAS (Jamroga et al., 2021)). An
asynchronous multi-agent system (AMAS) consists of
$n$ agents $\mathcal{A} = \{1, \ldots, n\}$, each associated with a 7-tuple
$A_i = (L_i, \iota_i, Evt_i, R_i, T_i, PV_i, V_i)$, where:
$L_i = \{l_i^1, \ldots, l_i^{n_i}\} \neq \emptyset$ is a finite set of local states;
$\iota_i \in L_i$ is an initial local state;
$Evt_i = \{e_i^1, \ldots, e_i^{m_i}\} \neq \emptyset$ is a finite set of events;
$R_i : L_i \to 2^{2^{Evt_i}} \setminus \{\emptyset\}$ is a repertoire of choices, assigning available subsets of events to local states;
$T_i : L_i \times Evt_i \rightharpoonup L_i$ is a (partial) local transition function that indicates the result of executing event $e$ in state $l$ from the perspective of agent $i$; $T_i(l_i, e)$ is defined iff $e \in \bigcup R_i(l_i)$;
$PV_i$ is a set of the agent's local propositions, with $PV_j, PV_k$ (for $j \neq k \in \mathcal{A}$) assumed to be disjoint;
$V_i : L_i \to \mathcal{P}(PV_i)$ is a valuation function.
Furthermore, we denote:
by $Evt = \bigcup_{i \in \mathcal{A}} Evt_i$ the set of all events;
by $L = \bigcup_{i \in \mathcal{A}} L_i$ the set of all local states;
by $Agent(e) = \{i \in \mathcal{A} \mid e \in Evt_i\}$ the set of all agents which have event $e$ in their repertoires;
by $PV = \bigcup_{i \in \mathcal{A}} PV_i$ the set of all local propositions.
The model of an AMAS provides its execution
semantics with asynchronous interleaving of private
events and synchronisation on shared ones.
Definition 2 (Model (Jamroga et al., 2021)). The
model of an AMAS is a 5-tuple $M = (\mathcal{A}, S, \iota, T, V)$,
where:
$\mathcal{A}$ is the set of agents;
$S \subseteq L_1 \times \ldots \times L_n$ is the set of global states, including all states reachable from $\iota$ by $T$ (see below);
$\iota = (\iota_1, \ldots, \iota_n) \in S$ is the initial global state;
$T : S \times (Evt \cup \{\varepsilon\}) \rightharpoonup S$ is the (partial) global transition function, defined by $T(s_1, e) = s_2$ iff $T_i(s_1^i, e) = s_2^i$ for all $i \in Agent(e)$ and $s_1^i = s_2^i$ for all $i \in \mathcal{A} \setminus Agent(e)$, where $s_j^i \in L_i$ is agent $i$'s local component of $s_j$. Moreover, $T(s, \varepsilon) = s$ iff there are events $e_1, \ldots, e_n$ s.t. $T_i(s^i, e_i)$ is defined but none of $e_1, \ldots, e_n$ is selected by all its owners;
$V : S \to 2^{PV}$ is the global valuation function, defined as $V(l_1, \ldots, l_n) = \bigcup_{i \in \mathcal{A}} V_i(l_i)$.
3.2 Strategic Ability
Linear and branching-time temporal logics, such as
LTL and CTL* (Emerson, 1990), have long been
used in formal verification. They make it possible to express
properties about how the state of the system will (or
should) evolve over time. However, in systems that
involve autonomous agents, whether representing hu-
man users or AI components, it is usually of interest
who can direct the system's evolution in a particular way.
ATL* (Alur et al., 2002) extends temporal log-
ics with strategic modalities that allow for reasoning
about such properties. The operator $\langle\langle A\rangle\rangle\gamma$ says that
agents in group (coalition) $A$ have a strategy to en-
force property $\gamma$. That is, as long as agents in $A$ select
events according to the strategy, $\gamma$ will hold no matter
what the other agents do. ATL* has been one of the
most important and popular agent logics in the last 25
years.
Definition 3 (Syntax of ATL*). The language of
ATL* is defined by the grammar:
$\varphi ::= p \mid \neg\varphi \mid \varphi \wedge \varphi \mid \langle\langle A\rangle\rangle\gamma$,
$\gamma ::= \varphi \mid \neg\gamma \mid \gamma \wedge \gamma \mid X\,\gamma \mid \gamma\, U\, \gamma$,
where $p \in PV$ and $A \subseteq \mathcal{A}$. The definitions of Boolean
connectives and temporal operators $X$ (“next”) and
$U$ (“strong until”) are standard; remaining operators
$R$ (“release”), $G$ (“always”), and $F$ (“sometime”)
can be derived as usual.
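For completeness, the derived operators can be introduced in the standard way:
\[
F\,\gamma \equiv \top\, U\, \gamma, \qquad
G\,\gamma \equiv \neg F \neg\gamma, \qquad
\gamma_1\, R\, \gamma_2 \equiv \neg(\neg\gamma_1\, U\, \neg\gamma_2).
\]
For instance, $\langle\langle\{1,2\}\rangle\rangle F\, p$ (with $p$ an illustrative proposition) states that agents 1 and 2 have a joint strategy which guarantees that $p$ eventually holds.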
Various types of strategies can be defined, based
on the state information and memory of past states
available to agents (Schobbens, 2004). In this work,
we focus on imperfect information, imperfect recall
strategies.
Definition 4 (Strategy). A memoryless imperfect in-
formation strategy for agent $i \in \mathcal{A}$ is a function
$\sigma_i : L_i \to 2^{Evt_i} \setminus \{\emptyset\}$ such that $\sigma_i(l) \in R_i(l)$ for each local
state $l \in L_i$. A joint strategy $\sigma_A$ of coalition $A \subseteq \mathcal{A}$ is
a tuple of strategies $\sigma_i$, one for each agent $i \in A$.
The outcome set of a strategy collects all the
execution paths consistent with the strategy. For-
mally, $out_M(s, \sigma_A)$ collects all the infinite sequences
of states, starting from $s$, that may occur when the
coalition follows strategy $\sigma_A$ while the opponents
choose freely from their repertoires. We use the so-
called opponent-reactive outcome, where the oppo-
nents are assumed to respond with matching synchro-
nization events if such responses are available. The
interested reader is referred to (Jamroga et al., 2021;
Kurpiewski et al., 2022) for the discussion and tech-
nical details.
Definition 5 (Asynchronous semantics of ATL*
(Jamroga et al., 2020)). The asynchronous semantics
of the strategic modality in ATL* is defined by the
following clause:
$M, s \models \langle\langle A\rangle\rangle\gamma$ iff there is a strategy $\sigma_A$ such that
$out_M(s, \sigma_A) \neq \emptyset$ and, for each path $\pi \in out_M(s, \sigma_A)$,
we have $M, \pi \models \gamma$.
The remaining clauses for temporal operators and
Boolean connectives are standard, see (Emerson,
1990).
4 MODELS
The first step towards the verification of the interac-
tion between agents in Social Explainable AI is a thor-
ough and detailed analysis of the underlying protocol.
We begin by looking into the actions performed and
the messages exchanged by the machines that take
part in the learning phase. Then, we can start design-
ing multi-agent models. Usually, such systems are too
complex to be modelled as they are. In that case, we
create an abstract view of the system.
4.1 Agents
In this work, we focus on the learning phase of the
SAI protocol. We model each machine equipped with
an AI module as a separate agent. The local model of
an AI agent consists of three phases: the data gather-
ing phase, the learning phase and the sharing phase.
Data Gathering Phase. In this phase, the agent is
able to gather the data required for the learning phase.
The corresponding action can be performed multiple
times, each time increasing the local variable that rep-
resents the amount of gathered data. At the end of the
phase, the amount of gathered data is analysed and,
depending on the exact value, the agent’s prepara-
tion is marked as incomplete, complete, or excessive.
From this, the agent proceeds to the learning phase.
Learning Phase. Here, the agent can use the previ-
ously gathered data to train its local AI model. The
effectiveness of the training depends on the amount
of gathered data. Excessive data means that the model
can be easily overtrained, while insufficient data may
lead to more iterations required to properly train the
model. The training action can be performed multiple
times, each time increasing the local variable related
to the quality of the model. At the end of this phase,
the internal AI model can be overtrained, undertrained,
or properly trained. After this phase, the agent is re-
quired to share its model with other agents.
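The following sketch abstracts the gathering and learning phases just described; the thresholds mirror the description above, while the exact quality updates, function names, and the constants INCOMPLETE, COMPLETE, EXCESSIVE are our assumptions rather than the STV model.

# Illustrative sketch of the gathering and learning phases (an abstraction, not
# the STV model): data below 1 unit is incomplete, between 1 and 2 is complete,
# 2 or more is excessive; training with complete data improves the model, while
# incomplete or excessive data degrades it (the exact updates are assumptions).

INCOMPLETE, COMPLETE, EXCESSIVE = 1, 2, 3

def gathering_phase(rounds):
    """Gather data for a number of rounds and classify the agent's preparation."""
    data = rounds                          # each gathering action adds one unit
    if data < 1:
        return INCOMPLETE
    if data < 2:
        return COMPLETE
    return EXCESSIVE

def learning_phase(completion, quality, iterations):
    """Train the local model; the effect depends on the gathered data."""
    for _ in range(iterations):
        if completion == COMPLETE:
            quality += 1                   # enough data: training helps
        else:
            quality = max(0, quality - 1)  # too little or too much data hurts
    return quality

print(learning_phase(gathering_phase(rounds=1), quality=0, iterations=2))  # -> 2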
Sharing Phase. Agents share their local AI mod-
els with each other following a simple sharing pro-
tocol. The protocol is based on packet traversal in a
ring topology. Each agent receives the model from
the agent with the previous ID and sends its model to
the agent with the next ID, while the last agent shares its
model with the first agent to close the ring. In order
to avoid deadlocks, each agent with an odd ID first
receives the model and then sends its own, and each
agent with an even ID first sends its own model and then
receives the model from the agent before it.
When receiving the model, the agent can either
accept it or reject it, and its decision is based on the
quality of the model being shared. After accepting
the model, the agent merges it with its own and the
resulting model quality is the maximum of the two.
After the sharing phase, the agent can go back to
the learning phase to further train its model.
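A simplified sequential rendering of one sharing round might look as follows (an abstraction under assumptions: the acceptance threshold accept_threshold and the function sharing_round are illustrative, and the deadlock-avoiding odd/even ordering is unnecessary in a sequential simulation).

# Illustrative sketch of one sharing round on the ring (an abstraction, not the
# STV model): every agent offers its model to the agent with the next ID, the
# last agent closes the ring, the receiver accepts a model whose quality passes
# a threshold, and merging keeps the maximum quality.

def sharing_round(quality, accept_threshold=1):
    """quality: dict mapping agent ID (1..n) to the quality of its local model."""
    n = len(quality)
    offered = dict(quality)                      # snapshot of the models being sent
    for sender in range(1, n + 1):
        receiver = sender % n + 1                # next ID; agent n sends to agent 1
        if offered[sender] >= accept_threshold:  # the receiver accepts or rejects
            quality[receiver] = max(quality[receiver], offered[sender])
    return quality

print(sharing_round({1: 2, 2: 0, 3: 1}))         # -> {1: 2, 2: 2, 3: 1}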
To formalize the details of the procedure, we have
utilised the open-source experimental model checker
STV (Kurpiewski et al., 2021), which was used, e.g.,
to model and verify the real-world voting protocol Se-
lene (Kurpiewski et al., 2022). Figure 1 presents a
detailed representation of an honest AI component as
an AMAS (left) and the STV code specifying its be-
havior (right). Figure 2 shows the visualization of the
component, produced by the tool.
4.2 Attacks
The model described in Section 4.1 reflects the ideal
scenario in which each agent is honest and directly
follows the protocol. Of course, this is not always the
case. A machine can malfunction, and take actions
not permitted by the protocol. Also, an agent can be
infected by malicious software, and function improp-
erly. This leads to two possible attack scenarios: the
man-in-the-middle attack and the impersonator attack.
Man in the Middle. Assume the existence of an-
other, dishonest agent, called the intruder. This agent
does not participate in the data gathering and learning
phases, but it is particularly interested in the sharing
Figure 1: Honest AI agent: AMAS (left) and its specification in STV (right). The automaton passes through the gathering, learning, and sharing phases described above; its guarded transitions update the agent's local variables for the amount of gathered data, the data completion level, and the status and quality of the local AI model.
Figure 2: Visualization of honest AI in STV.
phase. The intruder can intercept any model that is
being sent by one of the honest agents and then pass
it to any other agent. The STV code for the man-in-
the-middle attacker is presented in Figure 3, and its
graphical representation in Figure 4.
Agent Mim:
init: start
shared share_1_with_mim: start -> start [Mim_model_quality = AI1_model_quality]
shared share_mim_with_1: start -> start
shared share_2_with_mim: start -> start [Mim_model_quality = AI2_model_quality]
shared share_mim_with_2: start -> start
Figure 3: Specification of the Man in the Middle agent.
Impersonator. In this scenario, one of the AI agents
is infected with malicious code that results in un-
wanted behavior. The agent cannot participate in the
data gathering or learning phases, but can share its
model with others following the sharing protocol. The
difference between the honest agent and the imper-
sonator is that the latter can fake the quality of its local
AI model, hence tricking the next agent into accepting
it. The STV code and its visualization for Imperson-
ator are presented in Figures 5 and 6.
ICAART 2023 - 15th International Conference on Agents and Artificial Intelligence
400
Figure 4: Graphical representation of Man in the Middle.
Agent AI2:
init: start
set_quality0: start -> set_quality [AI2_model_quality = 0]
set_quality1: start -> set_quality [AI2_model_quality = 1]
set_quality2: start -> set_quality [AI2_model_quality = 2]
shared share_2_with_3: set_quality -> sharing
shared share_1_with_2: sharing -> start
Figure 5: Specification of the Impersonator agent.
5 EXPERIMENTS
The STV tool can be used to combine the modules
presented in Figures 1–6, and generate the global
model of interaction. We present the output in Fig-
ures 7 (for the system with one honest AI agent) and 8
(for two honest agents). The models provide invalu-
able insights into the structure of possible interac-
tions. Still, visual scrutiny is possible only in the
simplest cases due to the state-space explosion.
In more complex instances, we can use STV to
attempt an automated verification of strategic require-
ments. Since model checking of strategic ability is
hard for scenarios with partial observability (NP-hard
to undecidable, depending on the precise syntax and
semantics of the specification language (Bulling et al.,
2010)), exact verification is infeasible. Instead, we
use the technique of fixpoint approximation, proposed
in (Jamroga et al., 2019), and implemented in STV. In
what follows, we summarise the experimental results
obtained that way.
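To give an intuition of the computation (a minimal sketch under the assumption of perfect information, which merely over-approximates the imperfect-information abilities; it is neither the algorithm of (Jamroga et al., 2019) nor STV's implementation), checking $\langle\langle A\rangle\rangle G\, p$ on an explicit toy model amounts to a greatest-fixpoint iteration:

# Minimal sketch: greatest-fixpoint evaluation of <<A>> G p over an explicit toy
# model, assuming perfect information. Such a computation only over-approximates
# the imperfect-information ability; the actual approximations of (Jamroga et
# al., 2019) and their STV implementation are more involved.

def pre_A(Z, moves_A, moves_opp, step):
    """States in Z from which coalition A can force the next state into Z."""
    return {s for s in Z
            if any(all(step(s, a, o) in Z for o in moves_opp(s))
                   for a in moves_A(s))}

def coalition_globally(p_states, moves_A, moves_opp, step):
    """Greatest fixpoint Z = p_states ∩ pre_A(Z), i.e. the states satisfying <<A>> G p."""
    Z = set(p_states)
    while True:
        Z_next = pre_A(Z, moves_A, moves_opp, step)
        if Z_next == Z:
            return Z
        Z = Z_next

# Toy model: states 0,1,2; the coalition picks 'a'/'b', the opponent picks 'x'/'y';
# state 2 is a sink, and p holds in states 0 and 1.
def step(s, a, o):
    if s == 2:
        return 2
    return {('a', 'x'): 0, ('a', 'y'): 1, ('b', 'x'): 2, ('b', 'y'): 2}[(a, o)]

p_states = {0, 1}
print(coalition_globally(p_states, lambda s: 'ab', lambda s: 'xy', step))  # -> {0, 1}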
Models and Formulas. The scalable class of mod-
els has been described in detail in Section 4. In the
model checking experiments, we have used two vari-
ants of the system specification, one with a possible
impersonation attack, and the other one with the pos-
sibility of a man-in-the-middle attack. In each case,
we verified the following formulas:
$\varphi_1 \equiv \langle\langle I\rangle\rangle G\,(shared_p \to (\bigwedge_{i\in[1,n]} mqual_i \le k))$
$\varphi_2 \equiv \langle\langle I\rangle\rangle G\,(shared_p \to (\bigvee_{i\in[1,n]} mqual_i \le k))$
Formula $\varphi_1$ checks whether the Intruder has a strat-
egy to ensure that all honest agents will not achieve
quality greater than $k$. Formula $\varphi_2$ checks whether the
same is possible for at least one agent.
Figure 6: Graphical representation of Impersonator in STV.
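For illustration, with $n = 3$ honest agents and threshold $k = 1$, formula $\varphi_1$ instantiates to
\[
\langle\langle I\rangle\rangle G\,(shared_p \to (mqual_1 \le 1 \wedge mqual_2 \le 1 \wedge mqual_3 \le 1)),
\]
i.e., whenever $shared_p$ holds, every honest model has quality at most 1; $\varphi_2$ replaces the conjunction with a disjunction.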
Configuration of the Experiments. The experi-
ments have been conducted with the latest version
of STV (Kurpiewski, 2022). The test platform was
a server equipped with ninety-six 2.40 GHz Intel
Xeon Platinum 8260 CPUs, 991 GB RAM, and 64-
bit Linux.
Results. We present the verification results in Fig-
ures 9 and 10. #Ag specifies the scalability factor,
namely the number of agents in the system. #st and
#tr report the number of global states and transitions
in the resulting model of the system, and Gen gives
the time of model generation. Verif $\varphi_1$ and Verif $\varphi_2$
present the verification time and its output for formu-
las $\varphi_1$ and $\varphi_2$, respectively. All times are given in sec-
onds. The timeout was set to 8 hours.
Discussion. We were able to verify models of SAI
for up to 5 agents. The verification outcome was con-
clusive in all cases, i.e., the model checker always re-
turned either True or False. This means that we suc-
cessfully model-checked systems for up to almost a
billion transitions, which is a serious achievement for
an NP-hard verification problem. In all cases, formula
$\varphi_2$ turned out to be true. That is, both impersonation
and man-in-the-middle attacks can disrupt the learn-
ing process and prevent some agents from obtaining
good quality PAIVs. At the same time, $\varphi_1$ was false in
all cases. Thus, the intruder cannot disrupt all PAIVs,
even with its best attack.
6 CONCLUSIONS
In this paper, we present our work in progress on for-
mal analysis of Social Explainable AI. We propose
that formal methods for multi-agent systems provide
a good framework for multifaceted analysis of SAI
environments. As a proof of concept, we demonstrate
simple multi-agent models of SAI, prepared with the
model checker STV. Then, we use STV to formalize
and verify two variants of resistance to impersonation
and man-in-the-middle attacks, with very promising
results. Notably, we have been able to successfully
Figure 7: Model of SAI with one honest agent.
Figure 8: Model of SAI with two honest agents.
#Ag   #st   #tr   Gen   Verif φ1   Verif φ2
2 886 2007 < 0.1 < 0.1/False < 0.1/True
3 79806 273548 28 151/False 202/True
4 6538103 29471247 1284 5061/False 5102/True
5 93581930 623680431 7845 25828/False 25916/True
6 timeout
Figure 9: Verification results for the Impersonator attack.
#Ag   #st   #tr   Gen   Verif φ1   Verif φ2
3 23966 67666 12 21/False 33/True
4 4798302 20257664 875 3810/False 3882/True
5 71529973 503249452 5688 19074/False 20103/True
6 timeout
Figure 10: Verification results for Man in the Middle.
model-check models of systems with up to almost a
billion transitions, which is a considerable achievement
for an NP-hard verification problem.
ACKNOWLEDGEMENTS
The work has been supported by NCBR Poland
and FNR Luxembourg under the PolLux/FNR-CORE
project STV (POLLUX-VII/1/2019), by NCN Poland
under the CHIST-ERA grant CHIST-ERA-19-XAI-
010 (2020/02/Y/ST6/00064), and by the CNRS IEA
project MoSART.
REFERENCES
Akintunde, M. E., Botoeva, E., Kouvaros, P., and Lomus-
cio, A. (2022). Formal verification of neural agents in
non-deterministic environments. Auton. Agents Multi
Agent Syst., 36(1):6.
Alur, R., Henzinger, T. A., and Kupferman, O. (2002).
Alternating-time Temporal Logic. Journal of the
ACM, 49:672–713.
Batten, B., Kouvaros, P., Lomuscio, A., and Zheng, Y.
(2021). Efficient neural network verification via layer-
based semidefinite relaxations and linear cuts. In Pro-
ceedings of IJCAI, pages 2184–2190. ijcai.org.
Bulling, N., Dix, J., and Jamroga, W. (2010). Model check-
ing logics of strategic ability: Complexity. In Dastani,
M., Hindriks, K., and Meyer, J.-J., editors, Specifi-
cation and Verification of Multi-Agent Systems, pages
125–159. Springer.
Bulling, N., Goranko, V., and Jamroga, W. (2015). Logics
for reasoning about strategic abilities in multi-player
games. In van Benthem, J., Ghosh, S., and Verbrugge,
R., editors, Models of Strategic Reasoning. Logics,
Games, and Communities, volume 8972 of Lecture
Notes in Computer Science, pages 93–136. Springer.
Clarke, E., Henzinger, T., Veith, H., and Bloem, R., editors
(2018). Handbook of Model Checking. Springer.
Conti, M. and Passarella, A. (2018). The internet of peo-
ple: A human and data-centric paradigm for the next
generation internet. Comput. Commun., 131:51–65.
Contucci, P., Kertesz, J., and Osabutey, G. (2022). Human-
AI ecosystem with abrupt changes as a function of the
composition. PLOS ONE, 17(5):1–12.
Dolev, D. and Yao, A. C. (1983). On the security of public
key protocols. IEEE Trans. Inf. Theory, 29(2):198–
207.
Drainakis, G., Katsaros, K. V., Pantazopoulos, P., Sourlas,
V., and Amditis, A. (2020). Federated vs. centralized
machine learning under privacy-elastic users: A com-
parative analysis. In Proceedings of NCA, pages 1–8.
IEEE.
Emerson, E. (1990). Temporal and modal logic. In van
Leeuwen, J., editor, Handbook of Theoretical Com-
puter Science, volume B, pages 995–1072. Elsevier.
Fuchs, A., Passarella, A., and Conti, M. (2022). Model-
ing human behavior part I - learning and belief ap-
proaches. CoRR, abs/2205.06485.
Gollmann, D. (2011). Computer Security (3. ed.). Wiley.
Goodfellow, I. J., McDaniel, P. D., and Papernot, N. (2018).
Making machine learning robust against adversarial
inputs. Commun. ACM, 61(7):56–66.
Hegedüs, I., Danner, G., and Jelasity, M. (2019). Gos-
sip learning as a decentralized alternative to federated
learning. In Proceedings of IFIP DAIS, volume 11534
of Lecture Notes in Computer Science, pages 74–90.
Springer.
Hegedüs, I., Danner, G., and Jelasity, M. (2021). Decen-
tralized learning works: An empirical comparison of
gossip learning and federated learning. J. Parallel Dis-
tributed Comput., 148:109–124.
Jamroga, W., Knapik, M., Kurpiewski, D., and Mikulski, Ł.
(2019). Approximate verification of strategic abilities
under imperfect information. Artificial Intelligence,
277.
Jamroga, W., Penczek, W., and Sidoruk, T. (2021). Strate-
gic abilities of asynchronous agents: Semantic side
effects and how to tame them. In Proceedings of KR
2021, pages 368–378.
Jamroga, W., Penczek, W., Sidoruk, T., Dembiński, P., and
Mazurkiewicz, A. (2020). Towards partial order re-
ductions for strategic ability. Journal of Artificial In-
telligence Research, 68:817–850.
Kianpour, M. and Wen, S. (2019). Timing attacks on ma-
chine learning: State of the art. In IntelliSys Volume 1,
volume 1037 of Advances in Intelligent Systems and
Computing, pages 111–125. Springer.
Kouvaros, P. and Lomuscio, A. (2021). Towards scal-
able complete verification of relu neural networks via
dependency-based branching. In Proceedings of IJ-
CAI, pages 2643–2650. ijcai.org.
Kumar, R. S. S., Nyström, M., Lambert, J., Marshall, A.,
Goertzel, M., Comissoneru, A., Swann, M., and Xia,
S. (2020). Adversarial machine learning-industry per-
spectives. In IEEE Security and Privacy Workshops,
pages 69–75. IEEE.
Kurpiewski, D. (2022). STV – StraTegic Verifier. Code
repository. https://github.com/blackbat13/stv.
Kurpiewski, D., Jamroga, W., Masko, L., Mikulski, L.,
Pazderski, W., Penczek, W., and Sidoruk, T. (2022).
Verification of Multi-agent Properties in Electronic
Voting: A Case Study. In Proceedings of AiML 2022.
Kurpiewski, D., Pazderski, W., Jamroga, W., and Kim, Y.
(2021). STV+Reductions: Towards practical verifi-
cation of strategic ability using model reductions. In
Proceedings of AAMAS, pages 1770–1772. ACM.
Lomuscio, A., Qu, H., and Raimondi, F. (2017). MCMAS:
An open-source model checker for the verification of
multi-agent systems. International Journal on Soft-
ware Tools for Technology Transfer, 19(1):9–30.
Lorenzo, V., Boldrini, C., and Passarella, A. (2022). SAI
simulator for social AI gossiping. https://zenodo.org/
record/5780042.
Ottun, A.-R., Mane, P. C., Yin, Z., Paul, S., Liyanage,
M., Pridmore, J., Ding, A. Y., Sharma, R., Nurmi,
P., and Flores, H. (2022). Social-aware federated
learning: Challenges and opportunities in collabora-
tive data training. IEEE Internet Computing, pages
1–7.
Pauly, M. (2002). A modal logic for coalitional power
in games. Journal of Logic and Computation,
12(1):149–166.
Schobbens, P. (2004). Alternating-time logic with imper-
fect recall. Electronic Notes in Theoretical Computer
Science, 85(2):82–93.
Shoham, Y. and Leyton-Brown, K. (2009). Multiagent
Systems - Algorithmic, Game-Theoretic, and Logical
Foundations. Cambridge University Press.
Social AI gossiping. Micro-project in Humane-AI-Net
(2022). Project website. https://www.ai4europe.eu/
research/research-bundles/social-ai-gossiping.
Social Explainable AI, CHIST-ERA (2021–24). Project
website. http://www.sai-project.eu/.
Toprak, M., Boldrini, C., Passarella, A., and Conti, M.
(2021). Harnessing the power of ego network layers
for link prediction in online social networks. CoRR,
abs/2109.09190.
Weiss, G., editor (1999). Multiagent Systems. A Modern
Approach to Distributed Artificial Intelligence. MIT
Press: Cambridge, Mass.
Wu, M., Wicker, M., Ruan, W., Huang, X., and
Kwiatkowska, M. (2020). A game-based approxi-
mate verification of deep neural networks with prov-
able guarantees. Theor. Comput. Sci., 807:298–329.