Lina Ye and Philippe Dague
LRI, Univ. Paris-Sud, CNRS and Gemo, INRIA Saclay, Parc Club Orsay Université
4 rue Jacques Monod, bât G, Orsay, F-91893, France
BPEL Web services, Petri nets, consistency-based diagnosis, causal model, decentralized diagnosis.
This paper proposes a decentralized diagnosis approach for BPEL Web services, in which a local diagnoser is
provided for each BPEL service and cooperates with a coordinator. It should be noted that there is no global
model in this approach, only a local model for each service. The local consistency-based diagnosis strategy
consists in abstracting a diagnostic knowledge base from the data dependencies contained in the enriched BPEL
Petri net model of each service, and then reasoning from the observations with a set of local possible source
faults. The coordinator gathers the local diagnosis information from all local diagnosers and infers the
global diagnoses.
Self-healing software is one of the important challenges for Information Society Technologies research. Our paper proposes a decentralized diagnosis approach for BPEL Web services, in the context of the EU project WS-Diamond, whose goal is to design a framework for self-healing Web services by adopting artificial intelligence methodologies to solve the diagnosis problem by supporting online detection and identification of faults.
Considering a complex Web service, and given its data-oriented nature, we focus on data semantic faults. These are the most difficult faults to identify but the most critical in their consequences, since Web services are components that interact through messages, and a message is mostly a data structure. In fact, many subtle faults on data cannot be identified at the time they occur and are thus difficult to diagnose. Generally, data are propagated through the interactions and are then used to elaborate other data or control decisions, such as branching conditions. In other words, during transmission through partner instances, data may propagate a dysfunction until an exception is thrown when a logical data contradiction is detected. The contradiction can be either a mismatch between some data values or some wrong data. This kind of dependency makes it difficult to predict the relationship between the source fault and the symptom. For example, consider a data fault caused by a discrepancy in the semantics of the date format: in a travel agency service, a customer defines his itinerary in French format, for instance from (6/2/07) to (10/2/07) for a hotel reservation, while the hotel service interprets it in English format. The user will then raise an exception, saying that the hotel cost is exaggerated (4 months instead of 4 days). To detect and explain this kind of fault, model-based diagnosis (Hamscher et al., 1992) is adopted here, owing to its ability to detect unanticipated or hidden faults more effectively in the context of Web services. However, it should be noted that Web services are a novel application area for model-based diagnosis and no such characterization is available in the literature.
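The date example above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name stay_length is hypothetical, and the "English format" is assumed here to be read as month/day/year, which is the reading that yields the four-month stay mentioned in the text.

```python
from datetime import datetime

def stay_length(check_in: str, check_out: str, fmt: str) -> int:
    """Number of days between two dates parsed with the given format."""
    start = datetime.strptime(check_in, fmt)
    end = datetime.strptime(check_out, fmt)
    return (end - start).days

# Assumption: the customer writes day/month/year,
# while the hotel service reads month/day/year.
FRENCH = "%d/%m/%y"
ENGLISH = "%m/%d/%y"

# Same two strings, two different meanings.
intended = stay_length("6/2/07", "10/2/07", FRENCH)      # 6 Feb to 10 Feb 2007
interpreted = stay_length("6/2/07", "10/2/07", ENGLISH)  # 2 Jun to 2 Oct 2007

print(intended, interpreted)
```

Both services parse the message without error, so nothing fails at the point where the fault is introduced; the discrepancy only becomes visible later, when the cost computed from the stay length is returned to the user.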
The rest of this paper is organized as follows. Section 2 briefly presents the modeling of BPEL (Business Process Execution Language) services. In section 3, a decentralized diagnosis approach is proposed, including a local diagnosis strategy based on the consistency-based perspective and a protocol for global diagnosis. Finally, the conclusion is given in section 4.
Ye L. and Dague P. (2008).
In Proceedings of the Fourth International Conference on Web Information Systems and Technologies, pages 283-287
DOI: 10.5220/0001525402830287
BPEL is a standard Web service composition language, defined to support the development of complex applications based on the orchestration of simpler ones. It offers a set of structured activities to order the execution of the basic ones and thus control the process flow. (Li et al., 2007) has already precisely described a method for automatically generating enriched BPEL Petri net models with data dependencies from BPEL code. The essential idea is to use places to represent the most elementary parts that compose a message (using its XPath decomposition) as well as the activation status, and to use transitions to represent activities. More importantly, each transition of the BPEL Petri net is enriched with a set of dependency relations between its input and output parameters. Three high-level data dependency relations for Web service diagnosis are considered, as defined in (Ardissono et al., 2005): FW(a,x,y) states that the output variable y of the activity a is a copy of the input variable x, regardless of the correctness of the behavior of a; SRC(a,y) means that the output variable y is created by the activity a, independently of its input variables; $EL(a, \{x_i\}, y)$ expresses the case where the output variable y is computed from the input variables $x_i$ by the activity a. By modeling BPEL services in this way, it is possible to obtain all data dependencies in the services, so that our diagnosis result covers all possible source faults.
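As an illustration of how such dependencies expose every upstream source of a value, the following Python sketch represents the three relations as small records and traces the variables a given output transitively depends on. The class layout, the helper names and the toy travel-agency dependencies are assumptions made for this example; they are not taken from the model generator of (Li et al., 2007).

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class FW:            # output `target` is a copy of input `source`
    activity: str
    source: str
    target: str

@dataclass(frozen=True)
class SRC:           # output `target` is created by the activity itself
    activity: str
    target: str

@dataclass(frozen=True)
class EL:            # output `target` is computed from the inputs `sources`
    activity: str
    sources: Tuple[str, ...]
    target: str

def inputs_of(dep) -> Tuple[str, ...]:
    """Input variables that the output of a dependency relies on."""
    if isinstance(dep, FW):
        return (dep.source,)
    if isinstance(dep, EL):
        return dep.sources
    return ()        # SRC has no data input

def upstream(deps, variable):
    """All variables that `variable` transitively depends on."""
    seen, stack = set(), [variable]
    while stack:
        current = stack.pop()
        for dep in deps:
            if dep.target == current:
                for x in inputs_of(dep):
                    if x not in seen:
                        seen.add(x)
                        stack.append(x)
    return seen

# Hypothetical dependencies for a travel-agency process.
deps = [
    SRC("receive", "dates"),
    FW("assign", "dates", "req.dates"),
    EL("invokeHotel", ("req.dates", "req.rate"), "cost"),
]
suspects = upstream(deps, "cost")
print(sorted(suspects))
```

Every variable returned by upstream is a place where a wrong value could have entered before reaching the variable on which the exception is raised.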
Web services, like all other applications, are subject to dysfunctions, such as a faulty composed service, an incoming message mismatching the interface, etc. The symptom is that exceptions are thrown at the places where the process cannot be executed. The current mechanism for handling these faults is throw-and-catch, which is very preliminary since it relies on the empirical knowledge of the developer, to whom various causes of the exception may be unknown. With this mechanism, we therefore cannot obtain a sound and complete diagnosis, especially for semantic faults. For example, when an exception is thrown, not only the service that throws the exception should be suspected, but also the one that generates the data and all the services that modify the data. A current Web service exception, however, can only report where the exception happens. As said before, since semantic faults are the most difficult and the most critical ones to diagnose, our approach focuses on determining the exact source faults that are possibly responsible for this kind of exception. To this end, three types of source faults that may cause semantic faults in Web services are taken into account: a basic Web service (black box) with abnormal behavior; wrong input data from the user; and an interface fault, which means that the messages shared between two cooperating Web services become different, so that a dysfunction can result.
3.1 Architecture
We propose a decentralized diagnosis approach in this paper, considering its efficiency in the Web services context. Figure 1 describes our architecture. For each BPEL service $W_i$, there is a local diagnoser $D_i$ performing local diagnosis inference by adopting a consistency-based approach, based on the model of $W_i$ as well as on local observations, which are logged by the monitoring platform. In addition, there is a coordinator, considered as a separate Web service, which receives local diagnosis results from local diagnosers and decides on the global diagnoses by resolving local diagnosis conflicts. It is worth noting that there is no global model for the whole BPEL process, since the coordinator knows nothing about the local models except their connections. In this sense our approach is called a decentralized one. In this architecture, for the organization owning the orchestrated Web service, the privacy issues can be satisfied because the model of $W_i$ can only be inspected by the associated local diagnoser $D_i$, which does not directly interact with other local diagnosers.
Figure 1: Decentralized diagnosis architecture.
3.2 Local Diagnosis for BPEL Services
Reiter’s logical theory of diagnosis (Reiter, 1987) is also referred to as consistency-based diagnosis. In this approach, a behavioral system model is defined as a tuple (SD, COMP): SD, the system description, is a finite set of first-order sentences describing explicitly the behavior of each given component; COMP is a finite set of constants, denoting the components that could be faulty. In SD, a distinguished predicate ab means abnormal. For example, ab(c) denotes an abnormal component c, while ¬ab(c) expresses that the component c is working correctly. An observed behavioral system model is expressed as a tuple (SD, COMP, OBS), where (SD, COMP) is a system model and OBS is a set of first-order formulas expressing the observations. Generally, a diagnosis is a hypothesis that certain components of the system are behaving abnormally, which should be consistent with what is known about the system and with the observations. A diagnosis is called a minimal diagnosis if and only if no proper subset of it is also a diagnosis. Minimal diagnoses are usually the preferred ones, following the principle of parsimony.
WEBIST 2008 - International Conference on Web Information Systems and Technologies
Definition 1. A minimal diagnosis for an observed system (SD, COMP, OBS) is a minimal subset $\Delta$ of COMP such that:
$$SD \cup OBS \cup \{ab(c) \mid c \in \Delta\} \cup \{\neg ab(c) \mid c \in COMP \setminus \Delta\}$$
is consistent.
The above behavioral system model description SD can be simplified by using a causal system model description (Bauer, 2005). Unlike SD, the causal description states the causal relation that if all input data and the component/activity itself are correct, then all its outputs are correct. However, this kind of causal relation is not precise enough. For Web service diagnosis, owing to the availability of the three dependency relations described in section 2, we can get a similar but more precise causal system description. Specifically, its causal relations are obtained from every data dependency (FW, SRC, EL), while the causal relations in the description of (Bauer, 2005) are acquired only from every component/activity.
Since we have a set of data dependencies in the enriched BPEL Petri net model, it is crucial to transform them into causal logic formulas for the consistency-based local diagnosis. Tables 1, 2 and 3 describe the causal relations between input and output parameters for all kinds of activities by translating each data dependency into a causal logic formula. In table 1, ¬ab(a) does not appear in the logic formulas. The reason is that in our case the BPEL code is supposed to be correct, which means that the behavior of a basic BPEL activity is assumed to be normal, since we focus only on semantic faults and not on program debugging. In addition, when an exception is thrown, several causes could explain it, depending on which branch was taken, so we adopt online diagnosis to infer the exact possible sources of the fault. For this, obs(a), which expresses that the basic BPEL activity a is observed, is added to the logic formulas. In this way, the information logged by our monitoring system, such as the observed activities, can be used in the diagnosis process. For example, for the dependency SRC(a,y), the corresponding logic formula means that if the input activation status of activity a is correct and a is observed, then the output y is correct. Table 2 is obtained similarly. The difference is that for an Invoke activity, we have to take the invoked service into account for the SRC and EL dependencies, since abnormal behavior of the invoked service affects the output parameters. For the sake of simplicity, the correct behavior of the service invoked by Invoke activity a is denoted as ¬ab(a). Table 3 covers an additional type of activity, the control activity, which is not a basic BPEL activity and thus cannot be observed by our monitoring system. So obs(a) does not appear in table 3. Note that a control activity has no output data place but only one output activation place, meaning that its role is just to control the process flow.
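Table 1: The transformation for basic BPEL activity except Invoke.
Dependencies | Causal logic formulas
$FW(a, x, y)$ | $\neg ab(a.in) \wedge \neg ab(x) \wedge obs(a) \Rightarrow \neg ab(y)$
$SRC(a, y)$ | $\neg ab(a.in) \wedge obs(a) \Rightarrow \neg ab(y)$
$EL(a, \{x_i\}, y)$ | $\neg ab(a.in) \wedge \bigwedge_i \neg ab(x_i) \wedge obs(a) \Rightarrow \neg ab(y)$
Table 2: The transformation for Invoke activity.
Dependencies | Causal logic formulas
$FW(a, x, y)$ | $\neg ab(a.in) \wedge \neg ab(x) \wedge obs(a) \Rightarrow \neg ab(y)$
$SRC(a, y)$ | $\neg ab(a.in) \wedge \neg ab(a) \wedge obs(a) \Rightarrow \neg ab(y)$
$EL(a, \{x_i\}, y)$ | $\neg ab(a.in) \wedge \bigwedge_i \neg ab(x_i) \wedge \neg ab(a) \wedge obs(a) \Rightarrow \neg ab(y)$
Table 3: The transformation for control activity.
Dependencies | Causal logic formulas
$FW(a, a.in, a.out)$ | $\neg ab(a.in) \Rightarrow \neg ab(a.out)$
$EL(a, \{a.in, x\}, a.out)$ | $\neg ab(a.in) \wedge \neg ab(x) \Rightarrow \neg ab(a.out)$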
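The translation described by tables 1 to 3 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the atom ok(t) stands for ¬ab(t), the name a.in for the input activation place of activity a is an assumed notation, and the function to_clause with its parameters is hypothetical.

```python
from typing import FrozenSet, List, Tuple

# A Horn clause is represented as (body atoms, head atom).
Clause = Tuple[FrozenSet[str], str]

def ok(term: str) -> str:
    """Atom standing for ¬ab(term)."""
    return f"ok({term})"

def to_clause(kind: str, activity: str, inputs: List[str], output: str,
              activity_type: str = "basic") -> Clause:
    """Translate one data dependency into a causal Horn clause.

    kind: "FW", "SRC" or "EL".
    activity_type: "basic", "invoke" or "control".
    inputs: data inputs only; the activation place is added here.
    """
    if kind not in ("FW", "SRC", "EL"):
        raise ValueError(f"unknown dependency kind: {kind}")
    # The input activation status is a premise in every formula.
    body = {ok(f"{activity}.in")}
    # FW and EL depend on their data inputs; SRC does not.
    if kind != "SRC":
        body.update(ok(x) for x in inputs)
    # For Invoke, SRC and EL also require the invoked service to be correct.
    if activity_type == "invoke" and kind != "FW":
        body.add(ok(activity))
    # Control activities are not observed by the monitoring system.
    if activity_type != "control":
        body.add(f"obs({activity})")
    return frozenset(body), ok(output)

src_clause = to_clause("SRC", "a", [], "y")
invoke_clause = to_clause("EL", "inv", ["x1", "x2"], "y", "invoke")
control_clause = to_clause("FW", "c", [], "c.out", "control")
print(sorted(invoke_clause[0]), "=>", invoke_clause[1])
```

Keeping clause bodies as sets of atoms makes the resulting knowledge base directly usable by a forward-chaining procedure, since every clause has a single positive head.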
For each enriched BPEL Petri net model of a BPEL service, a set of logic formulas, actually a set of Horn clauses, can thus be obtained from the data dependencies available in the model. This set of logic formulas is from now on called the Diagnostic Knowledge Base (DKB). When applying to Web services the consistency-based diagnosis technique developed for complex physical systems, this DKB facilitates the calculation of the minimal diagnoses together with OBS and with the complementary knowledge of a set of possible sources whose faults can explain the exception, denoted as PSF. Given the assumption of correct BPEL code, we have the following definition:
Definition 2. PSF is the set of possible source faults for a local BPEL service, of three types:
Incorrect input data x from the user, denoted as ab(x)
Faulty basic Web service, invoked by Invoke activity a, denoted as ab(a)
Wrong input variable $y$ in Web service $W_i$ coming from the shared variable $y'$ in another composite Web service $W_j$, denoted as ab(y).
The last one, a faulty input variable y from another service, can be temporarily considered as a local possible source fault but will be explored further later. Details are presented in 3.3. Then with DKB and PSF, similarly to definition 1, we have:
Definition 3. A minimal diagnosis for an observed BPEL service (DKB, PSF, OBS) is a minimal subset $\Delta$ of PSF such that:
$$DKB \cup OBS \cup \{ab(c) \mid c \in \Delta\} \cup \{\neg ab(c) \mid c \in PSF \setminus \Delta\}$$
is consistent.
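Because the DKB consists of Horn clauses, the consistency test of definition 3 can be carried out by forward chaining. The Python sketch below enumerates candidate subsets of PSF by increasing size and keeps the minimal consistent ones. It is an illustrative sketch under stated assumptions: ok(t) stands for ¬ab(t), the activation places are assumed to be given as correct facts (in line with the correct-BPEL-code assumption), and all function names and the toy knowledge base are hypothetical.

```python
from itertools import combinations

def closure(clauses, facts):
    """Forward chaining: all atoms derivable from `facts` with Horn clauses."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived

def is_consistent(dkb, facts, psf, delta, abnormal):
    """Check DKB + OBS + {ab(c) | c in delta} + {ok(c) | c in PSF minus delta}."""
    assumed_ok = {f"ok({c})" for c in psf - delta}
    derived = closure(dkb, set(facts) | assumed_ok)
    # Inconsistent if something asserted abnormal is derived to be correct.
    forbidden = {f"ok({c})" for c in delta | abnormal}
    return not (derived & forbidden)

def minimal_diagnoses(dkb, facts, psf, abnormal):
    """Enumerate subsets of PSF by size; keep consistent ones with no consistent subset."""
    found = []
    for size in range(len(psf) + 1):
        for combo in combinations(sorted(psf), size):
            delta = frozenset(combo)
            if any(d <= delta for d in found):
                continue  # a smaller diagnosis already explains the exception
            if is_consistent(dkb, facts, psf, delta, abnormal):
                found.append(delta)
    return found

# Toy knowledge base: an Assign forwards the user's dates, an Invoke computes the cost.
dkb = [
    (frozenset({"ok(assign.in)", "ok(dates)", "obs(assign)"}), "ok(req)"),
    (frozenset({"ok(hotel.in)", "ok(req)", "ok(hotel)", "obs(hotel)"}), "ok(cost)"),
]
facts = {"ok(assign.in)", "ok(hotel.in)", "obs(assign)", "obs(hotel)"}
psf = {"dates", "hotel"}
result = minimal_diagnoses(dkb, facts, psf, abnormal={"cost"})
print(result)
```

With the exception raised on cost, the empty candidate is rejected because ok(cost) is derivable, and the two single-fault candidates remain: the user's dates or the invoked hotel service. The exhaustive enumeration is exponential in the size of PSF and is used here only for clarity.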
3.3 Protocol for Global Diagnosis
In our decentralized diagnosis architecture, each local diagnoser can interact both with its associated Web service and with the coordinator, while the coordinator can interact only with local diagnosers. Moreover, for privacy reasons, our coordinator C does not initially have any information on the individual Web services except their connections, which are obtained offline and are at the interface level.
This decentralized diagnosis process is started by a local diagnoser $D_i$ that is awakened by an exception in Web service $W_i$. The steps are as follows:
1. $D_i$ infers the local diagnoses for the observed system $\{DKB_i, PSF_i, OBS_i\}$ and sends its result to C.
2. C extends the diagnoses received from $D_i$ by providing each element a of every diagnosis $\Delta$ (for the sake of generality, a minimal diagnosis may contain a single element or several, and is thus treated as a list) with the local diagnoser $D_a$ that can further explain ab(a). If there is no such local diagnoser, then $D_a$ is assigned null. When the diagnosis is $\Delta = \{y\}$, where y is input data coming from $y'$ in another composite Web service $W_j$, the result of the extension should be $\Delta = \{Inf(y', y), null\}$ or $\Delta = \{y', D_j\}$, because an abnormal y could be caused either by the interface fault or by a local possible source fault in $W_j$. Here $ab(Inf(y', y))$ denotes the fault of the interface between y in $W_i$ and $y'$ in $W_j$ when the value is transmitted from $y'$ to y. In particular, for $\Delta = \{Inf(y', y), null\}$, the observed information can be used to verify or disprove it. For example, if the values of the shared variables are observed to be different, then this interface fault is verified. Otherwise, it should be excluded. When the diagnosis is $\Delta = \{a\}$, a being input data from the user or an Invoke activity invoking a basic Web service, it is extended as $\Delta = \{a, null\}$, since no local diagnoser can be further required to explain ab(a).
3. If there exist other local diagnosers to be queried, C triggers one, for example $D_j$, by sending $ab(y')$ as an exception in $W_j$. $D_j$ then performs local diagnosis and sends its result to C.
4. C extends the diagnoses from $D_j$ and then updates the diagnoses by replacing $\{y', D_j\}$ with the extended result of $D_j$. After this, C looks for the next local diagnoser. If there is one, the process returns to step 3. Otherwise, the process terminates. Finally, we get all the global diagnoses from the final diagnoses in C.
Alg 1 formally illustrates our diagnosis process.
Table 4: Meaning of major symbols in the algorithm.
Symbol | Meaning
H or $H_a$ | a list of candidate minimal diagnoses, subsets of ([...])
[...] | a set of interfaces between shared variables in different services
[...] | a set of input variables in PSF from another composite Web service
[...] | returns the first element of a list
[...] | removes an element from the list
[...] | adds an element to a list
$D_i(Var)$ | returns a set of local minimal diagnoses for $W_i$ inferred by $D_i$ with the exception on Var
H.extend() | extends a list, as described in step 2
H.update(a, $H_a$) | returns the result of replacing a with $H_a$ in H
Alg 1. Input: variable $y$ in $W_i$ where the exception arises. Output: F, the list of global minimal diagnoses $\Delta$, where $\Delta$ is a subset of (([...]) \ [...]). Initially, F is empty.
H = $D_i(y)$;
H.extend();
while H != null do
{T = first element of H;
if (for each element a in T, $D_a$ = null), then
{add T to F;
remove T from H;}
else
for each element a in T, do
if ($D_a$ != null), then
{$H_a$ = $D_a(a)$;
$H_a$.extend();
H.update(a, $H_a$);}
}
return F
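The coordinator loop can be sketched in Python as follows. This is a simplified illustration and not the authors' algorithm verbatim: local diagnosers are modeled as plain callables, the service connections known to the coordinator are a dictionary links mapping a shared input variable to its remote variable and service, the observation-based verification of interface faults is omitted, and the connections are assumed to be acyclic. All names (global_diagnoses, keep_minimal, links) are hypothetical.

```python
def keep_minimal(diagnoses):
    """Drop duplicates and any diagnosis that strictly contains another one."""
    unique = list(dict.fromkeys(diagnoses))
    return [d for d in unique if not any(other < d for other in unique)]

def global_diagnoses(start_service, start_var, diagnosers, links):
    """Coordinator sketch.

    diagnosers: service name -> callable(var) returning local minimal diagnoses
                (each an iterable of element names).
    links: shared input variable y -> (remote variable y', remote service).
    An element that does not appear in `links` has no further diagnoser (null).
    """
    pending = [frozenset(d) for d in diagnosers[start_service](start_var)]
    final = []
    while pending:
        candidate = pending.pop(0)
        open_elements = [e for e in candidate if e in links]
        if not open_elements:
            # No element can be explained further: a global diagnosis.
            final.append(candidate)
            continue
        y = open_elements[0]
        remote_var, remote_service = links[y]
        rest = candidate - {y}
        # Either the interface is faulty, or the remote service explains y'.
        alternatives = [frozenset({f"Inf({remote_var},{y})"})]
        alternatives += [frozenset(d)
                         for d in diagnosers[remote_service](remote_var)]
        for alternative in alternatives:
            pending.append(rest | alternative)
    return keep_minimal(final)

# Toy setting: W1 raises the exception; its input y comes from y2 in W2.
diagnosers = {
    "W1": lambda var: [{"y"}, {"hotel"}],
    "W2": lambda var: [{"user_dates"}],
}
links = {"y": ("y2", "W2")}
result = global_diagnoses("W1", "cost", diagnosers, links)
print(result)
```

In this toy run the candidate {y} is replaced by two alternatives, the interface fault Inf(y2,y) and the remote diagnosis {user_dates}, while {hotel} is already final. The coordinator only manipulates element names and connections, so no local model is disclosed to it.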
In this paper, a cooperative decentralized diagnosis approach for complex Web services is proposed. BPEL Web services are chosen as our application due to their popularity and prospects. Each individual activity is treated as a grey box, which means that we do not model its internal behavior but only the correlation between its input and output parameters. Thus we can infer how the correct/incorrect status of input parameters affects the correct/incorrect status of output parameters. Our approach reduces the computational complexity thanks to a local diagnosis algorithm relying on Horn clause inference and a global diagnosis based on a decentralized architecture. The details of our experiments on real BPEL examples (in the framework of the WS-Diamond project) are omitted due to lack of space. In addition, our approach can easily be extended to handle multiple exceptions, especially independent exceptions in different parallel branches. In this case, each exception is diagnosed independently and the synthesized diagnoses are the union of all the diagnoses. Since we focus on minimal diagnoses, we remove the synthesized diagnoses that are supersets of other ones. Furthermore, it is straightforward to extend our diagnosis architecture to multi-layered hierarchies. For example, a coordinator can be designed to act as a local diagnoser for another coordinator at a higher level.
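The synthesis step for multiple independent exceptions, as described above, amounts to a union followed by the removal of supersets. A minimal Python sketch, with a hypothetical function name synthesize and invented element names:

```python
def synthesize(*diagnosis_sets):
    """Union of the diagnoses of several exceptions, keeping only minimal ones."""
    merged = []
    for diagnoses in diagnosis_sets:
        for diagnosis in diagnoses:
            frozen = frozenset(diagnosis)
            if frozen not in merged:
                merged.append(frozen)
    # Remove every diagnosis that is a strict superset of another one.
    return [d for d in merged if not any(other < d for other in merged)]

# Diagnoses obtained independently for two exceptions.
first = [{"a"}, {"b", "c"}]
second = [{"a", "d"}, {"c"}]
combined = synthesize(first, second)
print(combined)
```

Here {b, c} is dropped because {c} alone is a diagnosis, and {a, d} is dropped because {a} alone is one.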
There are similar approaches in the literature. In (Bauer, 2005), the problem of contradicting first-order system descriptions with observations is reduced to propositional logic, which is similar to our DKB. However, that work experiments with k-satisfiability, using state-of-the-art SAT solvers to determine conflict sets and minimal diagnoses; this is avoided in our approach, since our DKB is made up of Horn clauses and thus permits direct deduction. (Ardissono et al., 2005) has proposed a similar decentralized model-based diagnosis approach for Web services. Their global diagnoser does not initially have any information on the individual Web services, so the communications between local diagnosers and the global diagnoser must carry information about both diagnosis and interactions (such as connection information). In contrast, our coordinator knows the connections between services, which lightens the communication flow since only diagnosis information has to be exchanged. This knowledge does not violate privacy, since it is at the interface level and the coordinator still does not know the internal details of the services. In addition, they only propose a characterization of local diagnoser operations without providing specific algorithms, whereas in our approach, since BPEL services are chosen as the application, the local diagnosis algorithm is precisely defined. (Yan and Dague, 2007) has introduced an approach similar in principle but different in implementation: automata are used instead of Petri nets for modeling; trajectories in the synchronized product of the automaton model and the observations are used instead of data dependencies; and the diagnostic algorithm is centralized instead of decentralized.
Ardissono, L., Console, L., Goy, A., Petrone, G., Picardi, C., Segnan, M., and Dupré, D. T. (2005). Enhancing web services with diagnostic capabilities. Proc. of European Conference on Web Services (ECOWS-05), pp. 182-191, Växjö, Sweden, IEEE.
Bauer, A. (2005). Simplifying diagnosis using lsat: a propositional approach to reasoning from first principles. In vol. 3524 of Lecture Notes in Computer Science. Proc. of the 2005 International Conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), Springer-Verlag.
Hamscher, W., Console, L., and de Kleer, J. (1992). Readings in Model-based Diagnosis. Morgan Kaufmann.
Li, Y., Melliti, T., and Dague, P. (2007). Modelling bpel web services for diagnosis: towards self-healing web services. Proc. of the 3rd International Conference on Web Information Systems and Technologies, pp. 297-304, Barcelona, Spain.
Reiter, R. (1987). A theory of diagnosis from first principles. Artificial Intelligence, 32(1), 57-96.
Yan, Y. and Dague, P. (2007). Monitoring and diagnosing orchestrated web service processes. Proc. of the 2007 IEEE International Conference on Web Services.