Secrecy-preserving Reasoning in ELH Knowledge Bases using
MapReduce Algorithm
Gopalakrishnan Krishnasamy Sivaprakasam¹ and Giora Slutzki²
¹ Department of Mathematics & Computer Science, Central State University, Wilberforce, Ohio, U.S.A.
² Department of Computer Science, Iowa State University, Ames, Iowa, U.S.A.
Keywords:
Secrecy Preserving Reasoning, Knowledge Bases, MapReduce Algorithm.
Abstract:
In this paper, we use the MapReduce algorithm to study the problem of secrecy-preserving reasoning in very large ELH knowledge bases. A tableau algorithm for ABox reasoning is designed in a way that is suitable for the MapReduce framework and contains a small set of reasoning rules. To implement the parallelization method, we design map and reduce functions for each of these ABox reasoning rules. The output of this computational procedure is a finite set $\mathcal{A}^*$ which contains the assertional consequences of the given knowledge base. Given a finite secrecy set $S$, we compute a set $E$, called an envelope of $S$, which provides logical protection to the secrecy set $S$ against the reasoning of a querying agent. To compute $E$, a tableau algorithm is designed by inverting the ABox reasoning rules in a way that is suitable for the MapReduce framework. Further, to implement the parallelization method, we design map and reduce functions for each of these inverted rules.
1 INTRODUCTION
In the last three decades, beginning with the internet era, there has been a dramatic increase in the number of mobile devices and internet users whose daily web activities include online banking, social networking, web-based travel services and other internet-based business applications. These activities contribute to the generation of an unprecedented amount of web data. In (Reinsel et al., 2017), the authors reported that the estimated amount of web data would be around 17 zettabytes and forecast that the volume of web data would grow up to 160 zettabytes by 2025. Web data contains a massive amount of private information about users, administrators, service providers, governmental agencies and military establishments which must be maintained and protected. To deal with this emerging challenge, it is imperative to design and develop robust Bigdata technology tools to secure the confidential information in ways that do not deter the main web objective of sharing the non-confidential information among users.
Logic-based languages like the Resource Description Framework (RDF) and the Web Ontology Language (OWL) have been developed exclusively for creating knowledge bases (ontologies) and designing efficient procedures for reasoning and answering queries on these web databases. RDF and OWL are the languages for Semantic Web technologies recommended by the World Wide Web Consortium (W3C). Recently, several software tools like Bigdata®, LargeTripleStores and Blazegraph™ have been developed specifically for Bigdata applications, and these tools support RDF and OWL formats. In addition, a number of research works have been reported on Big data applications in the Semantic Web using RDF and OWL approaches, see (Konstantopoulos et al., 2016; Zhou et al., 2013; Hitzler and Janowicz, 2013).
Description Logic (DL) is a decidable fragment of First Order Logic (FOL). In Semantic Web research, DL languages play a central role in building Knowledge Bases (KBs) for social, biological and medical applications, and in reasoning and answering queries over KBs. Further, DL languages are considered to be the underlying logics of OWL (Krötzsch, 2012), which is recommended by the W3C as a knowledge representation and reasoning language for the web. In recent years, there has been widespread interest in Bigdata research. In particular, the DL research community has also shown an increasing interest in studying the problem of reasoning in DL KBs with Bigdata applications, see (Möller et al., 2013; Mutharaju et al., 2010; Bellomarini et al., 2018; Urbani et al., 2010).
Sivaprakasam, G. and Slutzki, G.
Secrecy-preserving Reasoning in ELH Knowledge Bases using MapReduce Algorithm.
DOI: 10.5220/0008986007080716
In Proceedings of the 12th International Conference on Agents and Artificial Intelligence (ICAART 2020) - Volume 2, pages 708-716
ISBN: 978-989-758-395-7; ISSN: 2184-433X
Copyright © 2022 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved
Tao et al. (Tao et al., 2015) developed a conceptual framework to study Secrecy-Preserving reasoning and Query Answering (SPQA) in Description Logic (DL) Knowledge Bases (KBs) under the Open World Assumption (OWA). The approach uses the notion of an envelope to hide secret information against logical inference; the notion was first defined and used in (Tao et al., 2010). The idea behind the envelope concept is that no assertion in the envelope can be logically deduced from information outside the envelope. This approach is based on the assumption that the information contained in a KB is incomplete (by OWA). Specifically, in (Tao et al., 2010; Tao et al., 2015; Sivaprakasam and Slutzki, 2016; Sivaprakasam and Slutzki, 2017) the main idea was to utilize the secret information within the reasoning process, but then to answer “Unknown” whenever the answer is truly unknown or the true answer could compromise confidentiality or secrecy.
In this paper, we study the SPQA problem on very large scale ELH KBs, see (Krötzsch, 2012), using the MapReduce approach. MapReduce is a programming paradigm for processing very large data sets (Bigdata) in a distributed way over a cluster of machines (Dean and Ghemawat, 2008; Karloff et al., 2010). The central idea behind this procedure is to define map and reduce functions, and to compute the reduce function in a parallel manner. It is a very popular computational model used in Bigdata applications; for more details see section 3.1.
A first step in designing an SPQA system is to compute a restricted assertional inference closure $\mathcal{A}^*$ using the MapReduce procedure. For this purpose, we make use of the non-modalized version of a tableau algorithm whose proof of correctness is reported in (Sivaprakasam and Slutzki, 2017). For each ABox expansion rule in this tableau algorithm, depending on the keys, we define map and reduce functions. Each map-reduce computational cycle results in a parallel application of one of the expansion rules.
After computing $\mathcal{A}^*$, the next step is to compute an envelope $E$ to protect the secret information given in the provided secrecy set $S$. This envelope is computed by another tableau algorithm based on the idea of inverting the ABox expansion rules given in the first tableau algorithm. For each of these inverted expansion rules, customized map and reduce functions are defined. Once such an envelope is computed, the answers to the queries are censored whenever the queries depend upon the set $E$. Since the set $\mathcal{A}^* \setminus E$ does not contain all the statements entailed by $\Sigma$, we use a recursive query answering procedure similar to the procedure reported in (Sivaprakasam and Slutzki, 2016; Sivaprakasam and Slutzki, 2017) to answer more complicated queries.
2 SYNTAX AND SEMANTICS
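A vocabulary of ELH is a triple $\langle N_O, N_C, N_R \rangle$ of countably infinite, pairwise disjoint sets. The elements of $N_O$ are called object (or individual) names, the elements of $N_C$ are called concept names and the elements of $N_R$ are called role names. The set of ELH concepts is denoted by $\mathcal{C}$ and is defined by the following rules

$$C ::= A \mid \top \mid C \sqcap D \mid \exists r.C$$

where $A \in N_C$, $r \in N_R$, $\top$ denotes the “top concept”, and $C, D \in \mathcal{C}$. Assertions are expressions of the form $C(a)$ or $r(a,b)$, general concept inclusions (GCIs) are expressions of the form $C \sqsubseteq D$ and role inclusions are expressions of the form $r \sqsubseteq s$, where $C, D \in \mathcal{C}$, $r, s \in N_R$ and $a, b \in N_O$. The semantics of ELH concepts is specified, as usual, by an interpretation $\mathcal{I} = \langle \Delta, \cdot^{\mathcal{I}} \rangle$ where $\Delta$ is the domain of the interpretation, and $\cdot^{\mathcal{I}}$ is an interpretation function mapping each $a \in N_O$ to an element $a^{\mathcal{I}} \in \Delta$, each $A \in N_C$ to a subset $A^{\mathcal{I}} \subseteq \Delta$, and each $r \in N_R$ to a binary relation $r^{\mathcal{I}} \subseteq \Delta \times \Delta$. The interpretation function $\cdot^{\mathcal{I}}$ is extended inductively to all ELH concepts in the usual manner:

$$\top^{\mathcal{I}} = \Delta; \qquad (C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}; \qquad (\exists r.C)^{\mathcal{I}} = \{ d \in \Delta \mid \exists e \in C^{\mathcal{I}} : (d,e) \in r^{\mathcal{I}} \}.$$

An ABox $\mathcal{A}$ is a finite, non-empty set of assertions. A TBox $\mathcal{T}$ is a finite set of GCIs and an RBox $\mathcal{R}$ is a finite set of role inclusions. An ELH KB is a triple $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$ where $\mathcal{A}$ is an ABox, $\mathcal{T}$ is a TBox and $\mathcal{R}$ is an RBox. Let $\mathcal{I} = \langle \Delta, \cdot^{\mathcal{I}} \rangle$ be an interpretation, $C, D \in \mathcal{C}$, $r, s \in N_R$ and $a, b \in N_O$. We say that $\mathcal{I}$ satisfies $C(a)$, $r(a,b)$, $C \sqsubseteq D$ or $r \sqsubseteq s$, notation $\mathcal{I} \models C(a)$, $\mathcal{I} \models r(a,b)$, $\mathcal{I} \models C \sqsubseteq D$ or $\mathcal{I} \models r \sqsubseteq s$ if, respectively, $a^{\mathcal{I}} \in C^{\mathcal{I}}$, $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in r^{\mathcal{I}}$, $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ or $r^{\mathcal{I}} \subseteq s^{\mathcal{I}}$. $\mathcal{I}$ is a model of $\Sigma$, notation $\mathcal{I} \models \Sigma$, if $\mathcal{I}$ satisfies all the assertions in $\mathcal{A}$, all the GCIs in $\mathcal{T}$ and all the role inclusions in $\mathcal{R}$. Let $\alpha$ be an assertion, a GCI or a role inclusion. We say that $\Sigma$ entails $\alpha$, notation $\Sigma \models \alpha$, if all models of $\Sigma$ satisfy $\alpha$.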
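The inductive extension of the interpretation function can be illustrated on a small finite interpretation. The following is a minimal sketch, not part of the original paper: the tuple encoding of concepts, the helper names `conj`, `exists` and `extension`, and the example names `Parent`, `Doctor` and `hasChild` are all assumptions made for illustration.

```python
TOP = "TOP"  # stands for the top concept


def conj(c, d):
    """Hypothetical encoding of the conjunction C ⊓ D."""
    return ("and", c, d)


def exists(r, c):
    """Hypothetical encoding of the existential restriction ∃r.C."""
    return ("exists", r, c)


def extension(concept, domain, concept_ext, role_ext):
    """Return the set of domain elements that belong to `concept`."""
    if concept == TOP:
        return set(domain)
    if isinstance(concept, str):  # concept name A
        return set(concept_ext.get(concept, ()))
    if concept[0] == "and":  # intersection of both extensions
        return (extension(concept[1], domain, concept_ext, role_ext)
                & extension(concept[2], domain, concept_ext, role_ext))
    if concept[0] == "exists":  # elements with an r-successor in C
        targets = extension(concept[2], domain, concept_ext, role_ext)
        return {d for (d, e) in role_ext.get(concept[1], ()) if e in targets}
    raise ValueError(f"not an ELH concept: {concept!r}")


# A toy interpretation with three domain elements.
domain = {1, 2, 3}
concept_ext = {"Parent": {1}, "Doctor": {2}}
role_ext = {"hasChild": {(1, 2), (3, 3)}}

query = conj("Parent", exists("hasChild", "Doctor"))
result = extension(query, domain, concept_ext, role_ext)
```

In this toy interpretation only element 1 is a `Parent` with a `hasChild`-successor that is a `Doctor`, so `result` is the singleton set containing 1.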
3 COMPUTATION OF A MODEL
FOR ELH KB Σ AND $\mathcal{A}^*$
Let $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$ be an ELH KB. Denote by $N_\Sigma$ the set of all concept names and role names occurring in $\Sigma$ and let $S$ be a finite set of concepts over the symbol set $N_\Sigma$ (a technicality; $S$ will be used in section 4 in the context of secrecy-preserving reasoning). Let $\mathcal{C}_{\Sigma,S}$ be the set of all subconcepts of concepts that occur in either $S$ or $\Sigma$ and define the set of assertions

$$\mathcal{A}^* = \{ C(a) \mid C \in \mathcal{C}_{\Sigma,S} \text{ and } \Sigma \models C(a) \} \cup \{ r(a,b) \mid r \text{ occurs in } \Sigma \text{ and } \Sigma \models r(a,b) \}.$$

We use $O_\Sigma$ to denote the set of individual names that occur in $\Sigma$, and define a fresh set of constants, called the witness set, $\mathcal{W} = \{ w^r_C \mid r \in N_R \cap N_\Sigma \text{ and } C \in \mathcal{C}_{\Sigma,S} \}$. This witness set plays an important role in designing a rule in the ABox reasoning algorithm that splits an assertion of the form $\exists r.C(a)$ into two assertions $r(a, w^r_C)$ and $C(w^r_C)$. Further define $O^* = O_\Sigma \cup \mathcal{W}$ and $AX = \{ \top(a) \mid a \in O^* \}$.
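$\sqcap^+$-rule: if $C(a), D(a) \in \mathcal{A}^*$, $C \sqcap D \in \mathcal{C}_{\Sigma,S}$ and $C \sqcap D(a) \notin \mathcal{A}^*$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{C \sqcap D(a)\}$;
$\sqcap^-$-rule: if $C \sqcap D(a) \in \mathcal{A}^*$ and $C(a) \notin \mathcal{A}^*$ or $D(a) \notin \mathcal{A}^*$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{C(a), D(a)\}$;
$\exists^+$-rule: if $r(a,b), C(b) \in \mathcal{A}^*$, $\exists r.C \in \mathcal{C}_{\Sigma,S}$ and $\exists r.C(a) \notin \mathcal{A}^*$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{\exists r.C(a)\}$;
$\exists^-$-rule: if $\exists r.C(a) \in \mathcal{A}^*$ and $\forall b \in O^*$, $\{r(a,b), C(b)\} \nsubseteq \mathcal{A}^*$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{r(a, w^r_C), C(w^r_C)\}$, where $w^r_C \in \mathcal{W}$;
$\sqsubseteq$-rule: if $C(a) \in \mathcal{A}^*$, $C \sqsubseteq D \in \mathcal{T}$ and $D(a) \notin \mathcal{A}^*$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{D(a)\}$;
H-rule: if $r(a,b) \in \mathcal{A}^*$, $r \sqsubseteq s \in \mathcal{R}$ and $s(a,b) \notin \mathcal{A}^*$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{s(a,b)\}$.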
Figure 1: ABox Tableau expansion rules.
The tableau algorithm given in Figure 1 matches exactly the non-modalized (without modal operator) version of the ABox tableau algorithm reported in (Sivaprakasam and Slutzki, 2017). In that paper, the authors discussed in detail the proof of correctness of the modalized version of the ABox tableau algorithm. Given $\Sigma$ and $\mathcal{C}_{\Sigma,S}$, this algorithm computes $\mathcal{A}^*$. $\mathcal{A}^*$ is initialized as $\mathcal{A} \cup AX$ and is expanded by exhaustively applying the expansion rules listed in Figure 1. The worst case running time of the algorithm given in Figure 1 is $O((|\Sigma| + |\mathcal{C}_{\Sigma,S}|)^3)$; for more details see Theorem 27 in (Krötzsch, 2012).
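The exhaustive application of the rules in Figure 1 can be sketched as a naive sequential fixpoint loop. This is only an illustrative sketch under several assumptions that are not in the paper: concepts are encoded as strings or tagged tuples, assertions as `("C", concept, a)` or `("R", role, a, b)`, witnesses as tuples `("w", r, C)`, and $\top(w)$ is added lazily when a witness is first used rather than up front for all of $\mathcal{W}$. It makes no attempt to reach the cubic bound.

```python
TOP = "TOP"


def saturate(abox, tbox, rbox, closure):
    """Naively apply the Figure 1 rules until nothing new is derived.

    abox: assertions ("C", concept, a) or ("R", role, a, b)
    tbox: pairs (C, D) for C ⊑ D; rbox: pairs (r, s) for r ⊑ s
    closure: the concepts of C_{Σ,S} (targets of the ⊓⁺ and ∃⁺ rules)
    """
    individuals = {x for assertion in abox for x in assertion[2:]}
    a_star = set(abox) | {("C", TOP, a) for a in individuals}
    while True:
        c_asserts = [x for x in a_star if x[0] == "C"]
        r_asserts = [x for x in a_star if x[0] == "R"]
        new = set()
        for (_, c, a) in c_asserts:
            if isinstance(c, tuple) and c[0] == "and":       # ⊓⁻-rule
                new |= {("C", c[1], a), ("C", c[2], a)}
            if isinstance(c, tuple) and c[0] == "exists":    # ∃⁻-rule
                r, d = c[1], c[2]
                witnessed = any(("C", d, b) in a_star
                                for (_, r2, a2, b) in r_asserts
                                if r2 == r and a2 == a)
                if not witnessed:
                    w = ("w", r, d)                          # witness w^r_C
                    new |= {("R", r, a, w), ("C", d, w), ("C", TOP, w)}
            for (lhs, rhs) in tbox:                          # ⊑-rule
                if lhs == c:
                    new.add(("C", rhs, a))
        for (_, r, a, b) in r_asserts:
            for (sub, sup) in rbox:                          # H-rule
                if sub == r:
                    new.add(("R", sup, a, b))
        for target in closure:
            if not isinstance(target, tuple):
                continue
            if target[0] == "and":                           # ⊓⁺-rule
                for (_, c, a) in c_asserts:
                    if c == target[1] and ("C", target[2], a) in a_star:
                        new.add(("C", target, a))
            elif target[0] == "exists":                      # ∃⁺-rule
                for (_, r, a, b) in r_asserts:
                    if r == target[1] and ("C", target[2], b) in a_star:
                        new.add(("C", target, a))
        if new <= a_star:  # fixpoint: no rule adds anything new
            return a_star
        a_star |= new


# Toy KB: A(a), r(a,b), A ⊑ ∃s.B, B ⊑ D, r ⊑ s.
abox = {("C", "A", "a"), ("R", "r", "a", "b")}
tbox = {("A", ("exists", "s", "B")), ("B", "D")}
rbox = {("r", "s")}
closure = {"A", "B", "D", ("exists", "s", "B")}
a_star = saturate(abox, tbox, rbox, closure)
```

On this toy KB the loop derives $\exists s.B(a)$ and $s(a,b)$ in the first pass, introduces the witness for $\exists s.B(a)$ in the second, and derives $D$ for the witness in the third.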
3.1 MapReduce Paradigm
In this paper, our goal is to cast the ABox reasoning algorithm given in Figure 1 in the MapReduce framework. First let us give a brief account of the MapReduce procedure. MapReduce is a distributed computational model for processing data in parallel on clusters of machines. Typically, these machines are called ‘nodes’ (Dean and Ghemawat, 2008; Karloff et al., 2010). The data set is partitioned into several subsets, and each subset of the data set is assigned to an idle node. Each node has two major computational tasks, namely computing the map and reduce functions. Depending upon the application, a user can custom define the map and reduce functions. The abstract version of the map and reduce functions is:
Map function: The input for the map function is a (key, value) pair, and the output is a list of (key, value) pairs. The formal representation of the map function is of the form

$$\mathrm{Map} : (k, v) \rightarrow \mathrm{list}(k', v').$$

Reduce function: This function collects values by key, and then computes a value or a set of values depending upon the user definitions. The general representation of the reduce function is of the form

$$\mathrm{Reduce} : (k'', \mathrm{list}(v'')) \rightarrow \mathrm{list}(v'''),$$

where $k$, $k'$ and $k''$ are the keys, and $v$, $v'$, $v''$ and $v'''$ are the values.
Several off-the-shelf implementations are available for the MapReduce procedure. A popular one is Hadoop, see (Apache Software Foundation, 2010). An interesting aspect of the MapReduce approach is that system level issues like handling faults and distributing data to the nodes are taken care of by the underlying implementation. The programmer just needs to define the map and reduce functions, see (Karloff et al., 2010; Mutharaju et al., 2010).
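The division of labour between the map and reduce functions can be illustrated with a single-process simulation. The sketch below is not Hadoop code and does not distribute anything; the driver `map_reduce` and the word-count functions `count_map` and `count_reduce` are hypothetical names chosen for illustration, and the grouping step stands in for the shuffle that a real implementation performs across nodes.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Simulate one map-reduce cycle in a single process."""
    groups = defaultdict(list)
    for key, value in records:                       # map phase
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)        # group values by key
    results = []
    for out_key, values in groups.items():           # reduce phase
        results.extend(reduce_fn(out_key, values))
    return results


def count_map(line_no, line):
    """Map: (k, v) -> list(k', v'); the input key is ignored."""
    return [(word, 1) for word in line.split()]


def count_reduce(word, ones):
    """Reduce: (k'', list(v'')) -> list(v''')."""
    return [(word, sum(ones))]


counts = dict(map_reduce([(0, "a b a"), (1, "b c")], count_map, count_reduce))
```

Only `count_map` and `count_reduce` are specific to the task; the driver is generic, which mirrors the point that the programmer supplies just the two functions.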
3.2 MapReduce for ABox Reasoning
In this section, we explain in more detail how the completion rules given in Figure 1 can be reformulated in a format that is suitable for the MapReduce framework. As explained in the previous section, the important computational tasks in the MapReduce framework are computing the map and reduce functions. Informally, the MapReduce framework, in the context of completion rules for reasoning, works as follows: (a) based on the preconditions (ignoring the precondition which makes sure that duplicate inferences are not computed), the map function identifies a key and (b) the reduce function, based on the key, completes the application of the rule. To illustrate this idea, we choose the $\sqsubseteq$-rule given in Figure 1. The preconditions in this rule are $C(a) \in \mathcal{A}^*$ and $C \sqsubseteq D \in \mathcal{T}$. The common concept among these preconditions is $C$, and $C$ can be used as a key. So the map function, in the case of the $\sqsubseteq$-rule, computes the pairs $(C, C(a) \in \mathcal{A}^*)$ and $(C, C \sqsubseteq D \in \mathcal{T})$.
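Expansion rule, followed by its key:
$\sqcap^+_{aux}$-rule: if $C(a) \in \mathcal{A}^*$ and $C \sqcap D \in \mathcal{C}_{\Sigma,S}$,
    then $T_1 := T_1 \cup \{(C \sqcap D, C(a))\}$;
    Key: $C$
$\sqcap^+$-rule: if $D(a) \in \mathcal{A}^*$ and $(C \sqcap D, C(a)) \in T_1$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{C \sqcap D(a)\}$;
    Key: $D$
$\sqcap^-$-rule: if $C \sqcap D(a) \in \mathcal{A}^*$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{C(a), D(a)\}$;
    Key: $C \sqcap D$
$\exists^+_{aux}$-rule: if $r(a,b) \in \mathcal{A}^*$ and $\exists r.C \in \mathcal{C}_{\Sigma,S}$,
    then $T_2 := T_2 \cup \{(\exists r.C, r(a,b))\}$;
    Key: $r$
$\exists^+$-rule: if $C(b) \in \mathcal{A}^*$ and $(\exists r.C, r(a,b)) \in T_2$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{\exists r.C(a)\}$;
    Key: $C$
$\exists^-$-rule: if $\exists r.C(a) \in \mathcal{A}^*$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{r(a, w^r_C), C(w^r_C)\}$, where $w^r_C \in \mathcal{W}$;
    Key: $\exists r.C$
$\sqsubseteq$-rule: if $C(a) \in \mathcal{A}^*$ and $C \sqsubseteq D \in \mathcal{T}$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{D(a)\}$;
    Key: $C$
H-rule: if $r(a,b) \in \mathcal{A}^*$ and $r \sqsubseteq s \in \mathcal{R}$,
    then $\mathcal{A}^* := \mathcal{A}^* \cup \{s(a,b)\}$.
    Key: $r$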
Figure 2: Reformulated ABox expansion rules.
Now, the reduce function collects pairs with the same key and completes the application of the $\sqsubseteq$-rule, which means the assertion $D(a)$ is added to the set $\mathcal{A}^*$. It is straightforward to cast the $\sqcap^-$-, $\exists^-$-, $\sqsubseteq$- and H-rules directly into MapReduce format. To recast the $\sqcap^+$- and $\exists^+$-rules in MapReduce format, we use a two step method which is similar to the one used in (Mutharaju et al., 2010; Mutharaju, 2016). The $\sqcap^+$-rule displayed in Figure 1 is split into two rules, namely $\sqcap^+_{aux}$ and $\sqcap^+$, which are given in Figure 2. The $\sqcap^+_{aux}$-rule computes a set $T_1$ whose elements are not elements of the set $\mathcal{A}^*$, which is the output of the reasoning algorithm. The purpose of computing the set $T_1$ is to provide the necessary information to the reduce function about the conjuncts and the object involved, so that the reduce function can compute the conjunction of two concept assertions with the same object. Each element of the set $T_1$ is a pair whose first part contains information about the conjuncts and whose second part contains information about the object.
Similarly, the $\exists^+$-rule displayed in Figure 1 is split into two rules, namely $\exists^+_{aux}$ and $\exists^+$, which are given in Figure 2. $\exists^+_{aux}$ computes a set $T_2$ whose elements are not elements of the set $\mathcal{A}^*$. The elements of the set $T_2$ provide information to the reduce function so that it computes a concept assertion involving the ‘there exists’ constructor. Each element of the set $T_2$ is a pair whose first part contains information about the concept and whose second part contains information about the objects. The $\sqcap^+$- and $\exists^+$-rules given in Figure 2 compute the respective ABox assertions. The sets $T_1$ and $T_2$ are initialized as $\emptyset$. Note that the results of the completion rules $\sqcap^+_{aux}$ and $\exists^+_{aux}$ do not follow the syntax of the ELH language.
Each element of $T_1$ is a pair of preconditions of the $\sqcap^+_{aux}$-rule. This rule is of the form $p \wedge q \rightarrow p \wedge q$, where $p$ and $q$ are propositional variables, which is a tautology. Therefore, the $\sqcap^+_{aux}$-rule is sound. The $\sqcap^+$-rule in Figure 2 produces exactly the same result as the $\sqcap^+$-rule in Figure 1. The same argument holds for the $\exists^+_{aux}$- and $\exists^+$-rules. The application of the $\exists^-$-rule to the precondition $\exists r.C(a)$ adds the assertions $r(a, w^r_C)$ and $C(w^r_C)$ to the set $\mathcal{A}^*$ even when the assertions $r(a,b)$ and $C(b)$ are already available in $\mathcal{A}^*$ for some witness $b \in O_\Sigma$. The $\exists^-$-rule is also sound. All the other rules in Figure 2 are the same as the rules in Figure 1 except for the preconditions which guarantee no duplication in the output. The reformulated ABox reasoning algorithm given in Figure 2 terminates when none of the rules given in Figure 2 is applicable. Every new ABox assertion added to $\mathcal{A}^*$ is entailed by $\Sigma$. Since the ABox reasoning algorithm in Figure 1 is sound, complete and terminating, so is the algorithm in Figure 2. Hence the reformulated algorithm in Figure 2 is correct, and the worst case running time of this ABox algorithm is cubic in the size of the input KB and the set $\mathcal{C}_{\Sigma,S}$.
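The two step treatment of the $\sqcap^+$-rule can be sketched as two small joins, one on the key $C$ and one on the key $D$. The code below is an illustrative single-process sketch, not the authors' implementation: the function names `conj_aux` and `conj_plus`, the encoding of $C \sqcap D$ as `("and", C, D)` and of $C(a)$ as a pair `(C, a)` are all assumptions.

```python
from collections import defaultdict


def conj_aux(assertions, closure):
    """⊓⁺aux step: pair each C(a) with every C ⊓ D in the closure (key C)."""
    t1 = set()
    for (c, a) in assertions:
        for concept in closure:
            is_conj = isinstance(concept, tuple) and concept[0] == "and"
            if is_conj and concept[1] == c:
                t1.add((concept, (c, a)))  # the pair (C ⊓ D, C(a))
    return t1


def conj_plus(assertions, t1):
    """⊓⁺ step: join D(a) with (C ⊓ D, C(a)) on the key D."""
    groups = defaultdict(lambda: (set(), []))
    for (d, a) in assertions:
        groups[d][0].add(a)                  # individuals a with D(a)
    for (conj, (_, a)) in t1:
        groups[conj[2]][1].append((conj, a))  # keyed by the right conjunct D
    derived = set()
    for individuals, entries in groups.values():
        for (conj, a) in entries:
            if a in individuals:             # same individual on both sides
                derived.add((conj, a))       # the assertion C ⊓ D(a)
    return derived


assertions = {("A", "x"), ("B", "x"), ("A", "y")}
closure = {("and", "A", "B")}
t1 = conj_aux(assertions, closure)
derived = conj_plus(assertions, t1)
```

Here `t1` holds one entry for each of `A(x)` and `A(y)`, but only `x` also satisfies `B`, so the second step derives the conjunction for `x` alone. The entries of `t1` are bookkeeping pairs and never enter the ABox.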
3.3 Parallelization using MapReduce
for ABox Reasoning
In this section, we discuss how parallelization is implemented in the ABox reasoning procedure using the MapReduce approach. We consider the completion rules given in Figure 2 with their respective keys, which are already reformulated in a way that is suitable for implementing the parallelization method. Here the inputs are the KB $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$, $\mathcal{C}_{\Sigma,S}$, $T_1$ and $T_2$. The set $\mathcal{A}^*$ is initialized as $\mathcal{A}$ and is expanded using the following strategy: the completion rules are applied iteratively so that in a particular iteration, one rule is applied. In a given iteration, the elements of the sets $\mathcal{A}$, $\mathcal{T}$, $\mathcal{R}$, $\mathcal{C}_{\Sigma,S}$, $T_1$ and $T_2$ are partitioned into subsets. Each subset is distributed to a different computing node. Each node first computes the map function and then computes the reduce function. Each map-reduce computational cycle results in the parallel application of one of the completion rules. In the map computational step, based on the rule chosen for application, the elements from either the set $\mathcal{A}^*$, or from the sets $\mathcal{A}^*$ and $\mathcal{C}_{\Sigma,S}$, or from the sets $\mathcal{A}^*$ and $\mathcal{T}$, or from the sets $\mathcal{A}^*$ and $\mathcal{R}$ which satisfy any of the preconditions of the rule are selected, and the relevant (key, value) pairs are computed. Here, the key is a concept or role name common to the preconditions of the rule and the value is the precondition itself.
Algorithm 1: MapReduce algorithm for the rule $\sqcap^-$.
map(key, value)
    /* key: line number (not relevant) */
    /* value: an assertion from ABox $\mathcal{A}^*$ */
    if value == $C \sqcap D(a) \in \mathcal{A}^*$ then
        return ($C \sqcap D$, $C \sqcap D(a) \in \mathcal{A}^*$)
    end if
reduce(key, values)
    /* key: $C \sqcap D$ (concept) */
    /* values: assertions from ABox $\mathcal{A}^*$ */
    for all v in values do
        if v == $C \sqcap D(a) \in \mathcal{A}^*$ then
            return $C(a), D(a) \in \mathcal{A}^*$
        end if
    end for
In the reduce computational step, all the pairs having the same key are collected from different nodes and the results of the completion rule are obtained. During this reduce computational step, all valid combinations of values are taken into account. At the end of each iteration, all the duplicate assertions are removed from the set $\mathcal{A}^*$. Iterations (map-reduce cycles) are continued until no new assertions are added to $\mathcal{A}^*$, that is, until the set $\mathcal{A}^*$ obtained at the end of two consecutive iterations remains the same. The strategy discussed here is similar to the strategies presented in (Mutharaju et al., 2010; Mutharaju, 2016).
Map and reduce functions of the $\sqcap^-$ and $\sqsubseteq$ reformulated ABox completion rules given in Figure 2 are defined in Algorithm 1 and Algorithm 2 respectively. For the remaining rules in Figure 2, map and reduce functions can be defined in a similar way. Algorithm 1 describes the map and reduce functions for the $\sqcap^-$-rule. The input for the map function is an element of $\mathcal{A}^*$. The output of the map function is a list of relevant (key, value) pairs. A list of values with the same key is accepted by the reduce function. At this point, every possible combination of values is tested to make sure the $\sqcap^-$-rule is applicable. The output of the reduce function is a list of elements which is added to the set $\mathcal{A}^*$. Algorithm 2 describes the map and reduce functions for the $\sqsubseteq$-rule. Since this algorithm is similar to Algorithm 1, the discussion of Algorithm 2 is omitted.
Algorithm 2: MapReduce algorithm for the rule $\sqsubseteq$.
map(key, value)
    /* key: line number (not relevant) */
    /* value: an assertion from ABox $\mathcal{A}^*$ or an axiom from TBox $\mathcal{T}$ */
    if value == $C(a) \in \mathcal{A}^*$ then
        return ($C$, $C(a) \in \mathcal{A}^*$)
    else if value == $C \sqsubseteq D \in \mathcal{T}$ then
        return ($C$, $C \sqsubseteq D \in \mathcal{T}$)
    end if
reduce(key, values)
    /* key: $C$ (concept) */
    /* values: assertions from ABox $\mathcal{A}^*$ or axioms from TBox $\mathcal{T}$ */
    for all $v_1$ in values do
        for all $v_2$ in values do
            if $v_1$ == $C(a) \in \mathcal{A}^*$ and $v_2$ == $C \sqsubseteq D \in \mathcal{T}$ then
                return $D(a) \in \mathcal{A}^*$
            end if
        end for
    end for
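A runnable counterpart of Algorithm 2, together with the iterate-until-stable strategy described above, might look as follows. This is a single-process sketch under assumptions: the record encodings `("assert", C, a)` and `("gci", C, D)` and the function names `map_sub`, `reduce_sub`, `apply_sub_rule` and `saturate_sub` are hypothetical, and grouping by key in a dictionary stands in for the distributed shuffle.

```python
from collections import defaultdict


def map_sub(value):
    """Map step: emit the left-hand concept C as the key."""
    if value[0] in ("assert", "gci"):   # C(a) or C ⊑ D
        yield value[1], value


def reduce_sub(key, values):
    """Reduce step: combine every C(a) with every C ⊑ D sharing the key C."""
    individuals = [v[2] for v in values if v[0] == "assert"]
    supers = [v[2] for v in values if v[0] == "gci"]
    return {("assert", d, a) for a in individuals for d in supers}


def apply_sub_rule(abox, tbox):
    """One map-reduce cycle for the ⊑-rule; returns only new assertions."""
    groups = defaultdict(list)
    for value in list(abox) + list(tbox):
        for key, out in map_sub(value):
            groups[key].append(out)
    derived = set()
    for key, values in groups.items():
        derived |= reduce_sub(key, values)
    return derived - set(abox)          # drop duplicates of known assertions


def saturate_sub(abox, tbox):
    """Repeat cycles until two consecutive iterations give the same set."""
    current = set(abox)
    while True:
        new = apply_sub_rule(current, tbox)
        if not new:
            return current
        current |= new


abox = {("assert", "A", "a")}
tbox = {("gci", "A", "B"), ("gci", "B", "C")}
saturated = saturate_sub(abox, tbox)
```

The example needs two productive cycles: the first derives `B(a)` from `A(a)`, and the second derives `C(a)` from `B(a)`; a third cycle adds nothing and the loop stops.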
As indicated in (Mutharaju, 2016), the algorithm presented in this section can compute duplicate information. In each map-reduce computation cycle, duplicate information is removed. Removal of the duplicates from the set $\mathcal{A}^*$ incurs additional computational cost, which affects the performance of the algorithm. The performance of this algorithm can be optimized by using the distributed computational models reported in (Mutharaju, 2016; Urbani et al., 2010).
4 SECRECY-PRESERVING
REASONING
Let $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$ be an ELH KB. Also let $S \subseteq \mathcal{A}^* \setminus AX$ be the “secrecy set” to be protected from the querying agent. Given $\Sigma$ and $S$, the objective is to answer assertion queries while preserving secrecy. Our approach is to compute a set $E$, where $S \subseteq E \subseteq \mathcal{A}^* \setminus AX$, called the secrecy envelope for $S$, so that by protecting $E$ the querying agent cannot logically infer any assertion in $S$, see (Tao et al., 2015). We briefly explain the role of OWA in answering the queries and how it helps in protecting the secrets. When a query is answered with “Unknown”, the querying agent should not be able to distinguish between the case that the answer to the query is truly unknown to the KB reasoner and the case that the answer is being protected for reasons of secrecy. We envision a situation in which once the ABox $\mathcal{A}^*$ is computed, a reasoner R is associated with it. R is designed to answer queries as follows: if a query cannot be inferred from $\Sigma$, the answer is “Unknown”. If it can be inferred and it is not in $E$, the answer is “Yes”; otherwise, the answer is “Unknown”. Note that since the syntax of ELH does not include negation, an ELH KB cannot entail a negative query.
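The answering policy of the reasoner R can be sketched for the simple case where the query is an assertion over $\mathcal{C}_{\Sigma,S}$, so that entailment by $\Sigma$ coincides with membership in $\mathcal{A}^*$. This restriction, the function name `answer` and the pair encoding of assertions are assumptions of the sketch; queries outside that fragment need the recursive procedure mentioned in the introduction and are not covered here.

```python
def answer(query, a_star, envelope):
    """Answer an assertion query without revealing anything in the envelope.

    Assumes the query ranges over C_{Σ,S}, so "entailed by Σ" can be
    tested as membership in the precomputed set A*.
    """
    if query in a_star and query not in envelope:
        return "Yes"
    # Truly unknown and protected answers are deliberately indistinguishable.
    return "Unknown"


a_star = {("Patient", "ann"), ("HasFlu", "ann"), ("Patient", "bob")}
envelope = {("HasFlu", "ann")}

replies = [answer(q, a_star, envelope)
           for q in [("Patient", "ann"), ("HasFlu", "ann"), ("HasFlu", "bob")]]
```

The protected assertion `HasFlu(ann)` and the genuinely unknown `HasFlu(bob)` receive the same reply, which is the behaviour that OWA makes plausible to the querying agent.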
We make the following assumptions about the ca-
pabilities of the querying agent:
(a) It does not have direct access to the KB Σ, but is
aware of the underlying vocabulary,
(b) It does not know about the witness set $\mathcal{W}$,
(c) It can ask queries in the form of assertions, and
(d) It cannot ask queries in the form of general con-
cept or role inclusions.
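We formally define the notion of an envelope as follows:
Definition 1. Let $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$ be an ELH KB, and let $S$ be a finite secrecy set. The secrecy envelope $E$ of $S$ has the following properties:
- $S \subseteq E \subseteq \mathcal{A}^* \setminus AX$, and
- for every $\alpha \in E$, $\mathcal{A}^* \setminus E \not\models \alpha$.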
The intuition for the above definition is that no information in $E$ can be inferred from the set $\mathcal{A}^* \setminus E$. To compute an envelope, we use the idea of inverting the rules of Figure 1 as given in (Tao et al., 2010; Tao et al., 2015). Induced by the ABox expansion rules in Figure 1, we define the corresponding “inverted” ABox expansion rules in Figure 3. These inverted expansion rules are denoted by prefixing Inv- to the name of the corresponding expansion rules. Note that the $\exists^-$-rule does not have a corresponding inverted rule. The reason for the “omission” is that an application of this rule results in adding assertions with individual names from the witness set, which the querying agent is barred from asking about.
The envelope $E$ is computed by initializing it to $S$ and then expanding it using the inverted expansion rules listed in Figure 3 until no further applications are possible. We denote by $\Lambda_S$ the algorithm which computes the set $E$. Due to non-determinism in applying the rules Inv-$\sqcap^+$ and Inv-$\exists^+$, different executions of $\Lambda_S$ may output different envelopes. Since $\mathcal{A}^*$ is finite, the computation of $\Lambda_S$ terminates. Let $E$ be an output of $\Lambda_S$. Since the size of $\mathcal{A}^*$ is cubic polynomial in $|\Sigma| + |\mathcal{C}_{\Sigma,S}|$, and each application of an inverted expansion rule moves some assertions from $\mathcal{A}^*$ into $E$, the size of $E$ is at most the size of $\mathcal{A}^*$. Therefore, to compute the envelope $E$, $\Lambda_S$ takes $O((|\Sigma| + |\mathcal{C}_{\Sigma,S}|)^3)$ time. The proof of correctness of $\Lambda_S$ is omitted.
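Inv-$\sqcap^+$-rule: if $C \sqcap D(a) \in E$, $C \sqcap D \in \mathcal{C}_{\Sigma,S}$ and $\{C(a), D(a)\} \subseteq \mathcal{A}^* \setminus E$,
    then $E := E \cup \{C(a)\}$ or $E := E \cup \{D(a)\}$;
Inv-$\sqcap^-$-rule: if $\{C(a), D(a)\} \cap E \neq \emptyset$ and $C \sqcap D(a) \in \mathcal{A}^* \setminus E$,
    then $E := E \cup \{C \sqcap D(a)\}$;
Inv-$\exists^+$-rule: if $\exists r.C(a) \in E$, $\exists r.C \in \mathcal{C}_{\Sigma,S}$ and $\{r(a,b), C(b)\} \subseteq \mathcal{A}^* \setminus E$ with $b \in O^*$,
    then $E := E \cup \{r(a,b)\}$ or $E := E \cup \{C(b)\}$;
Inv-$\sqsubseteq$-rule: if $D(a) \in E$, $C \sqsubseteq D \in \mathcal{T}$ and $C(a) \in \mathcal{A}^* \setminus E$,
    then $E := E \cup \{C(a)\}$;
Inv-H-rule: if $s(a,b) \in E$, $r \sqsubseteq s \in \mathcal{R}$ and $r(a,b) \in \mathcal{A}^* \setminus E$,
    then $E := E \cup \{r(a,b)\}$.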
Figure 3: Inverted Tableau expansion rules.
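A sequential version of the envelope computation with the rules of Figure 3 can be sketched as follows. Several points are assumptions of this sketch rather than statements from the paper: assertions are encoded as `("C", concept, a)` or `("R", role, a, b)`, the non-deterministic rules always pick the first admissible assertion, every concept in $\mathcal{A}^*$ is taken to lie in $\mathcal{C}_{\Sigma,S}$, and $\top$ is assumed not to occur so that no assertion of $AX$ can enter the envelope.

```python
def next_addition(a_star, envelope, tbox, rbox):
    """Return one assertion that an inverted rule moves into E, or None."""
    visible = a_star - envelope
    for x in envelope:
        if x[0] == "C":
            _, c, a = x
            if isinstance(c, tuple) and c[0] == "and":        # Inv-⊓⁺
                parts = [("C", c[1], a), ("C", c[2], a)]
                if all(p in visible for p in parts):
                    return parts[0]                           # arbitrary choice
            if isinstance(c, tuple) and c[0] == "exists":     # Inv-∃⁺
                for y in visible:
                    if (y[0] == "R" and y[1] == c[1] and y[2] == a
                            and ("C", c[2], y[3]) in visible):
                        return y                              # arbitrary choice
            for (lhs, rhs) in tbox:                           # Inv-⊑
                if rhs == c and ("C", lhs, a) in visible:
                    return ("C", lhs, a)
        else:
            _, s, a, b = x
            for (sub, sup) in rbox:                           # Inv-H
                if sup == s and ("R", sub, a, b) in visible:
                    return ("R", sub, a, b)
    for y in visible:                                         # Inv-⊓⁻
        if y[0] == "C" and isinstance(y[1], tuple) and y[1][0] == "and":
            left, right = ("C", y[1][1], y[2]), ("C", y[1][2], y[2])
            if left in envelope or right in envelope:
                return y
    return None


def compute_envelope(a_star, secrets, tbox, rbox):
    """Start from the secrecy set and expand until no rule applies."""
    envelope = set(secrets)
    while True:
        addition = next_addition(a_star, envelope, tbox, rbox)
        if addition is None:
            return envelope
        envelope.add(addition)  # always drawn from A* \ E, so the loop ends


a_star = {("C", "A", "a"), ("C", "B", "a"), ("C", ("and", "A", "B"), "a"),
          ("R", "r", "a", "b"), ("R", "s", "a", "b")}
secrets = {("C", "B", "a"), ("R", "s", "a", "b")}
envelope = compute_envelope(a_star, secrets, {("A", "B")}, {("r", "s")})
```

In this example, protecting `B(a)` forces `A(a)` into the envelope because of the axiom `A ⊑ B`, protecting `s(a,b)` forces `r(a,b)` because of `r ⊑ s`, and the conjunction `A ⊓ B(a)` follows because one of its conjuncts is protected.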
4.1 MapReduce for Computing an
Envelope E
An important step in computing a secrecy envelope for a given secrecy set using the MapReduce framework is to cast the inverted ABox tableau expansion rules given in Figure 3 in a format that is suitable for the MapReduce procedure. Since a detailed discussion of the various computational tasks in the MapReduce framework in the context of completion rules for reasoning is given in section 3.2, we present only a short explanation of the modified inverted rules given in Figure 4. To recast each rule in Figure 3, except for the rule Inv-$\sqcap^-$, in MapReduce format, we use a two step method as explained in detail in section 3.2. The Inv-$\sqcap^+$-rule displayed in Figure 3 is split into two rules, namely Inv-$\sqcap^+_{aux}$ and Inv-$\sqcap^+$, given in Figure 4. The Inv-$\sqcap^+_{aux}$-rule computes a set $T_3$ whose elements are not elements of the set $E$. The purpose of computing the set $T_3$ is to provide the necessary information to the reduce function of the rule Inv-$\sqcap^+$. Using this information, the reduce function computes an assertion that goes into $E$ corresponding to a secret involving the conjunction constructor. Otherwise, that secret may be revealed.
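Similarly, each of the inverted expansion rules Inv-$\exists^+$, Inv-$\sqsubseteq$ and Inv-H displayed in Figure 3 is split into a pair of rules. The first rule in each of these pairs is an auxiliary rule, Inv-$\exists^+_{aux}$, Inv-$\sqsubseteq_{aux}$ and Inv-H$_{aux}$, which computes respectively a set $T_4$, $T_5$ and $T_6$. As explained in the previous paragraph, each of these sets provides information to the reduce function of the second rule in the respective pair. The Inv-$\sqcap^+$-, Inv-$\exists^+$-, Inv-$\sqsubseteq$- and Inv-H-rules given in Figure 4 compute assertions that belong to the set $E$. The sets $T_3$, $T_4$, $T_5$ and $T_6$ are initialized as $\emptyset$. Note that the results of the completion rules Inv-$\sqcap^+_{aux}$, Inv-$\exists^+_{aux}$, Inv-$\sqsubseteq_{aux}$ and Inv-H$_{aux}$ do not follow the syntax of the ELH language.

Inverted expansion rule, followed by its key:
Inv-$\sqcap^+_{aux}$: if $C(a) \in \mathcal{A}^* \setminus E$ and $C \sqcap D(a) \in E$,
    then $T_3 := T_3 \cup \{(C \sqcap D(a), C(a))\}$;
    Key: $C$
Inv-$\sqcap^+$: if $D(a) \in \mathcal{A}^* \setminus E$ and $(C \sqcap D(a), C(a)) \in T_3$,
    then $E := E \cup \{C(a)\}$ or $E := E \cup \{D(a)\}$;
    Key: $D$
Inv-$\sqcap^-_L$: if $C \sqcap D(a) \in \mathcal{A}^* \setminus E$ and $C(a) \in E$,
    then $E := E \cup \{C \sqcap D(a)\}$;
    Key: $C$
Inv-$\sqcap^-_R$: if $C \sqcap D(a) \in \mathcal{A}^* \setminus E$ and $D(a) \in E$,
    then $E := E \cup \{C \sqcap D(a)\}$;
    Key: $D$
Inv-$\exists^+_{aux}$: if $r(a,b) \in \mathcal{A}^* \setminus E$ and $\exists r.C(a) \in E$,
    then $T_4 := T_4 \cup \{(\exists r.C(a), r(a,b))\}$;
    Key: $r$
Inv-$\exists^+$: if $C(b) \in \mathcal{A}^* \setminus E$ and $(\exists r.C(a), r(a,b)) \in T_4$,
    then $E := E \cup \{r(a,b)\}$ or $E := E \cup \{C(b)\}$;
    Key: $C$
Inv-$\sqsubseteq_{aux}$: if $C(a) \in \mathcal{A}^* \setminus E$ and $C \sqsubseteq D \in \mathcal{T}$,
    then $T_5 := T_5 \cup \{(C \sqsubseteq D, C(a))\}$;
    Key: $C$
Inv-$\sqsubseteq$: if $D(a) \in E$ and $(C \sqsubseteq D, C(a)) \in T_5$,
    then $E := E \cup \{C(a)\}$;
    Key: $D$
Inv-H$_{aux}$: if $r(a,b) \in \mathcal{A}^* \setminus E$ and $r \sqsubseteq s \in \mathcal{R}$,
    then $T_6 := T_6 \cup \{(r \sqsubseteq s, r(a,b))\}$;
    Key: $r$
Inv-H: if $s(a,b) \in E$ and $(r \sqsubseteq s, r(a,b)) \in T_6$,
    then $E := E \cup \{r(a,b)\}$.
    Key: $s$
Figure 4: Reformulated Inverted Tableau expansion rules.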
The intuition behind the rule Inv-$\sqcap^-$ given in Figure 3 is that if either $C(a)$ or $D(a)$ is in $E$, then $C \sqcap D(a)$ should be in $E$. That is, to protect the secret information $C(a)$ or $D(a)$, we need to add $C \sqcap D(a)$ to the set $E$. To cast this rule in MapReduce format, we split it into two rules, Inv-$\sqcap^-_L$ and Inv-$\sqcap^-_R$. Each rule computes the same assertion involving the conjunction constructor. Since the redundant information is removed at the end of each map-reduce cycle, the final output $E$ of the tableau algorithm given in Figure 4 is free from duplicate assertions.
As discussed in section 3.2, all the auxiliary rules in Figure 4 are of the form $p \wedge q \rightarrow p \wedge q$, where $p$ and $q$ are propositional variables, which is a tautology. Therefore, all the auxiliary rules are sound. The remaining rules in Figure 4 produce exactly the same result as the rules in Figure 3. Since the tableau algorithm given in Figure 3 is correct, the reformulated inverted tableau algorithm in Figure 4 is correct as well.
4.2 Parallelization using MapReduce
for Envelope Computation
This section gives a brief account of the design and implementation of the map and reduce functions for each of the reformulated inverted completion rules given in Figure 4; for more details see section 3.3. We consider the inverted expansion rules with their respective keys, which are already reformulated in a way that is suitable for implementing the parallelization method. The inputs are the KB $\Sigma = \langle \mathcal{A}^*, \mathcal{T}, \mathcal{R} \rangle$, $\mathcal{C}_{\Sigma,S}$, $S$, $T_3$, $T_4$, $T_5$ and $T_6$. The set $E$ is initialized as $S$ and is expanded using the following strategy. The completion rules are applied iteratively so that in a particular iteration, one rule is applied. In a given iteration, the elements of the sets $\mathcal{A}^*$, $\mathcal{T}$, $\mathcal{R}$, $\mathcal{C}_{\Sigma,S}$, $S$, $T_3$, $T_4$, $T_5$ and $T_6$ are partitioned into subsets. Each subset is distributed to a different computing node. Each node first computes the map function and then computes the reduce function. Each map-reduce computational cycle results in the parallel application of one of the rules. At the end of each map-reduce cycle, all the duplicate assertions are removed from the set $E$. Map-reduce cycles are continued until no new assertions are added to $E$, that is, until the set $E$ obtained at the end of two consecutive iterations remains the same.
Algorithm 3: MapReduce algorithm for Inv-$\sqcap^-_L$.
map(key, value)
    /* key: line number (not relevant) */
    /* value: an assertion from ABox $\mathcal{A}^* \setminus E$ or an assertion from $E$ */
    if value == $C \sqcap D(a) \in \mathcal{A}^* \setminus E$ then
        return ($C$, $C \sqcap D(a) \in \mathcal{A}^* \setminus E$)
    else if value == $C(a) \in E$ then
        return ($C$, $C(a)$)
    end if
reduce(key, values)
    /* key: $C$ (concept) */
    /* values: assertions from ABox $\mathcal{A}^* \setminus E$ or assertions from the set $E$ */
    for all $v_1$ in values do
        for all $v_2$ in values do
            if $v_1$ == $C \sqcap D(a) \in \mathcal{A}^* \setminus E$ and $v_2$ == $C(a)$ then
                return $C \sqcap D(a) \in E$
            end if
        end for
    end for
Algorithm 4: MapReduce algorithm for Inv-$\sqsubseteq_{aux}$.
map(key, value)
    /* key: line number (not relevant) */
    /* value: an assertion from ABox $\mathcal{A}^* \setminus E$ or an axiom from TBox $\mathcal{T}$ */
    if value == $C(a) \in \mathcal{A}^* \setminus E$ then
        return ($C$, $C(a) \in \mathcal{A}^* \setminus E$)
    else if value == $C \sqsubseteq D \in \mathcal{T}$ then
        return ($C$, $C \sqsubseteq D$)
    end if
reduce(key, values)
    /* key: $C$ (concept) */
    /* values: assertions from ABox $\mathcal{A}^*$ or axioms from TBox $\mathcal{T}$ */
    for all $v_1$ in values do
        for all $v_2$ in values do
            if $v_1$ == $C(a) \in \mathcal{A}^* \setminus E$ and $v_2$ == $C \sqsubseteq D$ then
                return $(C \sqsubseteq D, C(a)) \in T_5$
            end if
        end for
    end for
Algorithm 5: MapReduce algorithm for Inv-⊑.
map(key, value)
  /* key: line number (not relevant) */
  /* value: an assertion in E or an element in the set T_5 */
  if value == D(a) ∈ E then
    return (D, D(a) ∈ E)
  else if value == (C ⊑ D, C(a)) ∈ T_5 then
    return (D, (C ⊑ D, C(a)))
  end if
reduce(key, values)
  /* key: D (concept) */
  /* values: assertions from the set E or elements from the set T_5 */
  for all v_1 in values do
    for all v_2 in values do
      if v_1 == D(a) ∈ E and v_2 == (C ⊑ D, C(a)) then
        return C(a) ∈ E
      end if
    end for
  end for
Map and reduce functions for the reformulated inverted completion rules Inv-⊓_L, Inv-⊑_aux and Inv-⊑ given in Figure 4 are defined in Algorithms 3 through 5. For the remaining rules in Figure 4, map and reduce functions can be defined in a similar way. Algorithm 3 describes the map and reduce functions for the Inv-⊓_L rule. The input for the map function is an element of A* \ E or an element of E. The output of the map function is a list of relevant (key, value) pairs. The reduce function accepts the list of values that share the same key. At this point, every possible combination of values is tested to check whether the Inv-⊓_L rule is applicable. The output of the reduce function is a list of elements that are added to the set E. The map and reduce functions of the remaining algorithms are similar to those of Algorithm 3 and can be explained in the same way. As discussed in Section 3.3, the procedure that computes the envelope can output duplicate information. At the end of each map-reduce computation cycle, duplicate assertions are removed. Removing the duplicates from the set E incurs additional computational cost, which affects the performance of this procedure. The performance can be improved by using the distributed computational models reported in (Mutharaju, 2016; Urbani et al., 2010).
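A minimal Python sketch of Algorithm 5 is given here for illustration. The encoding is an assumption of this sketch rather than the paper's: ("envelope", D, a) stands for D(a) ∈ E, and ("t5", (C, D), (C, a)) stands for the pair (C ⊑ D, C(a)) ∈ T_5. The reduce step emits every C(a) whose individual is protected under the key D, and collecting the output in a set removes duplicates within a cycle.

```python
from collections import defaultdict


def map_inv_sub(record):
    """Key D(a) from E and (C ⊑ D, C(a)) from T_5 on the superconcept D."""
    if record[0] == "envelope":
        yield record[1], record
    elif record[0] == "t5":
        (_, sup) = record[1]
        yield sup, record


def reduce_inv_sub(key, values):
    """Return every C(a) that must enter E because D(a) is in E and C ⊑ D."""
    protected = {r[2] for r in values if r[0] == "envelope"}
    return {r[2] for r in values if r[0] == "t5" and r[2][1] in protected}


def apply_rule(records, map_fn, reduce_fn):
    """Hypothetical in-memory stand-in for the shuffle phase of one cycle."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    result = set()  # a set, so duplicate assertions collapse automatically
    for key, values in groups.items():
        result |= reduce_fn(key, values)
    return result


records = [
    ("envelope", "D", "x"),          # D(x) in E
    ("t5", ("C", "D"), ("C", "x")),  # (C ⊑ D, C(x)) in T_5
    ("t5", ("C", "D"), ("C", "y")),  # (C ⊑ D, C(y)) in T_5, D(y) not in E
    ("t5", ("B", "D"), ("B", "x")),  # (B ⊑ D, B(x)) in T_5
]
new_in_envelope = apply_rule(records, map_inv_sub, reduce_inv_sub)
```

In this example C(x) and B(x) are added to E, while C(y) stays visible because D(y) is not protected.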
5 SUMMARY
In this paper we have studied the problem of secrecy-preserving reasoning in ELH KBs using the MapReduce framework. Let Σ = ⟨A, T, R⟩ be an ELH KB and assume that the size of the set A is very large. Our main contribution in this paper is the use of the MapReduce procedure within reasoning algorithms to compute a finite set of assertional consequences A* and an envelope E, which is a superset of the secrecy set S. To the best of our knowledge, secrecy-preserving
reasoning with the MapReduce framework has not been studied before. For the query answering part, we assume that the sets A* and E are precomputed. The set A* \ E is free from all secret information, and no secret information can be inferred from it. The queries from the querying agent are answered based on the information available in the set A* \ E. Note that A* \ E is finite and does not contain answers to all non-confidential queries. A recursive query answering procedure is used to answer the remaining queries. For this purpose, we adopt the recursive query answering procedure reported in (Sivaprakasam and Slutzki, 2017) with the necessary changes. Our main emphasis in this paper is on how to use the MapReduce framework within the reasoning procedures to study the secrecy-preserving reasoning problem. The implementation of this query answering procedure in Hadoop (Apache Software Foundation, 2010) will be considered in our future work. Further, we will implement the SPQA framework for a single querying agent in Protege (Protege - Stanford University, 1999), a knowledge representation and reasoning tool for ELH KBs. To study the performance of the MapReduce-based implementation of the SPQA framework for a single querying agent, we will conduct experiments on the very large ontologies SNOMED CT and GALEN (Dentler et al., 2011) in both Protege and Hadoop and compare their performance.
REFERENCES
Apache Software Foundation (2010). http://hadoop.apache.org.
Bellomarini, L., Gottlob, G., Pieris, A., and Sallinger, E. (2018). Swift logic for big data and knowledge graphs. In International Conference on Current Trends in Theory and Practice of Informatics, pages 3–16. Springer.
Dean, J. and Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113.
Dentler, K., Cornet, R., Ten Teije, A., and De Keizer, N. (2011). Comparison of reasoners for large ontologies in the OWL 2 EL profile. Semantic Web, 2(2):71–87.
Hitzler, P. and Janowicz, K. (2013). Linked data, big data, and the 4th paradigm. Semantic Web, 4(3):233–235.
Karloff, H., Suri, S., and Vassilvitskii, S. (2010). A model of computation for MapReduce. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, pages 938–948. SIAM.
Konstantopoulos, S., Charalambidis, A., Mouchakis, G., Troumpoukis, A., Jakobitsch, J., and Karkaletsis, V. (2016). Semantic web technologies and big data infrastructures: SPARQL federated querying of heterogeneous big data stores. In International Semantic Web Conference (Posters & Demos).
Krötzsch, M. (2012). OWL 2 profiles: An introduction to lightweight ontology languages. In Reasoning Web International Summer School, pages 112–183. Springer.
Möller, R., Neuenstadt, C., Özçep, Ö. L., and Wandelt, S. (2013). Advances in accessing big data with expressive ontologies. In Annual Conference on Artificial Intelligence, pages 118–129. Springer.
Mutharaju, R. (2016). Distributed rule-based ontology reasoning. PhD thesis, Wright State University.
Mutharaju, R., Maier, F., and Hitzler, P. (2010). A MapReduce algorithm for EL+. In 23rd International Workshop on Description Logics DL2010, volume 456.
Protege - Stanford University (1999). http://protege.stanford.edu.
Reinsel, D., Gantz, J., and Rydning, J. (2017). Data age 2025.
Sivaprakasam, G. K. and Slutzki, G. (2016). Secrecy-preserving query answering in ELH knowledge bases. In Proceedings of the 8th International Conference on Agents and Artificial Intelligence (ICAART 2016), Volume 2, Rome, Italy, February 24-26, pages 149–159.
Sivaprakasam, G. K. and Slutzki, G. (2017). Keeping secrets in modalized DL knowledge bases. In Proceedings of the 9th International Conference on Agents and Artificial Intelligence, ICAART 2017, Volume 2, Porto, Portugal, February 24-26, pages 591–598.
Tao, J., Slutzki, G., and Honavar, V. (2010). Secrecy-preserving query answering for instance checking in EL. In International Conference on Web Reasoning and Rule Systems, pages 195–203. Springer.
Tao, J., Slutzki, G., and Honavar, V. (2015). A conceptual framework for secrecy-preserving reasoning in knowledge bases. ACM Transactions on Computational Logic (TOCL), 16(1):3.
Urbani, J., Kotoulas, S., Maassen, J., Van Harmelen, F., and Bal, H. (2010). OWL reasoning with WebPIE: calculating the closure of 100 billion triples. In Extended Semantic Web Conference, pages 213–227. Springer.
Zhou, Y., Cuenca Grau, B., Horrocks, I., Wu, Z., and Banerjee, J. (2013). Making the most of your triple store: query answering in OWL 2 using an RL reasoner. In Proceedings of the 22nd international conference on World Wide Web, pages 1569–1580. ACM.