Secrecy-preserving Reasoning in ELH Knowledge Bases using MapReduce Algorithm

Gopalakrishnan Krishnasamy Sivaprakasam

and Giora Slutzki

Department of Mathematics & Computer Science, Central State University, Wilberforce, Ohio, U.S.A.

Department of Computer Science, Iowa State University, Ames, Iowa, U.S.A.

Keywords:

Secrecy Preserving Reasoning, Knowledge Bases, MapReduce Algorithm.

Abstract:

In this paper, we use the MapReduce algorithm to study the problem of secrecy-preserving reasoning in very large ELH knowledge bases. A tableau algorithm for ABox reasoning is designed in a way that is suitable for the MapReduce framework and contains a small set of reasoning rules. To implement the parallelization method, we design map and reduce functions for each of these ABox reasoning rules. The output of this computational procedure is a finite set $\mathcal{A}^*$ which contains the assertional consequences of the given knowledge base. Given a finite secrecy set S, we compute a set E, called an envelope of S, which provides logical protection to the secrecy set S against the reasoning of a querying agent. To compute E, a tableau algorithm is designed by inverting the ABox reasoning rules in a way that is suitable for the MapReduce framework. Further, to implement the parallelization method, we design map and reduce functions for each of these inverted rules.

1 INTRODUCTION

In the last three decades, beginning with the internet

era, there has been a dramatic increase in the volume

of mobile devices and internet users whose daily web

activities include online banking, social networking, web-based travel services and other internet-based business applications. These activities contribute to the generation of an unprecedented amount of web data. In (Reinsel et al., 2017), the authors reported that the estimated amount of web data would be around 17 zettabytes and forecast that the volume of web data would grow up to 160 zettabytes by 2025. Web data contains a massive amount of private information about users, administrators, service

providers, governmental agencies and military estab-

lishments which must be maintained and protected.

To deal with this emerging challenge it is imperative

to design and develop robust Bigdata technology tools

to secure the conﬁdential information in ways that do

not deter the main web objective of sharing the non-

conﬁdential information among users.

Logic-based languages such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL) have been developed exclusively for creating knowledge bases (ontologies) and for designing efficient procedures for reasoning over and answering queries on these web databases. RDF and OWL are the languages for Semantic Web technologies recommended by the World Wide Web Consortium (W3C). Recently, several software tools such as Bigdata, LargeTripleStores and Blazegraph have been developed specifically for Bigdata applications, and these tools support the RDF and OWL formats. In addition, a number of research works have been reported on Big data applications in the Semantic Web using RDF and OWL approaches, see (Konstantopoulos et al., 2016; Zhou et al., 2013; Hitzler and Janowicz, 2013).

Description Logic (DL) is a decidable fragment of First Order Logic (FOL). In Semantic Web research, DL languages play a central role in building Knowledge Bases (KBs) for social, biological and medical applications, and in reasoning and answering queries over KBs. Further, DL languages are considered to be the underlying logics of the OWL languages (Krötzsch, 2012), which are recommended by the W3C as knowledge representation and reasoning languages for the web. In recent years, there has been widespread interest in Bigdata research. In particular, the DL research community has shown an increasing interest in studying the problem of reasoning in DL KBs with Bigdata applications, see (Möller et al., 2013; Mutharaju et al., 2010; Bellomarini et al., 2018; Urbani et al., 2010).


Sivaprakasam, G. and Slutzki, G.

Secrecy-preserving Reasoning in ELH Knowledge Bases using MapReduce Algorithm.

DOI: 10.5220/0008986007080716

In Proceedings of the 12th International Conference on Agents and Artiﬁcial Intelligence (ICAART 2020) - Volume 2, pages 708-716

ISBN: 978-989-758-395-7; ISSN: 2184-433X

Tao et al. (Tao et al., 2015) have developed a conceptual framework to study Secrecy-Preserving reasoning and Query Answering (SPQA) in Description Logic (DL) Knowledge Bases (KBs) under the Open World Assumption (OWA). The approach uses the

notion of an envelope to hide secret information

against logical inference and it was ﬁrst deﬁned and

used in (Tao et al., 2010). The idea behind the enve-

lope concept is that no assertion in the envelope can

be logically deduced from information outside the en-

velope. This approach is based on the assumption that

the information contained in a KB is incomplete (by

OWA). Specifically, in (Tao et al., 2010; Tao et al., 2015; Sivaprakasam and Slutzki, 2016; Sivaprakasam and Slutzki, 2017) the main idea was to utilize the secret information within the reasoning process, but then to answer “Unknown” whenever the answer is truly unknown or whenever the true answer could compromise confidentiality or secrecy.

In this paper, we study the SPQA problem on very large scale ELH KBs, see (Krötzsch, 2012), using the MapReduce approach. MapReduce is a distributed programming paradigm for processing very large data sets (Bigdata) over a cluster of machines (Dean and Ghemawat, 2008; Karloff et al., 2010). The central idea behind this procedure is to define map and reduce functions, and to compute the reduce function in a parallel manner. It is a very popular computational model used in Bigdata applications; for more details see Section 3.1.

A first step in designing an SPQA system is to compute a restricted assertional inference closure $\mathcal{A}^*$ using the MapReduce procedure. For this purpose, we make use of the non-modalized version of a tableau algorithm whose proof of correctness is reported in (Sivaprakasam and Slutzki, 2017). For each ABox expansion rule in this tableau algorithm, depending on the keys, we define map and reduce functions. Each map-reduce computational cycle results in a parallel application of one of the expansion rules.

After computing $\mathcal{A}^*$, the next step is to compute an envelope E to protect the secret information given in the provided secrecy set S. This envelope is computed by another tableau algorithm based on the idea of inverting the ABox expansion rules given in the first tableau algorithm. For each of these inverted expansion rules, customized map and reduce functions are defined. Once such an envelope is computed, the answers to the queries are censored whenever the queries depend upon the set E. Since the set $\mathcal{A}^* \setminus E$ does not contain all the statements entailed by Σ, we use a recursive query answering procedure similar to the procedure reported in (Sivaprakasam and Slutzki, 2016; Sivaprakasam and Slutzki, 2017) to answer more complicated queries.

2 SYNTAX AND SEMANTICS

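A vocabulary of ELH is a triple $\langle N_O, N_C, N_R \rangle$ of countably infinite, pairwise disjoint sets. The elements of $N_O$ are called object (or individual) names, the elements of $N_C$ are called concept names and the elements of $N_R$ are called role names. The set of ELH concepts is denoted by $\mathcal{C}$ and is defined by the following rules

$$C ::= A \mid \top \mid C \sqcap D \mid \exists r.C$$

where $A \in N_C$, $r \in N_R$, $\top$ denotes the “top concept”, and $C, D \in \mathcal{C}$. Assertions are expressions of the form $C(a)$ or $r(a,b)$, general concept inclusions (GCIs) are expressions of the form $C \sqsubseteq D$ and role inclusions are expressions of the form $r \sqsubseteq s$, where $C, D \in \mathcal{C}$, $r, s \in N_R$ and $a, b \in N_O$. The semantics of ELH concepts is specified, as usual, by an interpretation $\mathcal{I} = \langle \Delta, \cdot^{\mathcal{I}} \rangle$ where $\Delta$ is the domain of the interpretation, and $\cdot^{\mathcal{I}}$ is an interpretation function mapping each $a \in N_O$ to an element $a^{\mathcal{I}} \in \Delta$, each $A \in N_C$ to a subset $A^{\mathcal{I}} \subseteq \Delta$, and each $r \in N_R$ to a binary relation $r^{\mathcal{I}} \subseteq \Delta \times \Delta$. The interpretation function $\cdot^{\mathcal{I}}$ is extended inductively to all ELH concepts in the usual manner:

$$\top^{\mathcal{I}} = \Delta; \quad (C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}; \quad (\exists r.C)^{\mathcal{I}} = \{ d \in \Delta \mid \exists e \in C^{\mathcal{I}} : (d,e) \in r^{\mathcal{I}} \}.$$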

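An ABox $\mathcal{A}$ is a finite, non-empty set of assertions. A TBox $\mathcal{T}$ is a finite set of GCIs and an RBox $\mathcal{R}$ is a finite set of role inclusions. An ELH KB is a triple $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$ where $\mathcal{A}$ is an ABox, $\mathcal{T}$ is a TBox and $\mathcal{R}$ is an RBox. Let $\mathcal{I} = \langle \Delta, \cdot^{\mathcal{I}} \rangle$ be an interpretation, $C, D \in \mathcal{C}$, $r, s \in N_R$ and $a, b \in N_O$. We say that $\mathcal{I}$ satisfies $C(a)$, $r(a,b)$, $C \sqsubseteq D$ or $r \sqsubseteq s$, notation $\mathcal{I} \models C(a)$, $\mathcal{I} \models r(a,b)$, $\mathcal{I} \models C \sqsubseteq D$ or $\mathcal{I} \models r \sqsubseteq s$ if, respectively, $a^{\mathcal{I}} \in C^{\mathcal{I}}$, $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in r^{\mathcal{I}}$, $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ or $r^{\mathcal{I}} \subseteq s^{\mathcal{I}}$. $\mathcal{I}$ is a model of $\Sigma$, notation $\mathcal{I} \models \Sigma$, if $\mathcal{I}$ satisfies all the assertions in $\mathcal{A}$, all the GCIs in $\mathcal{T}$ and all the role inclusions in $\mathcal{R}$. Let $\alpha$ be an assertion, a GCI or a role inclusion. We say that $\Sigma$ entails $\alpha$, notation $\Sigma \models \alpha$, if all models of $\Sigma$ satisfy $\alpha$.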
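The inductive extension of the interpretation function can be illustrated with a small evaluator over a finite interpretation. This is a minimal sketch, not part of the paper: the tuple encoding of concepts (`"TOP"`, `("and", C, D)`, `("some", r, C)`), the function names and the example interpretation are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding (not from the paper): a concept is "TOP", an atomic
# concept name (str), ("and", C, D) for C ⊓ D, or ("some", r, C) for ∃r.C.
Concept = Union[str, tuple]
TOP = "TOP"


def extension(concept: Concept,
              domain: Set[str],
              concept_ext: Dict[str, Set[str]],
              role_ext: Dict[str, Set[Tuple[str, str]]]) -> Set[str]:
    """Return C^I, the set of domain elements that belong to `concept`."""
    if concept == TOP:
        return set(domain)
    if isinstance(concept, str):
        return set(concept_ext.get(concept, set()))
    if concept[0] == "and":
        return (extension(concept[1], domain, concept_ext, role_ext)
                & extension(concept[2], domain, concept_ext, role_ext))
    if concept[0] == "some":
        _, role, filler = concept
        filler_ext = extension(filler, domain, concept_ext, role_ext)
        return {d for (d, e) in role_ext.get(role, set()) if e in filler_ext}
    raise ValueError(f"not an ELH concept: {concept!r}")


def satisfies_gci(sub: Concept, sup: Concept, domain, concept_ext, role_ext) -> bool:
    """I satisfies C ⊑ D iff C^I is a subset of D^I."""
    return (extension(sub, domain, concept_ext, role_ext)
            <= extension(sup, domain, concept_ext, role_ext))


# A made-up finite interpretation used only as an example.
domain = {"d1", "d2", "d3"}
concept_ext = {"Person": {"d1", "d2"}, "Doctor": {"d2"}}
role_ext = {"treats": {("d2", "d1"), ("d3", "d3")}}

treats_person = extension(("some", "treats", "Person"), domain, concept_ext, role_ext)
doctor_and_treats = extension(("and", "Doctor", ("some", "treats", "Person")),
                              domain, concept_ext, role_ext)
```

Here `treats_person` contains only `d2`, because `d3` is related by `treats` only to itself and `d3` is not in the extension of `Person`.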

3 COMPUTATION OF A MODEL FOR ELH KB Σ AND $\mathcal{A}^*$

Let $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$ be an ELH KB. Denote by $N_\Sigma$ the set of all concept names and role names occurring in $\Sigma$ and let $S$ be a finite set of concepts over the symbol set $N_\Sigma$ (a technicality; $S$ will be used in Section 4 in the context of secrecy-preserving reasoning). Let $\mathcal{C}_{\Sigma,S}$ be the set of all subconcepts of concepts that occur in either $S$ or $\Sigma$ and define the set of assertions

$$\mathcal{A}^* = \{ C(a) \mid C \in \mathcal{C}_{\Sigma,S} \text{ and } \Sigma \models C(a) \} \cup \{ r(a,b) \mid r \text{ occurs in } \Sigma \text{ and } \Sigma \models r(a,b) \}.$$

We use $O_\Sigma$ to denote the set of individual names that occur in $\Sigma$, and define a fresh set of constants, called the witness set, $W = \{ w_r^C \mid r \in N_\Sigma \cap N_R \text{ and } C \in \mathcal{C}_{\Sigma,S} \}$. This witness set plays an important role in designing a rule in the ABox reasoning algorithm that splits an assertion of the form $\exists r.C(a)$ into two assertions $r(a, w_r^C)$ and $C(w_r^C)$. Further define $O^* = O_\Sigma \cup W$ and $AX = \{ \top(a) \mid a \in O^* \}$.

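$\sqcap^+$-rule: if $C(a), D(a) \in \mathcal{A}^*$, $C \sqcap D \in \mathcal{C}_{\Sigma,S}$ and $C \sqcap D(a) \notin \mathcal{A}^*$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ C \sqcap D(a) \}$;

$\sqcap^-$-rule: if $C \sqcap D(a) \in \mathcal{A}^*$ and $C(a) \notin \mathcal{A}^*$ or $D(a) \notin \mathcal{A}^*$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ C(a), D(a) \}$;

$\exists^+$-rule: if $r(a,b), C(b) \in \mathcal{A}^*$, $\exists r.C \in \mathcal{C}_{\Sigma,S}$ and $\exists r.C(a) \notin \mathcal{A}^*$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ \exists r.C(a) \}$;

$\exists^-$-rule: if $\exists r.C(a) \in \mathcal{A}^*$ and $\forall b \in O^*$, $\{ r(a,b), C(b) \} \not\subseteq \mathcal{A}^*$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ r(a, w_r^C), C(w_r^C) \}$, where $w_r^C \in W$;

$\sqsubseteq$-rule: if $C(a) \in \mathcal{A}^*$, $C \sqsubseteq D \in \mathcal{T}$ and $D(a) \notin \mathcal{A}^*$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ D(a) \}$;

H-rule: if $r(a,b) \in \mathcal{A}^*$, $r \sqsubseteq s \in \mathcal{R}$ and $s(a,b) \notin \mathcal{A}^*$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ s(a,b) \}$.

Figure 1: ABox Tableau expansion rules.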

The tableau algorithm given in Figure 1 matches exactly the non-modalized version (that is, the version without the modal operator) of the ABox tableau algorithm reported in (Sivaprakasam and Slutzki, 2017). In that paper, the authors discuss in detail the proof of correctness of the modalized version of the ABox tableau algorithm. Given $\Sigma$ and $\mathcal{C}_{\Sigma,S}$, this algorithm computes $\mathcal{A}^*$. $\mathcal{A}^*$ is initialized as $\mathcal{A} \cup AX$ and is expanded by exhaustively applying the expansion rules listed in Figure 1. The worst-case running time of the algorithm given in Figure 1 is $O((|\Sigma| + |\mathcal{C}_{\Sigma,S}|)^3)$; for more details see Theorem 27 in (Krötzsch, 2012).

3.1 MapReduce Paradigm

In this paper, our goal is to cast the ABox reasoning algorithm given in Figure 1 in the MapReduce framework. First let us give a brief account of the MapReduce procedure. MapReduce is a distributed computational model for processing data in parallel on clusters of machines. Typically, these machines are called 'nodes' (Dean and Ghemawat, 2008; Karloff et al., 2010). The data set is partitioned into several subsets, and each subset of the data set is assigned to an idle node. Each node has two major computational tasks, namely computing the map function and computing the reduce function. Depending upon the application, a user can define custom map and reduce functions. The abstract version of the map and reduce functions is:

Map function: The input for the map function is a (key, value) pair, and the output is a list of (key, value) pairs. The formal representation of the map function is of the form

$$\mathrm{Map} : (k, v) \to \mathrm{list}(k', v')$$

Reduce function: This function collects values by key, and then computes a value or a set of values depending upon the user definitions. The general representation of the reduce function is of the form

$$\mathrm{Reduce} : (k', \mathrm{list}(v')) \to \mathrm{list}(v'')$$

where $k$ and $k'$ are the keys, and $v$, $v'$ and $v''$ are the values.

Several off-the-shelf implementations are available for the MapReduce procedure. A popular one is Hadoop, see (Apache Software Foundation, 2010). An interesting aspect of the MapReduce approach is that system-level issues, such as handling faults and distributing data to the nodes, are taken care of by the underlying implementation. The programmer only needs to define the map and reduce functions, see (Karloff et al., 2010; Mutharaju et al., 2010).
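The abstract map and reduce signatures can be made concrete with a single-machine simulation. The sketch below is an illustration only: `run_mapreduce` is a hypothetical in-memory driver, not the Hadoop API, and the word-count functions are the standard textbook example rather than anything taken from the paper. A real framework would run the map and reduce calls on different nodes and handle the grouping step itself.

```python
from collections import defaultdict
from typing import Any, Callable, Iterable, List, Tuple

Pair = Tuple[Any, Any]


def run_mapreduce(records: Iterable[Pair],
                  map_fn: Callable[[Any, Any], Iterable[Pair]],
                  reduce_fn: Callable[[Any, List[Any]], Iterable[Any]]) -> List[Any]:
    """Simulate one map-reduce cycle in memory (hypothetical helper)."""
    # Map phase: every input pair (k, v) yields a list of pairs (k', v').
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            # Shuffle phase: collect the values by key.
            groups[out_key].append(out_value)
    # Reduce phase: every (k', list(v')) yields a list of output values.
    output = []
    for out_key, values in groups.items():
        output.extend(reduce_fn(out_key, values))
    return output


def count_map(line_no: int, line: str) -> List[Pair]:
    """Emit (word, 1) for every word of the line; the input key is ignored."""
    return [(word, 1) for word in line.split()]


def count_reduce(word: str, ones: List[int]) -> List[Pair]:
    """Sum the occurrences collected for one word."""
    return [(word, sum(ones))]


counts = dict(run_mapreduce([(0, "a b a"), (1, "b c")], count_map, count_reduce))
```

As in the pseudocode later in the paper, the input key is a line number that the map function does not use.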

3.2 MapReduce for ABox Reasoning

In this section, we explain in more detail how the completion rules given in Figure 1 can be reformulated into a format that is suitable for the MapReduce framework. As explained in the previous section, the important computational tasks in the MapReduce framework are computing the map and reduce functions. Informally, the MapReduce framework, in the context of completion rules for reasoning, works as follows: (a) based on the preconditions (ignoring the precondition which makes sure that duplicate inferences are not computed), the map function identifies a key, and (b) the reduce function, based on the key, completes the application of the rule. To illustrate this idea, we choose the $\sqsubseteq$-rule given in Figure 1. The preconditions in this rule are $C(a) \in \mathcal{A}^*$ and $C \sqsubseteq D \in \mathcal{T}$. The common concept among these preconditions is $C$, and $C$ can be used as a key. So the map function, in the case of the $\sqsubseteq$-rule, computes the pairs $(C, C(a) \in \mathcal{A}^*)$ and $(C, C \sqsubseteq D \in \mathcal{T})$.

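Expansion rules, each listed with its key:

$\sqcap^+_{aux}$-rule (key: $C$): if $C(a) \in \mathcal{A}^*$ and $C \sqcap D \in \mathcal{C}_{\Sigma,S}$, then $T_\sqcap := T_\sqcap \cup \{ (C \sqcap D, C(a)) \}$;

$\sqcap^+$-rule (key: $D$): if $D(a) \in \mathcal{A}^*$ and $(C \sqcap D, C(a)) \in T_\sqcap$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ C \sqcap D(a) \}$;

$\sqcap^-$-rule (key: $C \sqcap D$): if $C \sqcap D(a) \in \mathcal{A}^*$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ C(a), D(a) \}$;

$\exists^+_{aux}$-rule (key: $r$): if $r(a,b) \in \mathcal{A}^*$ and $\exists r.C \in \mathcal{C}_{\Sigma,S}$, then $T_\exists := T_\exists \cup \{ (\exists r.C, r(a,b)) \}$;

$\exists^+$-rule (key: $C$): if $C(b) \in \mathcal{A}^*$ and $(\exists r.C, r(a,b)) \in T_\exists$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ \exists r.C(a) \}$;

$\exists^-$-rule (key: $\exists r.C$): if $\exists r.C(a) \in \mathcal{A}^*$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ r(a, w_r^C), C(w_r^C) \}$, where $w_r^C \in W$;

$\sqsubseteq$-rule (key: $C$): if $C(a) \in \mathcal{A}^*$ and $C \sqsubseteq D \in \mathcal{T}$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ D(a) \}$;

H-rule (key: $r$): if $r(a,b) \in \mathcal{A}^*$ and $r \sqsubseteq s \in \mathcal{R}$, then $\mathcal{A}^* := \mathcal{A}^* \cup \{ s(a,b) \}$.

Figure 2: Reformulated ABox expansion rules.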

Now, the reduce function collects the pairs with the same key and completes the application of the $\sqsubseteq$-rule, which means the assertion $D(a)$ is added to the set $\mathcal{A}^*$. It is straightforward to cast the $\sqcap^-$-, $\exists^-$-, $\sqsubseteq$- and H-rules directly into MapReduce format. To cast the $\sqcap^+$- and $\exists^+$-rules in MapReduce format, we use a two step method which is similar to the one used in (Mutharaju et al., 2010; Mutharaju, 2016). The $\sqcap^+$-rule displayed in Figure 1 is split into two rules, namely $\sqcap^+_{aux}$ and $\sqcap^+$, which are given in Figure 2. The $\sqcap^+_{aux}$-rule computes a set $T_\sqcap$ whose elements are not elements of the set $\mathcal{A}^*$, which is the output of the reasoning algorithm. The purpose of computing the set $T_\sqcap$ is to provide the necessary information to the reduce function about the conjuncts and the object involved, so that the reduce function can compute the conjunction of two concept assertions with the same object. Each element of the set $T_\sqcap$ is a pair whose first part contains information about the conjuncts and whose second part contains information about the object.

Similarly, the $\exists^+$-rule displayed in Figure 1 is split into two rules, namely $\exists^+_{aux}$ and $\exists^+$, which are given in Figure 2. The $\exists^+_{aux}$-rule computes a set $T_\exists$ whose elements are not elements of the set $\mathcal{A}^*$. The elements of the set $T_\exists$ provide information to the reduce function so that it can compute a concept assertion involving the 'there exists' constructor. Each element of the set $T_\exists$ is a pair whose first part contains information about the concept and whose second part contains information about the objects. The $\sqcap^+$- and $\exists^+$-rules given in Figure 2 compute the respective ABox assertions. The sets $T_\sqcap$ and $T_\exists$ are initialized as $\emptyset$. Note that the results of the completion rules $\sqcap^+_{aux}$ and $\exists^+_{aux}$ do not follow the syntax of the ELH language.

Each element of $T_\sqcap$ is a pair of preconditions of the $\sqcap^+_{aux}$-rule. This rule is of the form $p \wedge q \Rightarrow p \wedge q$, where $p$ and $q$ are propositional variables, which is a tautology. Therefore, the $\sqcap^+_{aux}$-rule is sound. The $\sqcap^+$-rule in Figure 2 produces exactly the same result as the $\sqcap^+$-rule in Figure 1. The same argument holds for the $\exists^+_{aux}$- and $\exists^+$-rules. The application of the $\exists^-$-rule to the precondition $\exists r.C(a)$ results in adding the assertions $r(a, w_r^C)$ and $C(w_r^C)$ to the set $\mathcal{A}^*$ even in the case that assertions $r(a,b)$ and $C(b)$ are already available in $\mathcal{A}^*$ for some witness $b \in O^*$. The $\exists^-$-rule is also sound. All the remaining rules in Figure 2 are the same as the rules in Figure 1, except for the preconditions which guarantee no duplication in the output. The reformulated ABox reasoning algorithm given in Figure 2 terminates when none of the rules given in Figure 2 is applicable. Every ABox assertion newly added to $\mathcal{A}^*$ is entailed by $\Sigma$. Since the ABox reasoning algorithm in Figure 1 is sound, complete and terminating, so is the algorithm in Figure 2. Hence the reformulated algorithm in Figure 2 is correct, and the worst-case running time of this ABox algorithm is cubic in the size of the input KB and the set $\mathcal{C}_{\Sigma,S}$.
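The two step method for the conjunction rule can be sketched as two consecutive map-reduce cycles. This is a minimal single-machine sketch under stated assumptions: the record shapes (`"assert"`, `"concept"`, `"aux"`), the helper `run_cycle` and all function names are hypothetical, and the auxiliary pair $(C \sqcap D, C(a))$ is stored as the conjunction together with the individual, since $C$ is its first conjunct.

```python
from collections import defaultdict


def run_cycle(records, map_fn, reduce_fn):
    """One in-memory map-reduce cycle; returns the set of reducer outputs."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {out for key, values in groups.items() for out in reduce_fn(key, values)}


# Hypothetical record shapes: ("assert", C, a) stands for C(a) in A*,
# ("concept", C) for a concept C in C_{Sigma,S}, and ("aux", ("and", C, D), a)
# for the auxiliary pair (C ⊓ D, C(a)).
def conj_aux_map(record):
    if record[0] == "assert":
        yield record[1], record                  # key: the asserted concept C
    elif record[0] == "concept" and isinstance(record[1], tuple) and record[1][0] == "and":
        yield record[1][1], record               # key: the first conjunct C


def conj_aux_reduce(key, values):
    conjunctions = [v[1] for v in values if v[0] == "concept"]
    individuals = [v[2] for v in values if v[0] == "assert"]
    for conj in conjunctions:
        for a in individuals:
            yield ("aux", conj, a)


def conj_map(record):
    if record[0] == "assert":
        yield record[1], record                  # key: the asserted concept D
    elif record[0] == "aux":
        yield record[1][2], record               # key: the second conjunct D


def conj_reduce(key, values):
    asserted = {v[2] for v in values if v[0] == "assert"}
    for v in values:
        # Both conjuncts must be asserted for the same individual.
        if v[0] == "aux" and v[2] in asserted:
            yield ("assert", v[1], v[2])


abox = {("assert", "A", "a"), ("assert", "B", "a"), ("assert", "A", "b")}
concepts = {("concept", ("and", "A", "B"))}
aux_set = run_cycle(abox | concepts, conj_aux_map, conj_aux_reduce)
derived = run_cycle(abox | aux_set, conj_map, conj_reduce)
```

In this example the first cycle produces auxiliary pairs for both `a` and `b`, and the second cycle keeps only `a`, the one individual for which `B` is also asserted.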

3.3 Parallelization using MapReduce

for ABox Reasoning

In this section, we discuss how parallelization is implemented in the ABox reasoning procedure using the MapReduce approach. We consider the completion rules given in Figure 2, together with their respective keys, which are already reformulated in a way that is suitable for implementing the parallelization method. Here the inputs are the KB $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$, $\mathcal{C}_{\Sigma,S}$, $T_\sqcap$ and $T_\exists$. The set $\mathcal{A}^*$ is initialized as $\mathcal{A}$ and is expanded using the following strategy. The completion rules are applied iteratively so that in a particular iteration, one rule is applied. In a given iteration, the elements of the sets $\mathcal{A}$, $\mathcal{T}$, $\mathcal{R}$, $\mathcal{C}_{\Sigma,S}$, $T_\sqcap$ and $T_\exists$ are partitioned into subsets. Each subset is distributed to a different computing node. Each node first computes the map function and then computes the reduce function. Each map-reduce computational cycle results in the parallel application of one of the completion rules. In the map computational step, based on the rule chosen for application, the elements from either the set $\mathcal{A}^*$, or the sets $\mathcal{A}^*$ and $\mathcal{C}_{\Sigma,S}$, or the sets $\mathcal{A}^*$ and $\mathcal{T}$, or the sets $\mathcal{A}^*$ and $\mathcal{R}$, which satisfy any of the preconditions of the rule are selected, and the relevant (key, value) pairs are computed. Here, the key is a concept or role name common to the preconditions of the rule and the value is the precondition itself.

Algorithm 1: MapReduce algorithm for the rule $\sqcap^-$.
map(key, value)
  /* key: line number (not relevant) */
  /* value: an assertion from ABox $\mathcal{A}^*$ */
  if value == $C \sqcap D(a) \in \mathcal{A}^*$ then
    return ($C \sqcap D$, $C \sqcap D(a) \in \mathcal{A}^*$)
  end if
reduce(key, values)
  /* key: $C \sqcap D$ (concept) */
  /* values: assertions from ABox $\mathcal{A}^*$ */
  for all v in values do
    if v == $C \sqcap D(a) \in \mathcal{A}^*$ then
      return $C(a), D(a) \in \mathcal{A}^*$
    end if
  end for
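Algorithm 1 could be rendered in Python roughly as follows. This is a sketch only: the tuple encoding `("assert", concept, individual)` with `("and", C, D)` for $C \sqcap D$, and the in-memory helper `run_cycle`, are assumptions made for illustration and are not part of the paper.

```python
from collections import defaultdict


def run_cycle(records, map_fn, reduce_fn):
    """One in-memory map-reduce cycle; returns the set of reducer outputs."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {out for key, values in groups.items() for out in reduce_fn(key, values)}


def conj_minus_map(record):
    """Emit (C ⊓ D, C ⊓ D(a)) for every conjunction assertion; skip the rest."""
    _, concept, _ = record
    if isinstance(concept, tuple) and concept[0] == "and":
        yield concept, record


def conj_minus_reduce(key, values):
    """For key C ⊓ D, emit C(a) and D(a) for every collected assertion."""
    _, left, right = key
    for _, _, individual in values:
        yield ("assert", left, individual)
        yield ("assert", right, individual)


abox = {("assert", ("and", "A", "B"), "a"), ("assert", "C", "b")}
derived = run_cycle(abox, conj_minus_map, conj_minus_reduce)
```

The reducer may emit assertions that are already present in the ABox; as the text notes, such duplicates are removed at the end of each cycle, which the set union in a driver loop would do implicitly.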

In the reduce computational step, all the pairs having the same key are collected from the different nodes and the results of the completion rule are obtained. During this reduce computational step, all valid combinations of values are taken into account. At the end of each iteration, all the duplicate assertions are removed from the set $\mathcal{A}^*$. Iterations (map-reduce cycles) continue until no new assertions are added to $\mathcal{A}^*$, that is, until the set $\mathcal{A}^*$ obtained at the end of two consecutive iterations remains the same. The strategy discussed here is similar to the strategies presented in (Mutharaju et al., 2010; Mutharaju, 2016).

Map and reduce functions for the reformulated ABox completion rules $\sqcap^-$ and $\sqsubseteq$ given in Figure 2 are defined in Algorithm 1 and Algorithm 2, respectively. For the remaining rules in Figure 2, map and reduce functions can be defined in a similar way. Algorithm 1 describes the map and reduce functions for the $\sqcap^-$-rule. The input for the map function is an element of $\mathcal{A}^*$. The output of the map function is a list of relevant (key, value) pairs. A list of values with the same key is accepted by the reduce function. At this point, every possible combination of values is tested to make sure the $\sqcap^-$-rule is applicable. The output of the reduce function is a list of elements which is added to the set $\mathcal{A}^*$. Algorithm 2 describes the map and reduce functions for the $\sqsubseteq$-rule. Since this algorithm is similar to Algorithm 1, the discussion of Algorithm 2 is omitted.

Algorithm 2: MapReduce algorithm for the rule $\sqsubseteq$.
map(key, value)
  /* key: line number (not relevant) */
  /* value: an assertion from ABox $\mathcal{A}^*$ or an axiom from TBox $\mathcal{T}$ */
  if value == $C(a) \in \mathcal{A}^*$ then
    return ($C$, $C(a) \in \mathcal{A}^*$)
  else if value == $C \sqsubseteq D \in \mathcal{T}$ then
    return ($C$, $C \sqsubseteq D \in \mathcal{T}$)
  end if
reduce(key, values)
  /* key: $C$ (concept) */
  /* values: assertions from ABox $\mathcal{A}^*$ or axioms from TBox $\mathcal{T}$ */
  for all $v_1$ in values do
    for all $v_2$ in values do
      if $v_1$ == $C(a) \in \mathcal{A}^*$ and $v_2$ == $C \sqsubseteq D \in \mathcal{T}$ then
        return $D(a) \in \mathcal{A}^*$
      end if
    end for
  end for
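The iteration strategy of this section, repeated map-reduce cycles until the ABox stops growing, can be sketched for the $\sqsubseteq$-rule alone. Everything here is an illustrative assumption: the record shapes `("assert", C, a)` and `("gci", C, D)`, the helper `run_cycle` and the function `saturate` are hypothetical names, and a full implementation would cycle through all the rules of Figure 2 rather than a single one.

```python
from collections import defaultdict


def run_cycle(records, map_fn, reduce_fn):
    """One in-memory map-reduce cycle; returns the set of reducer outputs."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {out for key, values in groups.items() for out in reduce_fn(key, values)}


def sub_map(record):
    """Key both C(a) and C ⊑ D by the shared concept C."""
    if record[0] in ("assert", "gci"):
        yield record[1], record


def sub_reduce(key, values):
    """Combine every C(a) with every C ⊑ D that has the same key C into D(a)."""
    supers = [v[2] for v in values if v[0] == "gci"]
    for v in values:
        if v[0] == "assert":
            for d in supers:
                yield ("assert", d, v[2])


def saturate(abox, tbox):
    """Repeat the cycle until two consecutive iterations give the same set."""
    closure = set(abox)
    while True:
        derived = run_cycle(closure | set(tbox), sub_map, sub_reduce)
        if derived <= closure:       # nothing new: fixpoint reached
            return closure
        closure |= derived           # set union also removes duplicates


abox = {("assert", "A", "a")}
tbox = {("gci", "A", "B"), ("gci", "B", "C")}
closure = saturate(abox, tbox)
```

With the chain `A ⊑ B`, `B ⊑ C`, the loop needs two productive cycles and one final cycle that detects the fixpoint.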

As indicated in (Mutharaju, 2016), the algorithm presented in this section can compute duplicate information. In each map-reduce computation cycle, duplicate information is removed. Removal of the duplicates from the set $\mathcal{A}^*$ incurs additional computational cost, which affects the performance of the algorithm. The performance of this algorithm can be optimized by using the distributed computational models reported in (Mutharaju, 2016; Urbani et al., 2010).

4 SECRECY-PRESERVING

REASONING

Let $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$ be an ELH KB. Also let $S \subseteq \mathcal{A}^* \setminus AX$ be the “secrecy set” to be protected from the querying agent. Given $\Sigma$ and $S$, the objective is to answer assertion queries while preserving secrecy. Our approach is to compute a set $E$, where $S \subseteq E \subseteq \mathcal{A}^* \setminus AX$, called the secrecy envelope for $S$, so that by protecting $E$ the querying agent cannot logically infer any assertion in $S$, see (Tao et al., 2015). We briefly explain the role of the OWA in answering queries and how it helps protect the secrets. When a query is answered with “Unknown”, the querying agent should not be able to distinguish between the case that the answer to the query is truly unknown to the KB reasoner and the case that the answer is being protected for reasons of secrecy. We envision a situation in which, once the ABox $\mathcal{A}^*$ is computed, a reasoner $R$ is associated with it. $R$ is designed to answer queries as follows: if a query cannot be inferred from $\Sigma$, the answer is “Unknown”. If it can be inferred and it is not in $E$, the answer is “Yes”; otherwise, the answer is “Unknown”. Note that since the syntax of ELH does not include negation, an ELH KB cannot entail a negative query.
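The answering policy of the reasoner can be sketched as a small function. This is a simplification made for illustration: the function name `answer` and the tuple encoding of assertions are assumptions, and "can be inferred from Σ" is approximated here by membership in the precomputed closure, whereas the paper uses a recursive query answering procedure for more complicated queries.

```python
def answer(query, closure, envelope):
    """Return "Yes" only for queries that are derivable and not protected.

    `closure` plays the role of the computed ABox A*, `envelope` the role of E.
    Both non-derivable and protected queries yield the same reply, so the
    querying agent cannot tell the two cases apart.
    """
    if query in closure and query not in envelope:
        return "Yes"
    return "Unknown"


closure = {("assert", "A", "a"), ("assert", "B", "a")}
envelope = {("assert", "B", "a")}

replies = [answer(q, closure, envelope)
           for q in (("assert", "A", "a"),    # derivable and not protected
                     ("assert", "B", "a"),    # derivable but in the envelope
                     ("assert", "C", "a"))]   # not derivable
```

The second and third queries receive the same reply, which is the indistinguishability property the paragraph above relies on.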

We make the following assumptions about the ca-

pabilities of the querying agent:

(a) It does not have direct access to the KB Σ, but is

aware of the underlying vocabulary,

(b) It does not know about witness set W ,

(d) It cannot ask queries in the form of general con-

cept or role inclusions.

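We formally define the notion of an envelope as follows:

Definition 1. Let $\Sigma = \langle \mathcal{A}, \mathcal{T}, \mathcal{R} \rangle$ be an ELH KB, and let $S$ be a finite secrecy set. The secrecy envelope $E$ of $S$ has the following properties:

- $S \subseteq E \subseteq \mathcal{A}^* \setminus AX$, and
- for every $\alpha \in E$, $\mathcal{A}^* \setminus E \not\models \alpha$.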

The intuition for the above definition is that no information in $E$ can be inferred from the set $\mathcal{A}^* \setminus E$. To compute an envelope, we use the idea of inverting the rules of Figure 1 as given in (Tao et al., 2010; Tao et al., 2015). Induced by the ABox expansion rules in Figure 1, we define the corresponding “inverted” ABox expansion rules in Figure 3. These inverted expansion rules are denoted by prefixing Inv- to the name of the corresponding expansion rules. Note that the $\exists^-$-rule does not have a corresponding inverted rule. The reason for the “omission” is that an application of this rule results in adding assertions with individual names from the witness set, which the querying agent is barred from asking about.

The envelope $E$ is computed by initializing it to $S$ and then expanding it using the inverted expansion rules listed in Figure 3 until no further applications are possible. We denote by $\Lambda$ the algorithm which computes the set $E$. Due to non-determinism in applying the rules Inv-$\sqcap^+$ and Inv-$\exists^+$, different executions of $\Lambda$ may output different envelopes. Since $\mathcal{A}^*$ is finite, the computation of $\Lambda$ terminates. Let $E$ be an output of $\Lambda$. Since the size of $\mathcal{A}^*$ is cubic in $|\Sigma| + |\mathcal{C}_{\Sigma,S}|$, and each application of an inverted expansion rule moves some assertions from $\mathcal{A}^*$ into $E$, the size of $E$ is at most the size of $\mathcal{A}^*$. Therefore, to compute the envelope $E$, $\Lambda$ takes $O((|\Sigma| + |\mathcal{C}_{\Sigma,S}|)^3)$ steps. The proof of correctness of $\Lambda$ is omitted.

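Inv-$\sqcap^+$-rule: if $C \sqcap D(a) \in E$, $C \sqcap D \in \mathcal{C}_{\Sigma,S}$ and $\{ C(a), D(a) \} \subseteq \mathcal{A}^* \setminus E$, then $E := E \cup \{ C(a) \}$ or $E := E \cup \{ D(a) \}$;

Inv-$\sqcap^-$-rule: if $\{ C(a), D(a) \} \cap E \neq \emptyset$ and $C \sqcap D(a) \in \mathcal{A}^* \setminus E$, then $E := E \cup \{ C \sqcap D(a) \}$;

Inv-$\exists^+$-rule: if $\exists r.C(a) \in E$, $\exists r.C \in \mathcal{C}_{\Sigma,S}$ and $\{ r(a,b), C(b) \} \subseteq \mathcal{A}^* \setminus E$ with $b \in O^*$, then $E := E \cup \{ r(a,b) \}$ or $E := E \cup \{ C(b) \}$;

Inv-$\sqsubseteq$-rule: if $D(a) \in E$, $C \sqsubseteq D \in \mathcal{T}$ and $C(a) \in \mathcal{A}^* \setminus E$, then $E := E \cup \{ C(a) \}$;

Inv-H-rule: if $s(a,b) \in E$, $r \sqsubseteq s \in \mathcal{R}$ and $r(a,b) \in \mathcal{A}^* \setminus E$, then $E := E \cup \{ r(a,b) \}$.

Figure 3: Inverted Tableau expansion rules.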
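A sequential (non-parallel) reading of the rules in Figure 3 can be sketched as a fixpoint loop. This is a minimal sketch under loud assumptions: assertions are encoded as `("C", concept, a)` and `("R", role, a, b)`, the TBox and RBox as sets of pairs, and the non-deterministic rules Inv-$\sqcap^+$ and Inv-$\exists^+$ are resolved by always hiding the first listed premise, which is only one of the choices the rules allow. The function names are hypothetical.

```python
def premise_groups(alpha, closure, tbox, rbox):
    """Yield premise lists from which `alpha` can be re-derived in one step."""
    if alpha[0] == "R":                                   # Inv-H
        _, s, a, b = alpha
        for (r, s2) in rbox:
            if s2 == s:
                yield [("R", r, a, b)]
        return
    _, concept, a = alpha
    for (c, d) in tbox:                                   # Inv-⊑
        if d == concept:
            yield [("C", c, a)]
    if isinstance(concept, tuple) and concept[0] == "and":    # Inv-⊓+
        yield [("C", concept[1], a), ("C", concept[2], a)]
    if isinstance(concept, tuple) and concept[0] == "some":   # Inv-∃+
        _, r, filler = concept
        for x in closure:
            if x[0] == "R" and x[1] == r and x[2] == a:
                yield [x, ("C", filler, x[3])]


def compute_envelope(closure, tbox, rbox, secrets):
    """Expand the secrecy set until no inverted rule is applicable."""
    envelope = set(secrets)
    changed = True
    while changed:
        changed = False
        for alpha in sorted(envelope, key=repr):
            for group in premise_groups(alpha, closure, tbox, rbox):
                # A rule fires only if all premises are still outside E.
                if all(x in closure and x not in envelope for x in group):
                    envelope.add(group[0])    # assumed choice: first premise
                    changed = True
        for beta in sorted(closure - envelope, key=repr):     # Inv-⊓-
            if beta[0] == "C" and isinstance(beta[1], tuple) and beta[1][0] == "and":
                _, (_, left, right), a = beta
                if ("C", left, a) in envelope or ("C", right, a) in envelope:
                    envelope.add(beta)
                    changed = True
    return envelope


closure = {("C", "A", "a"), ("C", "B", "a"), ("C", ("and", "A", "B"), "a")}
envelope = compute_envelope(closure, set(), set(), {("C", ("and", "A", "B"), "a")})
```

In this example hiding `A(a)` is enough to protect the secret conjunction, so `B(a)` stays visible; a different choice in Inv-$\sqcap^+$ would hide `B(a)` instead, which illustrates why different executions may output different envelopes.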

4.1 MapReduce for Computing an

Envelope E

An important step in computing a secrecy envelope for a given secrecy set using the MapReduce framework is to cast the inverted ABox tableau expansion rules given in Figure 3 into a format that is suitable for the MapReduce procedure. Since a detailed discussion of the various computational tasks in the MapReduce framework in the context of completion rules for reasoning is given in Section 3.2, we present a short explanation of the modified inverted rules given in Figure 4. To cast each rule in Figure 3, except for the rule Inv-$\sqcap^-$, in MapReduce format, we use the two step method explained in detail in Section 3.2. The Inv-$\sqcap^+$-rule displayed in Figure 3 is split into two rules, namely Inv-$\sqcap^+_{aux}$ and Inv-$\sqcap^+$, given in Figure 4. The Inv-$\sqcap^+_{aux}$-rule computes a set $T_\sqcap$ whose elements are not elements of the set $E$. The purpose of computing the set $T_\sqcap$ is to provide the necessary information to the reduce function of the rule Inv-$\sqcap^+$. Using this information, the reduce function computes an assertion that goes into $E$ corresponding to a secret involving the conjunction constructor. Otherwise, that secret may be revealed.

Similarly, each of the inverted expansion rules Inv-$\exists^+$, Inv-$\sqsubseteq$ and Inv-H displayed in Figure 3 is split into a pair of rules. The first rule in each of these pairs is an auxiliary rule, Inv-$\exists^+_{aux}$, Inv-$\sqsubseteq_{aux}$ and Inv-H$_{aux}$, which computes respectively a set $T_\exists$, $T_\sqsubseteq$ and $T_H$. As explained in the previous paragraph, each of these sets provides information to the reduce function

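Inverted expansion rules, each listed with its key:

Inv-$\sqcap^+_{aux}$ (key: $C$): if $C(a) \in \mathcal{A}^* \setminus E$ and $C \sqcap D(a) \in E$, then $T_\sqcap := T_\sqcap \cup \{ (C \sqcap D(a), C(a)) \}$;

Inv-$\sqcap^+$ (key: $D$): if $D(a) \in \mathcal{A}^* \setminus E$ and $(C \sqcap D(a), C(a)) \in T_\sqcap$, then $E := E \cup \{ C(a) \}$ or $E := E \cup \{ D(a) \}$;

Inv-$\sqcap^-_1$ (key: $C$): if $C \sqcap D(a) \in \mathcal{A}^* \setminus E$ and $C(a) \in E$, then $E := E \cup \{ C \sqcap D(a) \}$;

Inv-$\sqcap^-_2$ (key: $D$): if $C \sqcap D(a) \in \mathcal{A}^* \setminus E$ and $D(a) \in E$, then $E := E \cup \{ C \sqcap D(a) \}$;

Inv-$\exists^+_{aux}$ (key: $r$): if $r(a,b) \in \mathcal{A}^* \setminus E$ and $\exists r.C(a) \in E$, then $T_\exists := T_\exists \cup \{ (\exists r.C(a), r(a,b)) \}$;

Inv-$\exists^+$ (key: $C$): if $C(b) \in \mathcal{A}^* \setminus E$ and $(\exists r.C(a), r(a,b)) \in T_\exists$, then $E := E \cup \{ r(a,b) \}$ or $E := E \cup \{ C(b) \}$;

Inv-$\sqsubseteq_{aux}$ (key: $C$): if $C(a) \in \mathcal{A}^* \setminus E$ and $C \sqsubseteq D \in \mathcal{T}$, then $T_\sqsubseteq := T_\sqsubseteq \cup \{ (C \sqsubseteq D, C(a)) \}$;

Inv-$\sqsubseteq$ (key: $D$): if $D(a) \in E$ and $(C \sqsubseteq D, C(a)) \in T_\sqsubseteq$, then $E := E \cup \{ C(a) \}$;

Inv-H$_{aux}$ (key: $r$): if $r(a,b) \in \mathcal{A}^* \setminus E$ and $r \sqsubseteq s \in \mathcal{R}$, then $T_H := T_H \cup \{ (r \sqsubseteq s, r(a,b)) \}$;

Inv-H (key: $s$): if $s(a,b) \in E$ and $(r \sqsubseteq s, r(a,b)) \in T_H$, then $E := E \cup \{ r(a,b) \}$.

Figure 4: Reformulated Inverted Tableau expansion rules.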

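of the second rule in the respective pair. The Inv-$\sqcap^+$-, Inv-$\exists^+$-, Inv-$\sqsubseteq$- and Inv-H-rules given in Figure 4 compute assertions that belong to the set $E$. The sets $T_\sqcap$, $T_\exists$, $T_\sqsubseteq$ and $T_H$ are initialized as $\emptyset$. Note that the results of the auxiliary completion rules $\sqcap^+_{aux}$, $\exists^+_{aux}$, $\sqsubseteq_{aux}$ and H$_{aux}$ do not follow the syntax of the ELH language.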

The intuition behind the rule Inv-$\sqcap^-$ given in Figure 3 is that if either $C(a)$ or $D(a)$ is in $E$, then $C \sqcap D(a)$ should be in $E$. That is, to protect the secret information $C(a)$ or $D(a)$, we need to add $C \sqcap D(a)$ to the set $E$. To cast this rule in MapReduce format, we split it into two rules, Inv-$\sqcap^-_1$ and Inv-$\sqcap^-_2$. Each rule computes the same assertion involving the conjunction constructor. Since the redundant information is removed at the end of each map-reduce cycle, the final output $E$ of the tableau algorithm given in Figure 4 is free from duplicate assertions.

As discussed in Section 3.2, all the auxiliary rules in Figure 4 are of the form $p \wedge q \Rightarrow p \wedge q$, where $p$ and $q$ are propositional variables, which is a tautology. Therefore, all the auxiliary rules are sound. The remaining rules in Figure 4 produce exactly the same results as the rules in Figure 3. Since the tableau algorithm given in Figure 3 is correct, so is the algorithm given in Figure 4. Hence the reformulated inverted tableau algorithm in Figure 4 is correct.

4.2 Parallelization using MapReduce

for Envelope Computation

This section gives a brief account of the design and implementation of map and reduce functions for each of the reformulated inverted completion rules given in Figure 4; for more details see Section 3.3. We consider the inverted expansion rules, together with their respective keys, which are already reformulated in a way that is suitable for implementing the parallelization method. The inputs are the KB $\Sigma = \langle \mathcal{A}^*, \mathcal{T}, \mathcal{R} \rangle$, $\mathcal{C}_{\Sigma,S}$, $S$, $T_\sqcap$, $T_\exists$, $T_\sqsubseteq$ and $T_H$. The set $E$ is initialized as $S$ and is expanded using the following strategy. The completion rules are applied iteratively so that in a particular iteration, one rule is applied. In a given iteration, the elements of the sets $\mathcal{A}^*$, $\mathcal{T}$, $\mathcal{R}$, $\mathcal{C}_{\Sigma,S}$, $S$, $T_\sqcap$, $T_\exists$, $T_\sqsubseteq$ and $T_H$ are partitioned into subsets. Each subset is distributed to a different computing node. Each node first computes the map function and then computes the reduce function. Each map-reduce computational cycle results in the parallel application of one of the rules. At the end of each map-reduce cycle, all the duplicate assertions are removed from the set $E$. Map-reduce cycles continue until no new assertions are added to $E$, that is, until the set $E$ obtained at the end of two consecutive iterations remains the same.


Algorithm 3: MapReduce algorithm for Inv-⊓⁻

map(key, value)
    /* key: line number (not relevant) */
    /* value: an assertion from ABox A∗ \ E or an assertion from E */
    if value == C ⊓ D(a) ∈ A∗ \ E then
        return (C, C ⊓ D(a) ∈ A∗ \ E)
    else if value == C(a) ∈ E then
        return (C, C(a))
    end if

reduce(key, values)
    /* key: C (concept) */
    /* values: assertions from ABox A∗ \ E or assertions from the set E */
    for all v_1 in values do
        for all v_2 in values do
            if v_1 == C ⊓ D(a) ∈ A∗ \ E and v_2 == C(a) then
                return C ⊓ D(a) ∈ E
            end if
        end for
    end for
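Algorithm 3 can be illustrated with a small single-process Python sketch. The tuple encoding of concepts and assertions, the origin tags `"visible"` (for A∗ \ E) and `"envelope"` (for E), and the helper `run_cycle` that imitates the shuffle phase are all assumptions made for illustration; none of them come from the paper. As in the printed algorithm, only the left conjunct C is used as the key. Unlike the pseudocode, the reducer yields every match rather than returning at the first one, in line with the description of its output as a list.

```python
from collections import defaultdict

# Hypothetical encoding (not from the paper):
#   conjunction C ⊓ D -> ("and", C, D); assertion X(a) -> (X, "a")
#   record -> (origin, assertion), origin in {"visible", "envelope"}

def map_inv_and(record):
    """Key conjunction assertions from A* \\ E and assertions from E on C."""
    origin, (concept, _individual) = record
    if origin == "visible" and isinstance(concept, tuple) and concept[0] == "and":
        yield concept[1], record            # key: left conjunct C
    elif origin == "envelope":
        yield concept, record               # key: C

def reduce_inv_and(key, values):
    """Test every pair of values; emit C ⊓ D(a) when C(a) is in E."""
    for origin1, (concept1, ind1) in values:
        for origin2, (_concept2, ind2) in values:
            if origin1 == "visible" and origin2 == "envelope" and ind1 == ind2:
                yield (concept1, ind1)      # C ⊓ D(a) joins E

def run_cycle(records, map_fn, reduce_fn):
    """Sequential stand-in for one map-reduce cycle (map, group, reduce)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {out for key, vals in groups.items() for out in reduce_fn(key, vals)}
```

Calling `run_cycle(records, map_inv_and, reduce_inv_and)` on a list of tagged records returns the set of conjunction assertions that this cycle would add to E.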

Algorithm 4: MapReduce algorithm for Inv-⊑_aux

map(key, value)
    /* key: line number (not relevant) */
    /* value: an assertion from ABox A∗ \ E or an axiom from TBox T */
    if value == C(a) ∈ A∗ \ E then
        return (C, C(a) ∈ A∗ \ E)
    else if value == C ⊑ D ∈ T then
        return (C, C ⊑ D)
    end if

reduce(key, values)
    /* key: C (concept) */
    /* values: assertions from ABox A∗ or axioms from TBox T */
    for all v_1 in values do
        for all v_2 in values do
            if v_1 == C(a) ∈ A∗ \ E and v_2 == C ⊑ D then
                return (C ⊑ D, C(a)) ∈ T
            end if
        end for
    end for
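The auxiliary rule of Algorithm 4 can be sketched in the same style. The tagged-tuple encoding below is a hypothetical one chosen for illustration, not the paper's data format: `("visible", (C, a))` stands for an assertion C(a) in A∗ \ E and `("tbox", (C, D))` for an axiom C ⊑ D. The reducer pairs each axiom with each matching assertion, producing the pairs (C ⊑ D, C(a)) that the algorithm stores.

```python
# Hypothetical encoding (not from the paper):
#   ("visible", (C, a)) -> assertion C(a) in A* \ E
#   ("tbox", (C, D))    -> axiom C ⊑ D in the TBox

def map_inv_sub_aux(record):
    """Key both kinds of record on the concept C."""
    origin, (concept, _) = record
    if origin in ("visible", "tbox"):
        yield concept, record

def reduce_inv_sub_aux(key, values):
    """Pair every axiom C ⊑ D with every visible assertion C(a)."""
    for origin1, assertion in values:
        for origin2, axiom in values:
            if origin1 == "visible" and origin2 == "tbox":
                yield (axiom, assertion)    # the pair (C ⊑ D, C(a))
```

To use it, the mapper output is grouped by key, as the framework's shuffle phase would do, and each group is passed to the reducer.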

Algorithm 5: MapReduce algorithm for Inv-⊑

map(key, value)
    /* key: line number (not relevant) */
    /* value: an assertion in E or an element in the set T */
    if value == D(a) ∈ E then
        return (D, D(a) ∈ E)
    else if value == (C ⊑ D, C(a)) ∈ T then
        return (D, (C ⊑ D, C(a)))
    end if

reduce(key, values)
    /* key: D (concept) */
    /* values: assertions from the set E or elements from the set T */
    for all v_1 in values do
        for all v_2 in values do
            if v_1 == D(a) ∈ E and v_2 == (C ⊑ D, C(a)) then
                return C(a) ∈ E
            end if
        end for
    end for

Map and reduce functions for the Inv-⊓⁻, Inv-⊑_aux and Inv-⊑ reformulated inverted completion rules given in Figure 4 are defined in Algorithms 3 through 5. For the remaining rules in Figure 4, map and reduce functions can be defined in a similar way. Algorithm 3 describes the map and reduce functions for the Inv-⊓⁻ rule. The input to the map function is an element of A∗ \ E or an element of E. The output of the map function is a list of relevant (key, value) pairs. The reduce function accepts a list of values that share the same key. At this point, every possible combination of values is tested to check whether the Inv-⊓⁻ rule is applicable. The output of the reduce function is a list of elements, which is added to the set E. The map and reduce functions of the remaining algorithms are similar to those of Algorithm 3 and can be explained in the same way.

As discussed in Section 3.3, the procedure that computes the envelope can output duplicate information. At the end of each map-reduce computation cycle, duplicate assertions are removed. Removing the duplicates from the set E incurs additional computational cost, which affects the performance of this computational procedure. The performance can be improved by using the distributed computational models reported in (Mutharaju, 2016; Urbani et al., 2010).
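For comparison with Algorithm 3, the Inv-⊑ step of Algorithm 5 can be sketched as follows. The encoding is again hypothetical and chosen only for illustration: `("envelope", (D, a))` stands for an assertion D(a) in E, and `("pair", ((C, D), (C, a)))` for a stored pair (C ⊑ D, C(a)). Both kinds of record are keyed on D, and the reducer emits C(a) whenever the individuals agree.

```python
# Hypothetical encoding (not from the paper):
#   ("envelope", (D, a))       -> assertion D(a) in E
#   ("pair", ((C, D), (C, a))) -> stored pair (C ⊑ D, C(a))

def map_inv_sub(record):
    """Key envelope assertions and stored pairs on the concept D."""
    origin, payload = record
    if origin == "envelope":
        concept, _individual = payload
        yield concept, record               # key: D
    elif origin == "pair":
        (_sub, sup), _assertion = payload
        yield sup, record                   # key: D, right-hand side of C ⊑ D

def reduce_inv_sub(key, values):
    """Emit C(a) for each D(a) in E matched by a pair (C ⊑ D, C(a))."""
    for origin1, payload1 in values:
        for origin2, payload2 in values:
            if origin1 == "envelope" and origin2 == "pair":
                _concept, individual = payload1
                _axiom, (sub, other) = payload2
                if individual == other:
                    yield (sub, individual) # C(a) joins E
```

Collecting the reducer output into a set before merging it into E removes duplicates within a cycle, which is the extra step whose cost is discussed above.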

5 SUMMARY

In this paper we have studied the problem of secrecy-preserving reasoning in ELH KBs using the MapReduce framework. Let Σ = ⟨A, T, R⟩ be an ELH KB and assume that the size of the set A is very large. Our main contribution is to use the MapReduce procedure within reasoning algorithms to compute a finite set of assertional consequences A∗ and an envelope E, which is a superset of the secrecy set S. To the best of our knowledge, secrecy-preserving reasoning with the MapReduce framework has not been studied before. For the query answering part, we assume that the sets A∗ and E are precomputed. The set A∗ \ E is free of secret information, and no secret information can be inferred from it. Queries from the querying agent are answered based on the information available in the set A∗ \ E. Note that A∗ \ E is finite and does not contain answers to all the non-confidential queries. A recursive query answering procedure is used to answer the non-consequential queries. For this purpose, we adopt the recursive query answering procedure reported in (Sivaprakasam and Slutzki, 2017) with the necessary changes. Our main emphasis in this paper is on how to use the MapReduce framework within the reasoning procedures to study the secrecy-preserving reasoning problem. The implementation of this query answering procedure in Hadoop (Apache Software Foundation, 2010) will be considered in our future work. Further, we will implement the SPQA framework for a single querying agent in Protege (Protege - Stanford University, 1999), a knowledge representation and reasoning tool for ELH KBs. To study the performance of the MapReduce implementation of the SPQA framework for a single querying agent, we will conduct experiments on the very large ontologies SNOMED CT and GALEN (Dentler et al., 2011) in both Protege and Hadoop and compare their performance.

REFERENCES

Apache Software Foundation (2010). http://hadoop.apache.org.

Bellomarini, L., Gottlob, G., Pieris, A., and Sallinger, E. (2018). Swift logic for big data and knowledge graphs. In International Conference on Current Trends in Theory and Practice of Informatics, pages 3–16. Springer.

Dean, J. and Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113.

Dentler, K., Cornet, R., Ten Teije, A., and De Keizer, N. (2011). Comparison of reasoners for large ontologies in the OWL 2 EL profile. Semantic Web, 2(2):71–87.

Hitzler, P. and Janowicz, K. (2013). Linked data, big data, and the 4th paradigm. Semantic Web, 4(3):233–235.

Karloff, H., Suri, S., and Vassilvitskii, S. (2010). A model of computation for MapReduce. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, pages 938–948. SIAM.

Konstantopoulos, S., Charalambidis, A., Mouchakis, G., Troumpoukis, A., Jakobitsch, J., and Karkaletsis, V. (2016). Semantic web technologies and big data infrastructures: SPARQL federated querying of heterogeneous big data stores. In International Semantic Web Conference (Posters & Demos).

Krötzsch, M. (2012). OWL 2 profiles: An introduction to lightweight ontology languages. In Reasoning Web International Summer School, pages 112–183. Springer.

Möller, R., Neuenstadt, C., Özçep, Ö. L., and Wandelt, S. (2013). Advances in accessing big data with expressive ontologies. In Annual Conference on Artificial Intelligence, pages 118–129. Springer.

Mutharaju, R. (2016). Distributed rule-based ontology reasoning. PhD thesis, Wright State University.

Mutharaju, R., Maier, F., and Hitzler, P. (2010). A MapReduce algorithm for EL+. In 23rd International Workshop on Description Logics DL2010, volume 456.

Protege - Stanford University (1999). http://protege.stanford.edu.

Reinsel, D., Gantz, J., and Rydning, J. (2017). Data age 2025.

Sivaprakasam, G. K. and Slutzki, G. (2016). Secrecy-preserving query answering in ELH knowledge bases. In Proceedings of the 8th International Conference on Agents and Artificial Intelligence (ICAART 2016), Volume 2, Rome, Italy, February 24-26, pages 149–159.

Sivaprakasam, G. K. and Slutzki, G. (2017). Keeping secrets in modalized DL knowledge bases. In Proceedings of the 9th International Conference on Agents and Artificial Intelligence, ICAART 2017, Volume 2, Porto, Portugal, February 24-26, pages 591–598.

Tao, J., Slutzki, G., and Honavar, V. (2010). Secrecy-preserving query answering for instance checking in EL. In International Conference on Web Reasoning and Rule Systems, pages 195–203. Springer.

Tao, J., Slutzki, G., and Honavar, V. (2015). A conceptual framework for secrecy-preserving reasoning in knowledge bases. ACM Transactions on Computational Logic (TOCL), 16(1):3.

Urbani, J., Kotoulas, S., Maassen, J., Van Harmelen, F., and Bal, H. (2010). OWL reasoning with WebPIE: calculating the closure of 100 billion triples. In Extended Semantic Web Conference, pages 213–227. Springer.

Zhou, Y., Cuenca Grau, B., Horrocks, I., Wu, Z., and Banerjee, J. (2013). Making the most of your triple store: query answering in OWL 2 using an RL reasoner. In Proceedings of the 22nd International Conference on World Wide Web, pages 1569–1580. ACM.
