A Structural Subsumption based Similarity Measure
for the Description Logic ALEH
Boontawee Suntisrivaraporn and Suwan Tongphu
School of Information, Computer and Communication Technology,
Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand
Keywords:
Similarity Measure, Description Logic, Non-standard Reasoning, Semantic Web.
Abstract:
Description Logics (DLs) are a family of logic-based knowledge representation formalisms that can be used to develop ontologies in a formally well-founded way. The standard reasoning service of subsumption has proved indispensable in ontology design and maintenance. It checks, relative to the logical definitions in the ontology, whether one concept is more general or more specific than another. When no subsumption relationship is identified, however, no further information about the two concepts is given. This work extends an existing similarity measure for ELH to the more expressive description logic ALEH. We introduce generalizations of the notions of normalization and homomorphism in ALEH, which are then employed at the heart of our semantic similarity measure. The proposed similarity measure computes a numerical degree of similarity between two ALEH concept descriptions even when they are not in the subsumption relation.
1 INTRODUCTION
Knowledge representation is a central topic in artificial intelligence. Among the various techniques for semantic-level analysis, one well-founded approach is to use Description Logics (DLs) (Baader et al., 2007). As the logical underpinning of the Web Ontology Language (OWL), which is recommended by the W3C, DLs have become a standard tool for formally and systematically modelling a knowledge base. Besides their unambiguous syntax and semantics, which are essential for ontology modelling and sharing, DLs provide several useful reasoning services that allow implicit knowledge to be inferred from what is explicitly defined. For example, with the service that identifies subclass-superclass relations (concept subsumption), two defined concepts that do not appear to be in a subsumption relation may be logically classified into the same hierarchy. Useful as it is, the classical DL reasoning service of concept subsumption merely produces a crisp response. It gives a positive answer if and only if all necessary and sufficient conditions for being in the subclass-superclass relation are fully satisfied. Otherwise, it suggests that the two concepts are unrelated to each other.
In some concrete situations, checking for a subsumption relation may not be adequate. Consider, for example, the case in which a newly discovered disease closely resembles a known one. Since the two diseases are similar, examining their common characteristics would likely provide a useful clue to the etiology of the new disease. It would then be easier to suggest an appropriate treatment for the new disease based on previously known diseases.
This work extends existing similarity measures for DLs in the EL family (Suntisrivaraporn, 2013; Tongphu and Suntisrivaraporn, 2014; Tongphu and Suntisrivaraporn, 2015) to the strictly more expressive DL ALEH. The method is based on the known homomorphism-based structural subsumption and produces a numerical degree of similarity between two ALEH concept descriptions even when they are not in the subsumption relation.
The rest of the paper is organized as follows. The background on the DL ALEH, unfoldable TBoxes, and the structural subsumption algorithm is presented in the next section. Sections 3 and 4 introduce the notions of homomorphism degree and ALEH semantic similarity measure, respectively, and illustrate the introduced measure by means of a small yet prototypical ontology. Section 5 suggests a possible extension of the similarity measure to the DL ALCH. Related work is discussed in Section 6, and the last section gives some concluding remarks.
Suntisrivaraporn, B. and Tongphu, S.
A Structural Subsumption based Similarity Measure for the Description Logic ALEH.
DOI: 10.5220/0005819302040212
In Proceedings of the 8th International Conference on Agents and Artificial Intelligence (ICAART 2016) - Volume 2, pages 204-212
ISBN: 978-989-758-172-4
Copyright © 2016 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
2 BACKGROUND
In DLs, concept descriptions are inductively defined with the help of a set of constructors, starting with a set $CN$ of concept names and a set $RN$ of role names. ALEH concept descriptions are formed using the constructors shown in the upper part of Table 1. An ALEH terminology or TBox is a finite set of concept definitions and role hierarchy axioms, whose syntactic forms are shown in the lower part of Table 1. A TBox is called unfoldable if it is definitorial (i.e. it contains at most one concept definition for each concept name) and acyclic (i.e. it contains no cyclic dependencies). Figure 1 depicts an example unfoldable ALEH TBox. The set $CN^{def}$ of defined concepts comprises those concept names that appear on the left-hand side of a concept definition. All other concept names are called primitive concepts, denoted by $CN^{pri}$. Since the DL ALEH allows for atomic negation, for convenience we denote by $CN^{label}$ the set of all primitive concepts, their negations, and the bottom concept, i.e. $CN^{label} = \{A, \neg A \mid A \in CN^{pri}\} \cup \{\bot\}$. Conventionally, $r, s$, possibly with subscripts, are used to range over $RN$; $A, B$ to range over $CN$; and $C, D$ to range over concept descriptions.
Primitive concept definitions are commonly found in realistic terminologies to define concepts for which only necessary conditions are known. For instance,
$$\mathsf{HappyMan} \sqsubseteq \mathsf{Man} \sqcap \mathsf{Rich} \sqcap \forall\mathsf{child}.\mathsf{Beautiful} \qquad (1)$$
Such a primitive definition $B \sqsubseteq D$ can easily be transformed into a semantically equivalent full definition $B \equiv X \sqcap D$, where $X$ is a fresh concept name.
Like other DLs, the semantics of ALEH is defined in terms of interpretations $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, where the domain $\Delta^{\mathcal{I}}$ is a non-empty set of individuals, and the interpretation function $\cdot^{\mathcal{I}}$ maps each concept name $A \in CN$ to a subset $A^{\mathcal{I}}$ of $\Delta^{\mathcal{I}}$ and each role name $r \in RN$ to a binary relation $r^{\mathcal{I}}$ on $\Delta^{\mathcal{I}}$. The extension of $\cdot^{\mathcal{I}}$ to arbitrary concept descriptions is inductively defined, as shown in the semantics column of Table 1. An interpretation $\mathcal{I}$ is a model of a TBox $\mathcal{O}$ if, for each concept definition in $\mathcal{O}$, the conditions given in the semantics column of Table 1 are satisfied. The main inference problem for ALEH is the subsumption problem:
Definition 1 (Concept Subsumption). Given two ALEH concept descriptions $C, D$ and an ALEH TBox $\mathcal{O}$, $C$ is subsumed by $D$ w.r.t. $\mathcal{O}$ (written $C \sqsubseteq_{\mathcal{O}} D$) if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ in every model $\mathcal{I}$ of $\mathcal{O}$. Moreover, $C, D$ are equivalent w.r.t. $\mathcal{O}$ (written $C \equiv_{\mathcal{O}} D$) if $C \sqsubseteq_{\mathcal{O}} D$ and $D \sqsubseteq_{\mathcal{O}} C$.
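$\omega_1$: $\mathsf{Woman} \equiv \mathsf{Female} \sqcap \mathsf{Person}$
$\omega_2$: $\mathsf{Man} \equiv \neg\mathsf{Female} \sqcap \mathsf{Person}$
$\omega_3$: $\mathsf{Parent} \equiv \mathsf{Person} \sqcap \exists\mathsf{child}.\mathsf{Person}$
$\omega_4$: $\mathsf{Mother} \equiv \mathsf{Woman} \sqcap \mathsf{Parent}$
$\omega_5$: $\mathsf{Father} \equiv \mathsf{Man} \sqcap \mathsf{Parent}$
$\omega_6$: $\mathsf{MotherNoSon} \equiv \mathsf{Mother} \sqcap \forall\mathsf{child}.\mathsf{Woman}$
$\omega_7$: $\mathsf{MotherNoDaughter} \equiv \mathsf{Mother} \sqcap \forall\mathsf{child}.\mathsf{Man}$
$\omega_8$: $\mathsf{FosterFather} \equiv \mathsf{Man} \sqcap \exists\mathsf{fchild}.\mathsf{Person}$
$\omega_9$: $\mathsf{NonFosterFather} \equiv \mathsf{Father} \sqcap \forall\mathsf{fchild}.\bot$
$\omega_{10}$: $\mathsf{fchild} \sqsubseteq \mathsf{child}$
Figure 1: An example ALEH terminology $\mathcal{O}_{family}$; here child and fchild are shorthands for hasChild and hasFosterChild, respectively.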
Provided that the TBox is unfoldable (i.e. acyclic and definitorial), any ALEH concept description can be expanded to an equivalent one that may use any role names but consists only of primitive concept names, their negations and the bottom concept from $CN^{label}$. This can be done by repeatedly replacing a defined concept by its definition until no more defined concepts appear in the concept description. Consider, for instance, the concept MotherNoSon along with its definition $\omega_6$ in Figure 1. By replacing the defined concepts Mother and Woman with their corresponding descriptions (see $\omega_4$ and $\omega_1$), the description can be expanded to:
$$\mathsf{Female} \sqcap \mathsf{Person} \sqcap \exists\mathsf{child}.\mathsf{Person} \sqcap \forall\mathsf{child}.(\mathsf{Female} \sqcap \mathsf{Person}) \qquad (2)$$
where $\mathsf{Person}, \mathsf{Female} \in CN^{label}$. We denote by $\hat{C}$ the expanded equivalent of the concept description $C$.
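The expansion step can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: a concept is encoded as a tuple of conjuncts, a conjunct is either a string (a concept name, `"~A"` for an atomic negation, `"BOT"` for the bottom concept) or a triple `(quantifier, role, concept)`, and the names `TBOX` and `unfold` are hypothetical. Negation is assumed to occur only in front of primitive names.

```python
# Hypothetical encoding of part of the TBox in Figure 1 (full definitions only).
TBOX = {
    "Woman": ("Female", "Person"),
    "Man": ("~Female", "Person"),
    "Parent": ("Person", ("some", "child", ("Person",))),
    "Mother": ("Woman", "Parent"),
    "MotherNoSon": ("Mother", ("all", "child", ("Woman",))),
}


def unfold(concept, tbox):
    """Replace defined names by their definitions until only primitives remain.

    Terminates because the TBox is assumed to be acyclic (unfoldable).
    """
    conjuncts = []
    for c in concept:
        if isinstance(c, tuple):            # quantified restriction: expand the filler
            quantifier, role, filler = c
            conjuncts.append((quantifier, role, unfold(filler, tbox)))
        elif c in tbox:                     # defined concept: splice in its definition
            conjuncts.extend(unfold(tbox[c], tbox))
        else:                               # primitive name, negation, or bottom
            conjuncts.append(c)
    unique = []
    for c in conjuncts:                     # drop repeated conjuncts, keep first occurrence
        if c not in unique:
            unique.append(c)
    return tuple(unique)


EXPANDED_MNS = unfold(("MotherNoSon",), TBOX)
```

Under this encoding, `EXPANDED_MNS` corresponds to the description in Equation 2.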
We can assume without loss of generality that an ALEH concept $C$ is expanded and has the following form:
$$C = \bigsqcap_{i=1}^{l} L_i \;\sqcap\; \bigsqcap_{j=1}^{m} \exists r_j.D_j \;\sqcap\; \bigsqcap_{k=1}^{n} \forall s_k.E_k$$
where $L_i \in CN^{label}$, $r_j, s_k \in RN$, and $D_j, E_k$ are ALEH concept descriptions in the same format as $C$. For simplicity, we set $\mathcal{P}_C := \{L_1, \ldots, L_l\}$, $\mathcal{E}_C := \{\exists r_1.D_1, \ldots, \exists r_m.D_m\}$, and $\mathcal{A}_C := \{\forall s_1.E_1, \ldots, \forall s_n.E_n\}$. Also, we denote by $\mathcal{R}_r$ and $\mathcal{R}^r$ the sets of all super-roles and of all sub-roles of $r$, respectively. That is, $\mathcal{R}_r = \{s \in RN \mid r \sqsubseteq^* s\}$ and $\mathcal{R}^r = \{t \in RN \mid t \sqsubseteq^* r\}$, where $\sqsubseteq^*$ represents the reflexive-transitive closure of $\sqsubseteq$ over role names.
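As a small illustration, the sets of super-roles and sub-roles can be computed by a reachability search over the role hierarchy axioms. The sketch below assumes that the axioms are given as `(sub, super)` pairs; the function names `super_roles` and `sub_roles` are hypothetical.

```python
def super_roles(role, axioms):
    """All s such that `role` is below s in the reflexive-transitive closure."""
    found, stack = {role}, [role]
    while stack:
        current = stack.pop()
        for sub, sup in axioms:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def sub_roles(role, axioms):
    """All t such that t is below `role`; obtained by reversing every axiom."""
    return super_roles(role, [(sup, sub) for sub, sup in axioms])


# The only role hierarchy axiom of Figure 1: fchild is a sub-role of child.
AXIOMS = [("fchild", "child")]
```

With these axioms, the super-roles of fchild are fchild and child, while child has both child and fchild as sub-roles.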
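Table 1: Syntax and semantics of the DL ALEH and DL ALCH.

| Name | Syntax | Semantics | ALEH | ALCH |
|---|---|---|---|---|
| bottom | $\bot$ | $\emptyset$ | X | X |
| top | $\top$ | $\Delta^{\mathcal{I}}$ | X | X |
| concept name | $A$ | $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ | X | X |
| atomic negation | $\neg A$ | $\Delta^{\mathcal{I}} \setminus A^{\mathcal{I}}$ | X | X |
| concept negation | $\neg C$ | $\Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$ | | X |
| concept conjunction | $C \sqcap D$ | $C^{\mathcal{I}} \cap D^{\mathcal{I}}$ | X | X |
| concept disjunction | $C \sqcup D$ | $C^{\mathcal{I}} \cup D^{\mathcal{I}}$ | | X |
| existential restriction | $\exists r.C$ | $\{x \mid \exists y \in \Delta^{\mathcal{I}} : (x, y) \in r^{\mathcal{I}} \wedge y \in C^{\mathcal{I}}\}$ | X | X |
| value restriction | $\forall r.C$ | $\{x \mid \forall y \in \Delta^{\mathcal{I}} : (x, y) \in r^{\mathcal{I}} \rightarrow y \in C^{\mathcal{I}}\}$ | X | X |
| primitive definition | $B \sqsubseteq D$ | $B^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ | X | X |
| full definition | $B \equiv D$ | $B^{\mathcal{I}} = D^{\mathcal{I}}$ | X | X |
| role hierarchy | $r \sqsubseteq s$ | $r^{\mathcal{I}} \subseteq s^{\mathcal{I}}$ | X | X |

However, since a normalized ALEH concept description makes implicit information explicit and yet preserves equivalence, we exhaustively apply the following normalization rules to the ALEH concept descriptions after expansion. The normalization rules below are modulo commutativity of conjunction:
$$\begin{array}{rcl}
\forall s.C \sqcap \forall r.D & \longrightarrow & \forall s.C \sqcap \forall r.(C \sqcap D) \\
\forall s.C \sqcap \exists r.D & \longrightarrow & \forall s.C \sqcap \exists r.(C \sqcap D) \\
\forall r.\top & \longrightarrow & \top \\
C \sqcap \top & \longrightarrow & C \\
A \sqcap \neg A & \longrightarrow & \bot \\
\exists r.\bot & \longrightarrow & \bot \\
C \sqcap \bot & \longrightarrow & \bot
\end{array}$$
where $s \in \mathcal{R}_r$. Note that the first two normalization rules generalize the corresponding ones in (Baader and Küsters, 2006) in that a role hierarchy is taken into consideration. In fact, for a super-role $s$ of $r$, it is the case that $\forall s.C$ implies $\forall r.C$.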
For example, let MotherNoSon be expanded to the form shown in Equation 2. Applying the above rules yields the following normalized concept description of MotherNoSon:
$$\mathsf{Female} \sqcap \mathsf{Person} \sqcap \exists\mathsf{child}.(\mathsf{Female} \sqcap \mathsf{Person}) \sqcap \forall\mathsf{child}.(\mathsf{Female} \sqcap \mathsf{Person})$$
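The effect of the normalization rules can be sketched as follows. This is not the authors' implementation but a compact reading of the rules, under the assumption that an expanded concept is encoded as a tuple of conjuncts (strings for names, `"~A"` for atomic negation, `"BOT"` for bottom, and triples `("some" | "all", role, concept)` for restrictions). The empty tuple stands for the top concept, and `sup` is an assumed map from each role to its set of super-roles, including the role itself.

```python
def normalize(concept, sup):
    """Normalize an expanded concept; `sup[r]` is the set of super-roles of r."""
    names = {c for c in concept if isinstance(c, str)}
    ex = [c for c in concept if isinstance(c, tuple) and c[0] == "some"]
    al = [c for c in concept if isinstance(c, tuple) and c[0] == "all"]

    def inherited(role):
        # Conjuncts of every filler C of a value restriction on s at this level,
        # where s is a super-role of `role` (this includes `role` itself).
        return tuple(x for _, s, filler in al if s in sup[role] for x in filler)

    # Value restrictions: one restriction per role, carrying all inherited fillers.
    new_al = {r: normalize(inherited(r), sup) for _, r, _ in al}
    # Existential restrictions: the filler additionally receives inherited fillers.
    new_ex = {("some", r, normalize(filler + inherited(r), sup)) for _, r, filler in ex}

    clash = "BOT" in names or any("~" + a in names for a in names)
    if clash or any(filler == ("BOT",) for _, _, filler in new_ex):
        return ("BOT",)                      # the whole conjunction collapses to bottom

    # A value restriction whose filler is top (the empty tuple) is dropped.
    universals = [("all", r, f) for r, f in sorted(new_al.items()) if f != ()]
    return tuple(sorted(names)) + tuple(sorted(new_ex, key=repr)) + tuple(universals)


SUP = {"child": {"child"}, "fchild": {"fchild", "child"}}
MNS_EXPANDED = (
    "Female",
    "Person",
    ("some", "child", ("Person",)),
    ("all", "child", ("Female", "Person")),
)
MNS_NORMALIZED = normalize(MNS_EXPANDED, SUP)
```

For the expanded MotherNoSon description, the value restriction on child is pushed into the existential restriction on child, which reproduces the normalized description given above.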
In (Baader and Küsters, 2000; Baader, 2003), a characterization of subsumption in ALEH w.r.t. an unfoldable TBox using homomorphisms has been proposed. Instead of considering concept descriptions directly, the characterization considers so-called ALEH description trees that structurally correspond to the ALEH concept descriptions. Given an expanded concept description $C$, beginning from the top level, such a description can recursively be translated into an ALEH description tree $\mathcal{G}_C := (V, E, v_0, \ell, \rho)$, where $V$ is a set of nodes, $E \subseteq V \times V$ is a set of edges, $v_0 \in V$ is the root, $\ell : V \to 2^{CN^{label}}$ is a node labelling function, and $\rho : E \to 2^{RN}$ is an edge labelling function. The translation can be done using the following steps:
i. Assign $\mathcal{P}_C$ to $\ell(v_0)$.
ii. For each $\exists r.X \in \mathcal{E}_C$, introduce a new node $w$ to $V$, add an edge $(v_0, w)$ to $E$, and assign $\mathcal{R}_r$ to $\rho(v_0, w)$. Repeat from step (i) by treating $w$ as $v_0$ and $X$ as $C$.
iii. For each $\forall r.Y \in \mathcal{A}_C$, introduce a new node $w'$ to $V$, add an edge $(v_0, w')$ to $E$, and assign $\mathcal{R}^r$ to $\rho(v_0, w')$. Repeat from step (i) by treating $w'$ as $v_0$ and $Y$ as $C$.
In essence, the root $v_0$ of the ALEH description tree $\mathcal{G}_C$ has $\mathcal{P}_C$ as its label; has $m$ existential edges, each labelled with $\mathcal{R}_{r_j}$ and leading to a vertex $w_j$; and has $n$ universal edges, each labelled with $\mathcal{R}^{s_k}$ and leading to a vertex $w'_k$, for $1 \le j \le m$ and $1 \le k \le n$. Each of the child nodes $w_j$ and $w'_k$ is the root of a similar tree structure which forms a subtree of $\mathcal{G}_C$.
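The translation into a description tree can be sketched recursively. The code below is an illustrative assumption rather than the authors' data structure: concepts are tuples of conjuncts (strings for labels, triples `("some" | "all", role, concept)` for restrictions), `sup` and `sub` map each role to its set of super-roles and sub-roles, and the class `Node` and the function `build_tree` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: frozenset                                  # node label, a subset of CN^label
    ex_edges: list = field(default_factory=list)       # (set of role names, child node)
    all_edges: list = field(default_factory=list)      # (set of role names, child node)


def build_tree(concept, sup, sub):
    """Translate a normalized concept into a description tree rooted at the result."""
    node = Node(frozenset(c for c in concept if isinstance(c, str)))
    for c in concept:
        if isinstance(c, tuple):
            quantifier, role, filler = c
            child = build_tree(filler, sup, sub)       # the filler becomes a subtree
            if quantifier == "some":
                node.ex_edges.append((frozenset(sup[role]), child))
            else:
                node.all_edges.append((frozenset(sub[role]), child))
    return node


SUP = {"child": {"child"}, "fchild": {"fchild", "child"}}
SUB = {"child": {"child", "fchild"}, "fchild": {"fchild"}}
MNS = (
    "Female",
    "Person",
    ("some", "child", ("Female", "Person")),
    ("all", "child", ("Female", "Person")),
)
root = build_tree(MNS, SUP, SUB)
```

For the normalized MotherNoSon description, the root carries the label {Female, Person}, its existential edge is labelled {child}, and its universal edge is labelled {fchild, child}.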
Definition 2 (Homomorphism). A homomorphism from an ALEH description tree $\mathcal{G} = (V, E, v_0, \ell, \rho)$ into an ALEH description tree $\mathcal{G}' = (V', E', v'_0, \ell', \rho')$ is a mapping $h : V \to V'$ such that:
i. $h(v_0) = v'_0$,
ii. $\ell(v) \subseteq \ell'(h(v))$ for all $v \in V$,
iii. for each existential edge $(v, w) \in E$ with $\rho(v, w) = \mathcal{R}_r$, there exists $(h(v), h(w)) \in E'$ such that $\rho'(h(v), h(w)) = \mathcal{R}_s$ and $\mathcal{R}_r \subseteq \mathcal{R}_s$, and
iv. for each universal edge $(v, w) \in E$ with $\rho(v, w) = \mathcal{R}^r$, there exists $(h(v), h(w)) \in E'$ such that $\rho'(h(v), h(w)) = \mathcal{R}^t$ and either of the following holds:
a. $\mathcal{R}^r \subseteq \mathcal{R}^t$, or
b. $h(v) = h(w)$ and $\ell'(h(v)) = \{\bot\}$.
Observe that this generalizes the notion of homomorphism first introduced in (Baader and Küsters, 2006) by allowing each edge label to be a set of role names instead of a single role name. Moreover, it simplifies the condition for existential edge mapping by omitting $\bot$, since any existential successor with $\bot$ as its label must have been collapsed by the normalization.
Subsumption is then characterized by the existence of a homomorphism in the reverse direction.
Theorem 1 ((Baader and Küsters, 2006)). Let $C, D$ be ALEH concept descriptions, and $\mathcal{G}_C, \mathcal{G}_D$ the corresponding ALEH description trees. Then, $C \sqsubseteq D$ iff there exists a homomorphism $h : \mathcal{G}_D \to \mathcal{G}_C$ which maps the root of $\mathcal{G}_D$ to the root of $\mathcal{G}_C$.
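Consider the normalized description for MotherNoSon as previously mentioned and the following normalized descriptions for Mother and NonFosterFather:
$$\mathsf{Female} \sqcap \mathsf{Person} \sqcap \exists\mathsf{child}.\mathsf{Person} \qquad (3)$$
$$\neg\mathsf{Female} \sqcap \mathsf{Person} \sqcap \exists\mathsf{child}.\mathsf{Person} \sqcap \forall\mathsf{fchild}.\bot \qquad (4)$$
Figure 2 depicts the ALEH description trees $\mathcal{G}_{NonFosterFather}$ (left), $\mathcal{G}_{Mother}$ (center), and $\mathcal{G}_{MotherNoSon}$ (right). It is important to note here that $\mathcal{R}^{child} = \{\mathsf{fchild}, \mathsf{child}\}$ and $\mathcal{R}^{fchild} = \{\mathsf{fchild}\}$ since $\omega_{10}$ is the only role hierarchy axiom in the ontology. This figure shows a homomorphism $h$ (dashed arrows) that maps the root $u_0$ of $\mathcal{G}_{Mother}$ to the root $v_0$ of $\mathcal{G}_{MotherNoSon}$. It also demonstrates a failed attempt to map (see the dotted arrow) the root of $\mathcal{G}_{Mother}$ to the root of $\mathcal{G}_{NonFosterFather}$. Theorem 1 ensures that $\mathsf{MotherNoSon} \sqsubseteq_{\mathcal{O}} \mathsf{Mother}$ and $\mathsf{NonFosterFather} \not\sqsubseteq_{\mathcal{O}} \mathsf{Mother}$.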
Although MotherNoSon and NonFosterFather share some common features (i.e. both are Person), classical subsumption reasoning cannot tell how similar the two descriptions are. This motivates the introduction of a concept similarity measure based on the structural characterization. Instead of merely giving a positive or negative answer for a pair of descriptions, the proposed measure calculates a numerical value between 0 and 1. Intuitively, the closer the value is to 1, the more similar the two concepts are.
3 HOMOMORPHISM DEGREE IN
ALEH
As suggested by Theorem 1, the existence of a homomorphism from one ALEH description tree to another implies a subsumption relationship in the reverse direction. We extend the idea to the case where a homomorphism between the two ALEH description trees does not exist but there is a shared structure. Let $C, D$ be ALEH concept descriptions, and $\mathcal{G}_C$ and $\mathcal{G}_D$ be the corresponding ALEH description trees. Also, let $\mathcal{P}_C, \mathcal{P}_D, \mathcal{E}_C, \mathcal{E}_D, \mathcal{A}_C$, and $\mathcal{A}_D$ be as defined in the previous section. We define the homomorphism degree from $\mathcal{G}_D$ to $\mathcal{G}_C$ as follows:
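Definition 3 (Homomorphism Degree). Let $\mathbf{G}_{ALEH}$ be the set of all ALEH description trees. The homomorphism degree function $hd : \mathbf{G}_{ALEH} \times \mathbf{G}_{ALEH} \to [0, 1]$ is defined as follows:
$$hd(\mathcal{G}_D, \mathcal{G}_C) := (1 - \mu_e - \mu_a) \cdot \text{p-hd}(\mathcal{P}_D, \mathcal{P}_C) + \mu_e \cdot \text{e-set-hd}(\mathcal{E}_D, \mathcal{E}_C) + \mu_a \cdot \text{a-set-hd}(\mathcal{A}_D, \mathcal{A}_C) \qquad (5)$$
where $|\cdot|$ represents the set cardinality, $\mu_e = \frac{|\mathcal{E}_D|}{|\mathcal{P}_D \cup \mathcal{E}_D \cup \mathcal{A}_D|}$, and $\mu_a = \frac{|\mathcal{A}_D|}{|\mathcal{P}_D \cup \mathcal{E}_D \cup \mathcal{A}_D|}$;
$$\text{p-hd}(\mathcal{P}_D, \mathcal{P}_C) := \begin{cases} 1 & \text{if } \mathcal{P}_D = \emptyset \text{ or } \mathcal{P}_C = \{\bot\} \\ \frac{|\mathcal{P}_D \cap \mathcal{P}_C|}{|\mathcal{P}_D|} & \text{otherwise,} \end{cases} \qquad (6)$$
$$\text{e-set-hd}(\mathcal{E}_D, \mathcal{E}_C) := \begin{cases} 1 & \text{if } \mathcal{E}_D = \emptyset \\ 0 & \text{if } \mathcal{E}_D \neq \emptyset,\ \mathcal{E}_C = \emptyset \\ \frac{\sum_{\epsilon_i \in \mathcal{E}_D} \max\{\text{e-hd}(\epsilon_i, \epsilon_j) \,:\, \epsilon_j \in \mathcal{E}_C\}}{|\mathcal{E}_D|} & \text{otherwise,} \end{cases} \qquad (7)$$
where $\epsilon_i, \epsilon_j$ are existential restrictions;
$$\text{e-hd}(\exists r.X, \exists s.Y) := \gamma_e \left(\nu_e(r) + (1 - \nu_e(r)) \cdot hd(\mathcal{G}_X, \mathcal{G}_Y)\right) \qquad (8)$$
where $\gamma_e = \frac{|\mathcal{R}_r \cap \mathcal{R}_s|}{|\mathcal{R}_r|}$ and $\nu_e : RN \to [0, 1)$;
$$\text{a-set-hd}(\mathcal{A}_D, \mathcal{A}_C) := \begin{cases} 1 & \text{if } \mathcal{A}_D = \emptyset, \\ 0 & \text{if } \mathcal{A}_D \neq \emptyset,\ \mathcal{A}_C = \emptyset, \\ \frac{\sum_{\alpha_i \in \mathcal{A}_D} \max\{\text{a-hd}(\alpha_i, \alpha_j) \,:\, \alpha_j \in \mathcal{A}_C\}}{|\mathcal{A}_D|} & \text{otherwise} \end{cases} \qquad (9)$$
where $\alpha_i, \alpha_j$ are universal restrictions; and finally
$$\text{a-hd}(\forall r.X, \forall s.Y) := \begin{cases} \gamma_a & \text{if } \mathcal{P}_Y = \{\bot\}, \\ \gamma_a \left(\nu_a(r) + (1 - \nu_a(r)) \cdot hd(\mathcal{G}_X, \mathcal{G}_Y)\right) & \text{otherwise} \end{cases} \qquad (10)$$
where $\gamma_a = \frac{|\mathcal{R}^r \cap \mathcal{R}^s|}{|\mathcal{R}^r|}$ and $\nu_a : RN \to [0, 1)$.

[Figure 2 shows three description trees. $\mathcal{G}_{NonFosterFather}$ (left): root $w_0 : \{\neg\mathsf{Female}, \mathsf{Person}\}$ with an edge $\exists\{\mathsf{child}\}$ to $w_1 : \{\mathsf{Person}\}$ and an edge $\forall\{\mathsf{fchild}\}$ to $w_2 : \{\bot\}$. $\mathcal{G}_{Mother}$ (center): root $u_0 : \{\mathsf{Female}, \mathsf{Person}\}$ with an edge $\exists\{\mathsf{child}\}$ to $u_1 : \{\mathsf{Person}\}$. $\mathcal{G}_{MotherNoSon}$ (right): root $v_0 : \{\mathsf{Female}, \mathsf{Person}\}$ with an edge $\exists\{\mathsf{child}\}$ to $v_1 : \{\mathsf{Female}, \mathsf{Person}\}$ and an edge $\forall\{\mathsf{fchild}, \mathsf{child}\}$ to $v_2 : \{\mathsf{Female}, \mathsf{Person}\}$.]
Figure 2: A homomorphism $h$ (dashed arrows) that maps the root of $\mathcal{G}_{Mother}$ to the root of $\mathcal{G}_{MotherNoSon}$; a failed attempt to identify a homomorphism (dotted arrows) that maps the root of $\mathcal{G}_{Mother}$ to the root of $\mathcal{G}_{NonFosterFather}$.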
Note that since $\exists r.\bot$ can never occur in any normalized ALEH concept description, we need not treat this case in Equation 7 (cf. the definition of homomorphism in the previous section and in (Baader and Küsters, 2006)). Intuitively, the homomorphism degree (hd) of two given ALEH description trees is computed from the degree of common node label inclusion and the degree of common outgoing edges. Formula 6 calculates the proportion of the matched node labels among all those available at the top level. Formulas 7 and 9 compute the degrees of edge matching for existential restrictions and universal restrictions, respectively. If there is a shared edge label, then there is some degree of similarity; but the successors' labels and structures have yet to be checked. This is done recursively by calling the function $hd(\mathcal{G}_X, \mathcal{G}_Y)$.
The parameter $\mu_e$ (resp., $\mu_a$) defined in Formula 5 indicates how much weight the existentially quantified (resp., universally quantified) subconcepts carry in the similarity measure. The functions $\nu_e$ and $\nu_a$ indicate the importance of the role name in an existential restriction and a universal restriction, respectively. They are similar to the parameter in (Suntisrivaraporn, 2013) except that they are defined as functions on role names. This means that the importance of different role names, and thus the discount of similarity between nested concepts, can be assigned unequally based on their use and the modelling discipline in a particular ontology. The value of $\gamma$ in Formulas 8 and 10 indicates the degree of inclusion between the two edge labels. The case $\gamma = 0$ means that there is no commonality between the two given roles, and hence further computation of the degrees for their corresponding nested pairs can be omitted.
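The homomorphism degree can be prototyped directly from Formulas 5 to 10. The following sketch is not the authors' implementation. It assumes normalized concepts encoded as tuples of conjuncts (strings for labels, with `"~A"` for atomic negation and `"BOT"` for bottom, and triples `("some" | "all", role, concept)` for restrictions), role maps `sup` and `sub` giving super-roles and sub-roles, and a single constant `nu` used for both weighting functions. The value returned for the top concept, where the weights are undefined, is an added assumption.

```python
def hd(d, c, sup, sub, nu=0.4):
    """Homomorphism degree from the description tree of d into that of c."""
    P_d = {x for x in d if isinstance(x, str)}
    P_c = {x for x in c if isinstance(x, str)}
    E_d = {x for x in d if isinstance(x, tuple) and x[0] == "some"}
    E_c = {x for x in c if isinstance(x, tuple) and x[0] == "some"}
    A_d = {x for x in d if isinstance(x, tuple) and x[0] == "all"}
    A_c = {x for x in c if isinstance(x, tuple) and x[0] == "all"}

    total = len(P_d) + len(E_d) + len(A_d)
    if total == 0:
        return 1.0  # assumption: d is the top concept, which subsumes everything

    def p_hd():  # Formula 6
        if not P_d or P_c == {"BOT"}:
            return 1.0
        return len(P_d & P_c) / len(P_d)

    def e_hd(x, y):  # Formula 8, edge labels are sets of super-roles
        (_, r, X), (_, s, Y) = x, y
        gamma = len(sup[r] & sup[s]) / len(sup[r])
        if gamma == 0:
            return 0.0  # no common role: skip the nested comparison
        return gamma * (nu + (1 - nu) * hd(X, Y, sup, sub, nu))

    def a_hd(x, y):  # Formula 10, edge labels are sets of sub-roles
        (_, r, X), (_, s, Y) = x, y
        gamma = len(sub[r] & sub[s]) / len(sub[r])
        if gamma == 0 or set(Y) == {"BOT"}:
            return gamma
        return gamma * (nu + (1 - nu) * hd(X, Y, sup, sub, nu))

    def set_hd(S_d, S_c, pair_hd):  # Formulas 7 and 9
        if not S_d:
            return 1.0
        if not S_c:
            return 0.0
        return sum(max(pair_hd(x, y) for y in S_c) for x in S_d) / len(S_d)

    mu_e, mu_a = len(E_d) / total, len(A_d) / total
    return ((1 - mu_e - mu_a) * p_hd()
            + mu_e * set_hd(E_d, E_c, e_hd)
            + mu_a * set_hd(A_d, A_c, a_hd))


SUP = {"child": {"child"}, "fchild": {"fchild", "child"}}
SUB = {"child": {"child", "fchild"}, "fchild": {"fchild"}}
MOTHER = ("Female", "Person", ("some", "child", ("Person",)))
NFF = ("~Female", "Person", ("some", "child", ("Person",)), ("all", "fchild", ("BOT",)))
MNS = ("Female", "Person", ("some", "child", ("Female", "Person")),
       ("all", "child", ("Female", "Person")))
```

With `nu` fixed to 0.4, `hd(MOTHER, NFF, SUP, SUB)` evaluates to 2/3 and `hd(NFF, MOTHER, SUP, SUB)` to 0.5, and `hd(MOTHER, MNS, SUP, SUB)` is 1, in line with the subsumption between MotherNoSon and Mother.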
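Example. To better understand how the algorithm works, consider the description tree $\mathcal{G}_{Mother}$ for the unfolding of Mother and the description tree $\mathcal{G}_{NonFosterFather}$ for the unfolding of NonFosterFather as shown in Figure 2. Using $\mu_e$ and $\mu_a$ as previously described and fixing $\nu_e(r)$ and $\nu_a(r)$ to 0.4 for every role name $r \in RN$, the degree of homomorphism from $\mathcal{G}_{Mother}$ to $\mathcal{G}_{NonFosterFather}$ can be computed in the following steps (the abbreviations M and NFF are used for the sake of simplicity):
$$\begin{aligned}
hd(\mathcal{G}_M, \mathcal{G}_{NFF}) &= \tfrac{2}{3}\,\text{p-hd}(\mathcal{P}_M, \mathcal{P}_{NFF}) + \tfrac{1}{3}\,\text{e-set-hd}(\mathcal{E}_M, \mathcal{E}_{NFF}) + 0 \cdot \text{a-set-hd}(\mathcal{A}_M, \mathcal{A}_{NFF}) \\
&= \tfrac{2}{3}\left[\tfrac{1}{2}\right] + \tfrac{1}{3}\,\text{e-hd}(\epsilon_i, \epsilon_j) \\
&= \tfrac{2}{3}\left[\tfrac{1}{2}\right] + \tfrac{1}{3}\left[\tfrac{1}{1}\right]\left[\tfrac{2}{5} + \tfrac{3}{5}\,hd(\mathcal{G}_{Person}, \mathcal{G}_{Person})\right] \\
&= \tfrac{2}{3}\left[\tfrac{1}{2}\right] + \tfrac{1}{3}\left[\tfrac{2}{5} + \tfrac{3}{5}\left[\tfrac{1}{1}\right]\right] \\
&= \tfrac{2}{6} + \tfrac{1}{3} \approx 0.67
\end{aligned}$$
with $\mu_e = \tfrac{1}{3}$, $\mu_a = 0$, $\epsilon_i = \exists\mathsf{child}.\mathsf{Person}$ and $\epsilon_j = \exists\mathsf{child}.\mathsf{Person}$.
The reverse direction can be computed as follows:
$$\begin{aligned}
hd(\mathcal{G}_{NFF}, \mathcal{G}_M) &= \tfrac{2}{4}\,\text{p-hd}(\mathcal{P}_{NFF}, \mathcal{P}_M) + \tfrac{1}{4}\,\text{e-set-hd}(\mathcal{E}_{NFF}, \mathcal{E}_M) + \tfrac{1}{4}\,\text{a-set-hd}(\mathcal{A}_{NFF}, \mathcal{A}_M) \\
&= \tfrac{2}{4}\left[\tfrac{1}{2}\right] + \tfrac{1}{4}\,\text{e-hd}(\epsilon_i, \epsilon_j) + \tfrac{1}{4}\,\text{a-set-hd}(\{\alpha_i\}, \emptyset) \\
&= \tfrac{2}{4}\left[\tfrac{1}{2}\right] + \tfrac{1}{4}\left[\tfrac{1}{1}\right]\left[\tfrac{2}{5} + \tfrac{3}{5}\,hd(\mathcal{G}_{Person}, \mathcal{G}_{Person})\right] + \tfrac{1}{4}[0] \\
&= \tfrac{1}{4} + \tfrac{1}{4} = 0.50
\end{aligned}$$
with $\mu_e = \tfrac{1}{4}$, $\mu_a = \tfrac{1}{4}$, $\epsilon_i = \exists\mathsf{child}.\mathsf{Person}$, $\epsilon_j = \exists\mathsf{child}.\mathsf{Person}$, $\alpha_i = \forall\mathsf{fchild}.\bot$, and $\mathcal{A}_M = \emptyset$.
Hence, the degree of having a homomorphism from $\mathcal{G}_{Mother}$ to $\mathcal{G}_{NonFosterFather}$ is 0.67, and that for the opposite direction is 0.50. The hd values for other pairs can be obtained in an analogous manner and are shown in Table 2.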
Table 2: Homomorphism degrees to and from the defined concepts in $\mathcal{O}_{family}$; the entry in row X and column Y is $hd(\mathcal{G}_Y, \mathcal{G}_X)$.

| row X \ column Y | Woman | Man | Parent | Mother | Father | MNS | MND | FF | NFF |
|---|---|---|---|---|---|---|---|---|---|
| Woman | 1.00 | 0.50 | 0.50 | 0.67 | 0.33 | 0.50 | 0.50 | 0.33 | 0.25 |
| Man | 0.50 | 1.00 | 0.50 | 0.33 | 0.67 | 0.25 | 0.25 | 0.67 | 0.50 |
| Parent | 0.50 | 0.50 | 1.00 | 0.67 | 0.67 | 0.43 | 0.43 | 0.50 | 0.50 |
| Mother | 1.00 | 0.50 | 1.00 | 1.00 | 0.67 | 0.68 | 0.68 | 0.50 | 0.50 |
| Father | 0.50 | 1.00 | 1.00 | 0.67 | 1.00 | 0.43 | 0.43 | 0.83 | 0.75 |
| MotherNoSon (MNS) | 1.00 | 0.50 | 1.00 | 1.00 | 0.67 | 1.00 | 0.85 | 0.50 | 0.60 |
| MotherNoDaughter (MND) | 1.00 | 0.50 | 1.00 | 1.00 | 0.67 | 0.85 | 1.00 | 0.50 | 0.60 |
| FosterFather (FF) | 0.50 | 1.00 | 1.00 | 0.67 | 1.00 | 0.43 | 0.43 | 1.00 | 0.75 |
| NonFosterFather (NFF) | 0.50 | 1.00 | 1.00 | 0.67 | 1.00 | 0.55 | 0.55 | 0.83 | 1.00 |

Using a proof by induction, together with Theorem 1 (Baader and Küsters, 2000; Baader, 2003), it is not difficult to obtain the correspondence between the homomorphism degree and subsumption.
Proposition 2. Let $C, D$ be expanded and normalized ALEH concept descriptions, and $\mathcal{G}_C, \mathcal{G}_D$ be their corresponding description trees, respectively. Then, the following are equivalent:
1. $C \sqsubseteq D$,
2. $hd(\mathcal{G}_D, \mathcal{G}_C) = 1$.
In fact, the closer the value of $hd(\mathcal{G}_D, \mathcal{G}_C)$ is to 1, the closer the corresponding subsumption is to holding. More precisely, a larger share of the label and edge constraints in $\mathcal{G}_D$ can be simulated by those in $\mathcal{G}_C$.
4 ALEH SEMANTIC
SIMILARITY
The homomorphism degree function introduced in Section 3 returns a degree that represents the similarity of one concept description compared to another concept description. As shown in the computation example, the direction of the homomorphism degree matters, viz., $hd(\mathcal{G}_M, \mathcal{G}_{NFF}) = 0.67$, whereas $hd(\mathcal{G}_{NFF}, \mathcal{G}_M) = 0.50$. Since both directions contribute to the degree to which the two concepts are equivalent, our similarity measure for ALEH concept descriptions is defined by means of both values.
Definition 4 (ALEH Concept Similarity). Let $C, D$ be expanded ALEH concept descriptions. The degree of similarity between $C$ and $D$ is defined as:
$$sim_{ALEH}(C, D) := \frac{hd(\mathcal{G}_C, \mathcal{G}_D) + hd(\mathcal{G}_D, \mathcal{G}_C)}{2} \qquad (11)$$
Intuitively, the degree of similarity between two concepts is the average of the degrees of having homomorphisms in both directions; thus $sim(C, D) = sim(D, C)$ as required. [1]
[1] Note that other functions apart from the average could be applied; for instance, root mean square and multiplication (Suntisrivaraporn, 2013).
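As a brief illustration, the combination of the two directional degrees can be written as a small function. The function name `sim` and the `mode` parameter are hypothetical; the alternatives to the average are only one possible reading of the root mean square and multiplication options mentioned in the footnote.

```python
import math


def sim(hd_cd, hd_dc, mode="average"):
    """Combine the two directional homomorphism degrees into one similarity value."""
    if mode == "average":
        return (hd_cd + hd_dc) / 2
    if mode == "rms":       # root mean square of the two degrees
        return math.sqrt((hd_cd ** 2 + hd_dc ** 2) / 2)
    if mode == "product":   # multiplication of the two degrees
        return hd_cd * hd_dc
    raise ValueError(f"unknown mode: {mode}")


# Mother vs. NonFosterFather, using the exact degrees 2/3 and 1/2 from Section 3.
mother_nff = sim(2 / 3, 1 / 2)
```

Here `mother_nff` equals 7/12, which rounds to 0.58. All three variants are symmetric in their two arguments.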
Based on the homomorphism degree values in Table 2, the degrees of similarity among the defined concepts in the example ontology $\mathcal{O}_{family}$ can be obtained; see Table 3. Note also that, though not included in Tables 2 and 3, the similarity involving primitive concepts like Female and Person can also be computed. Nevertheless, the pairwise similarity degree between any two distinct primitive concepts is zero by our definition, since there is no commonality between them apart from both being subsumed by $\top$.
The similarity measure $sim_{ALEH}$ generalizes sim for the DL ELH (Suntisrivaraporn, 2013; Tongphu and Suntisrivaraporn, 2015) in the sense that when the two given concept descriptions are restricted to ELH, both measures coincide.
Proposition 3. Let $C, D$ be two ELH concept descriptions. Then, $sim_{ALEH}(C, D) = sim(C, D)$.
This is the case since any ELH description tree is also an ALEH description tree that does not contain universal edges.
Table 3: Similarity degree between a pair of defined concepts in $\mathcal{O}_{family}$.

| sim | Woman | Man | Parent | Mother | Father | MNS | MND | FF | NFF |
|---|---|---|---|---|---|---|---|---|---|
| Woman | 1.00 | 0.50 | 0.50 | 0.83 | 0.42 | 0.75 | 0.75 | 0.42 | 0.38 |
| Man | | 1.00 | 0.50 | 0.42 | 0.83 | 0.38 | 0.38 | 0.83 | 0.75 |
| Parent | | | 1.00 | 0.83 | 0.83 | 0.71 | 0.71 | 0.75 | 0.75 |
| Mother | | | | 1.00 | 0.67 | 0.84 | 0.84 | 0.58 | 0.58 |
| Father | | | | | 1.00 | 0.55 | 0.55 | 0.92 | 0.88 |
| MotherNoSon (MNS) | | | | | | 1.00 | 0.85 | 0.46 | 0.58 |
| MotherNoDaughter (MND) | | | | | | | 1.00 | 0.46 | 0.58 |
| FosterFather (FF) | | | | | | | | 1.00 | 0.79 |
| NonFosterFather (NFF) | | | | | | | | | 1.00 |

5 APPROXIMATING ALCH
SEMANTIC SIMILARITY
The description logic ALCH can be considered an extension of ALEH that supports more concept constructors, namely disjunction and full concept negation (see Table 1). Since ALEH is a sub-language of ALCH, in this section we show that the notion of ALEH similarity measure can be extended to a notion of ALCH similarity measure.
In Section 2 we reviewed the structural characterization of subsumption in ALEH through a homomorphism. Unfortunately, this characterization is not directly applicable to the more expressive language ALCH due to disjunction. Fortunately, one can approximate an ALCH-concept description in the less expressive DL ALEH. Once the approximation is calculated, the similarity measure introduced in this paper can be used to obtain an approximate similarity between two concept descriptions written in ALCH.
Definition 5 (Approximation). (Baader and Küsters, 2006) Let $C$ be an ALCH-concept description. An ALEH-concept description $D$ is an ALEH-approximation of $C$, written ALEH-approx($C$), iff
i. $C \sqsubseteq D$ and
ii. $D \sqsubseteq E$ for every ALEH-concept description $E$ with $C \sqsubseteq E$.
Intuitively, an approximation is the most specific concept in ALEH that subsumes the given ALCH concept. One can approximate an ALCH concept by finding commonalities among sub-concepts in a disjunction, also known as the least common subsumer (lcs) problem (Turhan, 2007; Baader et al., 1998).
We define the notion of similarity measure between two ALCH concept descriptions as follows:
Definition 6 (ALCH Concept Similarity). Let $X, Y$ be ALCH concept descriptions. The degree of similarity between $X$ and $Y$, in symbols $sim_{ALCH}(X, Y)$, is defined as:
$$sim_{ALCH}(X, Y) := sim_{ALEH}(\text{ALEH-approx}(X), \text{ALEH-approx}(Y))$$
An analogous idea can be employed to compute concept similarity in other DLs, possibly using another similarity measure. For instance, it is possible to approximate ELU-concept descriptions (EL extended with disjunction) and then compute similarity using the known measure for EL (Lehmann and Turhan, 2012; Suntisrivaraporn, 2013). It remains to be shown, however, whether this produces acceptable similarity results in practice.
6 RELATED WORKS
The subject of concept similarity has been widely studied. The techniques can be roughly classified into two main groups: a structure-based approach and an edit-distance-based approach.
In (Distel et al., 2014), the authors introduced a new framework for concept similarity measures. This framework is based on counting relaxation operations. Similarity is defined by means of the distance between concept descriptions $C$ and $D$, i.e. the number of times $D$ needs to be relaxed before it subsumes $C$. The method is claimed to satisfy several properties of concept similarity but has not yet been implemented.
A measure proposed by (Ge and Qiu, 2008) calculates a degree of similarity based on the depth of a concept defined at different levels of the ontological hierarchy. The method considers the distance relationship (subsumption relation) between concepts and assigns different weights to the role depth. The degree of similarity between two concepts is measured by means of a distance (a propagation of all label weights) to their least common subsumer. Similar approaches were proposed in (Ge and Qiu, 2003; Giunchiglia et al., 2007). Despite their usefulness in structural analysis, these methods rely fully on an ontology hierarchy and usually ignore constraints of concept definitions in the ontology.
A simple method for measuring similarity in the basic DL $\mathcal{L}_0$ (i.e. no use of roles), known as the Jaccard Index, was proposed in (Jaccard, 1901). An extension thereof to the DL ELH was proposed in (Lehmann and Turhan, 2012). The extended work suggested a new framework that satisfies several properties for similarity. While the framework is defined in general, the functions and operators needed for the computation are parameterized and thus left to be specified. Moreover, the framework does not contain implementation details.
The notion of homomorphism degree was originally introduced in (Suntisrivaraporn, 2013) and employed as the heart of the similarity measure for the DL EL. This has been extended to ELH and further studied in (Tongphu and Suntisrivaraporn, 2014; Tongphu and Suntisrivaraporn, 2015).
Racharak and Suntisrivaraporn suggested two new notions of similarity for the DL $\mathcal{FL}_0$ (Racharak and Suntisrivaraporn, 2015). Both the skeptical and credulous similarity measures are derived from the known structural characterization of subsumption through inclusion of regular languages.
The similarity measure presented in this paper is similar to those reported in (Tongphu and Suntisrivaraporn, 2014; Suntisrivaraporn, 2013). It focuses, however, on a strictly more expressive DL and employs generalizations of the normalization and characterization from (Baader and Küsters, 2006).
7 DISCUSSIONS AND FUTURE
WORKS
This paper presents a new notion of concept similarity for the DL ALEH w.r.t. an unfoldable terminology and suggests a way to approximate concept similarity for the more expressive ALCH. At the heart of the measure is the calculation of the degree of homomorphism in both directions between two description trees. To allow this, we first review and extend the known normalization and homomorphism so that they also take role hierarchy axioms into account. The proposed similarity measure can be regarded as an extension of the similarity measure sim for the EL family (Suntisrivaraporn, 2013; Tongphu and Suntisrivaraporn, 2015).
There are various directions for future work. One could evaluate the proposed measure on appropriate ontologies from real-world domains, in a setting similar to the experiments on SNOMED CT reported in (Tongphu and Suntisrivaraporn, 2015). Besides, more expressive ontologies that make use of universal quantification, such as GALEN, could be experimented upon. This can be expected to reveal hidden knowledge in the ontology that could not be discovered with a standard reasoner alone. Another useful application is measuring similarity between diseases, as proposed in (Mathur and Dinakarpandian, 2012). That work has shown the usefulness of measuring the similarity of the processes underlying each disease for more accurate prediction of unknown diseases.
Concerning the choice of representation language, an obvious direction for future work is to explore a non-approximate similarity measure for ALC by closely investigating the original tableau algorithm. Another direction could be to compare the measure presented in this paper to the two notions of similarity for $\mathcal{FL}_0$ introduced in (Racharak and Suntisrivaraporn, 2015). Since $\mathcal{FL}_0$ is a sub-logic of ALEH and, as such, $sim_{ALEH}$ is also applicable to $\mathcal{FL}_0$, it is interesting to explore whether $sim_{ALEH}$ is stronger (see (Racharak and Suntisrivaraporn, 2015)) than the skeptical and credulous similarity measures.
ACKNOWLEDGEMENTS
This research is partially supported by Thammasat
University Research Fund under the TU Research
Scholar, Contract No. TOR POR 1/13/2558; the Cen-
ter of Excellence in Intelligent Informatics, Speech
and Language Technology, and Service Innovation
(CILS), Thammasat University; and the National Re-
search University (NRU) project of Thailand Office
for Higher Education Commission.
REFERENCES
Baader, F. (2003). Terminological cycles in a descrip-
tion logic with existential restrictions. In Gottlob, G.
and Walsh, T., editors, Proceedings of the 18th Inter-
national Joint Conference on Artificial Intelligence,
pages 325–330. Morgan Kaufmann.
Baader, F., Calvanese, D., McGuinness, D., Nardi, D., and
Patel-Schneider, P., editors (2007). The Description
Logic Handbook: Theory, Implementation and Appli-
cations. Cambridge University Press, second edition.
Baader, F. and Küsters, R. (2000). Matching in description logics with existential restrictions. In A.G. Cohn, F. Giunchiglia, and B. Selman, editors, Proceedings of the Seventh International Conference on Knowledge Representation and Reasoning (KR2000), pages 261–272, San Francisco, CA. Morgan Kaufmann Publishers.
Baader, F. and Küsters, R. (2006). Nonstandard inferences in description logics: The story so far. In Gabbay, D., Goncharov, S., and Zakharyaschev, M., editors, Mathematical Problems from Applied Logic I, volume 4 of International Mathematical Series, pages 1–75. Springer-Verlag.
Baader, F., Küsters, R., and Molitor, R. (1998). Computing least common subsumers in Description Logics with existential restrictions. LTCS-Report LTCS-98-09, LuFG Theoretical Computer Science, RWTH Aachen, Germany. See http://www-lti.informatik.rwth-aachen.de/Forschung/Papers.html.
Distel, F., Atif, J., and Bloch, I. (2014). Concept dissimi-
larity with triangle inequality. In Proceedings of the
Fourteenth International Conference on Principles of
Knowledge Representation and Reasoning (KR’14),
Vienna, Austria. AAAI Press. Short Paper. To appear.
Ge, J. and Qiu, Y. (2003). Concept similarity match-
ing based on semantic distance. In Gottlob, G. and
Walsh, T., editors, Proceedings of the Forth Interna-
tional Conference on Semantics, Knowledge and Grid
(SKG 2008), pages 380–383. Morgan Kaufmann.
Ge, J. and Qiu, Y. (2008). Concept similarity matching
based on semantic distance. In SKG, pages 380–383.
IEEE Computer Society.
Giunchiglia, F., Yatskevich, M., and Shvaiko, P. (2007).
Semantic matching: Algorithms and implementation.
Journal of Data Semantics, 9:1–38.
Jaccard, P. (1901). Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Société Vaudoise des Sciences Naturelles, 37:547–579.
Lehmann, K. and Turhan, A.-Y. (2012). A framework
for semantic-based similarity measures for ELH -
concepts. In del Cerro, L. F., Herzig, A., and Mengin,
J., editors, JELIA, volume 7519 of Lecture Notes in
Computer Science, pages 307–319. Springer.
Mathur, S. and Dinakarpandian, D. (2012). Finding disease
similarity based on implicit semantic similarity. Jour-
nal of Biomedical Informatics, 45(2):363–371.
Racharak, T. and Suntisrivaraporn, B. (2015). Similarity measures for $\mathcal{FL}_0$ concept descriptions from an automata-theoretic point of view. In Information and Communication Technology for Embedded Systems (IC-ICTES), pages 1–6. IEEE Computer Society.
Suntisrivaraporn, B. (2013). A similarity measure for the
description logic EL with unfoldable terminologies.
In International Conference on Intelligent Networking
and Collaborative Systems (INCoS-13), pages 408–
413.
Tongphu, S. and Suntisrivaraporn, B. (2014). On desirable
properties of the structural subsumption-based simi-
larity measure. In Semantic Technology - 4th Joint In-
ternational Conference, JIST 2014, Chiang Mai, Thai-
land, November 9-11, 2014. Revised Selected Papers,
pages 19–32.
Tongphu, S. and Suntisrivaraporn, B. (2015). Algorithms for measuring similarity between ELH concept descriptions: A case study on SNOMED CT. Journal of Computing and Informatics. (Accepted in May 2015; To appear).
Turhan, A.-Y. (2007). On the Computation of Common Sub-
sumers in Description Logics. PhD thesis, TU Dres-
den, Institute for Theoretical Computer Science, Ger-
many.