A TWO-WAY APPROACH FOR PROBABILISTIC GRAPHICAL
MODELS STRUCTURE LEARNING AND ONTOLOGY
ENRICHMENT
Mouna Ben Ishak¹, Philippe Leray² and Nahla Ben Amor¹
¹ LARODEC Laboratory, ISG, University of Tunis, 2000 Tunis, Tunisia
² Knowledge and Decision Team, LINA Laboratory UMR 6241, Polytech'Nantes, Nantes, France
Keywords:
Ontologies, Probabilistic graphical models (PGMs), Machine learning, Ontology enrichment.
Abstract:
Ontologies and probabilistic graphical models are considered among the most efficient frameworks for knowledge representation. Ontologies are the key concept in semantic technology, whose use is increasingly prevalent in the computer science community. They provide a structured representation of knowledge characterized by its semantic richness. Probabilistic Graphical Models (PGMs) are powerful tools for representing and reasoning under uncertainty. Nevertheless, both suffer from their building phase. It is well known that learning the structure of a PGM and automatic ontology enrichment are very hard problems. Therefore, several algorithms have been proposed for learning the PGM structure from data, and several others have aimed to automate the process of ontology enrichment. However, there has not been real collaboration between these two research directions. In this work, we propose a two-way approach that allows PGMs and ontologies to cooperate. More precisely, we propose to harness the representation capabilities of ontologies in order to enrich the building process of PGMs. We are in particular interested in object oriented Bayesian networks (OOBNs), which are an extension of standard Bayesian networks (BNs) using the object paradigm. We first generate a prior OOBN by morphing an ontology related to the problem under study and then describe how the learning process carried out with the OOBN might be a potential solution to enrich the ontology used initially.
1 INTRODUCTION
Ontologies are the key concept in the semantic web; they allow logical reasoning about concepts linked by semantic relations within a knowledge domain. Probabilistic graphical models (PGMs), for their part, provide an efficient framework for knowledge representation and reasoning under uncertainty. Even though they represent two different paradigms, ontologies and PGMs share several similarities, which has led to research directions aiming to combine them. In this area, Bayesian networks (BNs) (Pearl, 1988) are the most commonly used. However, given the restricted expressiveness of BNs, the proposed methods focus on a restrained range of ontologies and neglect some of their components. To overcome this weakness, we propose to explore other PGMs, significantly more expressive than standard BNs, in order to address an extended range of ontologies.
We are in particular interested in object oriented Bayesian networks (OOBNs) (Bangsø and Wuillemin, 2000), which are an extension of standard BNs. In fact, OOBNs share several similarities with ontologies and are suitable for representing hierarchical systems, as they introduce several aspects of object oriented modeling, such as inheritance. Our idea is to identify the common points and similarities between these two paradigms in order to set up a set of mapping rules allowing us to generate a prior OOBN by morphing the ontology at hand, and then to use it as a starting point for the global OOBN learning algorithm. This latter will take advantage of both the semantic data derived from the ontology, which ensures a good start-up, and the observational data. We then capitalize on the final structure resulting from the learning process to carry out the ontology enrichment. In this way, our approach ensures a real cooperation, in both directions, between ontologies and OOBNs.
The remainder of this paper is organized as follows: in sections 2 and 3 we provide a brief presentation of our working tools. In section 4, we introduce our new approach. In section 5, we review the related work. The final section summarizes the conclusions reached and outlines directions for future research.
2 ONTOLOGIES
For the AI community, an ontology is an explicit specification of a conceptualization (Gruber, 1993). That is, an ontology describes a set of representational primitives with which to model a knowledge domain. Formally, we define an ontology O = ⟨Cp, R, I, A⟩ as follows:
Cp = {cp_1, . . . , cp_n} is the set of n concepts (classes), such that each cp_i has a set of k properties (attributes) P_i = {p_1, . . . , p_k}.
R is the set of binary relations among elements of Cp, which consists of two subsets:
H_R, which describes the inheritance relations among concepts.
S_R, which describes the semantic relations among concepts. That is, each relation cp_i s_R cp_j ∈ S_R has cp_i as its domain and cp_j as its range.
I is the set of instances, representing the knowledge base.
A is the set of axioms of the ontology. A consists of constraints on the domain of the ontology that involve Cp, R and I.
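As a minimal sketch of how such a tuple could be encoded (the class and attribute names below are ours, not from the paper), one might write:

```python
from dataclasses import dataclass, field

# Minimal encoding of O = <Cp, R, I, A>; all names here are illustrative.

@dataclass
class Concept:
    name: str
    properties: set[str] = field(default_factory=set)                 # P_i = {p_1, ..., p_k}

@dataclass
class Ontology:
    concepts: dict[str, Concept] = field(default_factory=dict)        # Cp
    inheritance: set[tuple[str, str]] = field(default_factory=set)    # H_R: (sub-concept, super-concept)
    semantic: set[tuple[str, str, str]] = field(default_factory=set)  # S_R: (domain, relation, range)
    instances: dict[str, str] = field(default_factory=dict)           # I: instance name -> concept name
    axioms: list[str] = field(default_factory=list)                   # A: constraints over Cp, R and I

# Toy example: two concepts linked by one semantic relation.
onto = Ontology()
onto.concepts["Person"] = Concept("Person", {"age", "smoker"})
onto.concepts["Disease"] = Concept("Disease", {"severity"})
onto.semantic.add(("Person", "suffersFrom", "Disease"))
```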
During the last few years, increasing attention has been focused on ontologies and ontological engineering. By ontological engineering we refer to the set of activities that concern the ontology life cycle, from its design and deployment to its maintenance, the latter becoming increasingly crucial to keep the ontology up to date with respect to possible changes. The ontology evolution process (Stojanovic et al., 2002) is defined as the timely adaptation of an ontology in response to a certain change in the domain or its conceptualization. Evolution can be of two types (Khattak et al., 2009):
Ontology Population Process. Consists of introducing new instances of the ontology concepts and relations.
Ontology Enrichment Process. Consists of adding (or removing) concepts, properties and/or relations in the ontology, or modifying already existing ones, because of changes required in the ontology definition itself, and then populating it with new instances. Ontology enrichment techniques are automatic processes which generate a set of possible modifications to the ontology and propose these suggestions to the ontology engineers.
This paper proposes to harness the learned PGM structure in order to improve the process of ontology enrichment. We are in particular interested in object oriented Bayesian networks. Before introducing our method, we give the basic notions of this framework.
3 OBJECT ORIENTED BAYESIAN
NETWORKS
Probabilistic graphical models (PGMs) provide an efficient framework for knowledge representation and reasoning under uncertainty. Object oriented Bayesian networks (OOBNs) (Bangsø and Wuillemin, 2000) (Koller and Pfeffer, 1997) are an extension of standard Bayesian networks (BNs) (Pearl, 1988) using the object paradigm. They are a convenient representation of knowledge containing repetitive structures, and are thus a suitable tool for representing relations which are not obvious to represent using standard BNs (e.g., examining a hereditary character of a person given those of his parents). An OOBN models the domain using fragments of a Bayesian network known as classes. Each class can be instantiated several times within the specification of another class. Formally, a class T is a DAG over three pairwise disjoint sets of nodes (I_T, H_T, O_T), such that for each instantiation t of T:
I_T is the set of input nodes. All input nodes are references to nodes defined in other classes (called referenced nodes).
H_T is the set of internal nodes, including instantiations of classes which do not contain instantiations of T.
O_T is the set of output nodes. They are nodes of the class usable outside its instantiations. An output node of an instantiation can be a reference node if it is used as an output node of the class containing it.
Internal nodes (except class instantiations) and output nodes (except those which are reference nodes) are considered as real nodes and represent variables. In an OOBN, nodes are linked using either directed links (i.e., links as in standard BNs) or reference links. The former are used to link reference or real nodes to real nodes; the latter are used to link reference or real nodes to reference nodes. Each node in the OOBN has its potential, i.e., a probability distribution over its states given its parents. To express the fact that two nodes (or instantiations) are linked in some manner, we can also use construction links, which only serve as a help to the specification.
Figure 1: An OOBN example. (a) The standard BN. (b) The class definition. (c) The OOBN representation.
When some classes in the OOBN are similar (i.e., they share some nodes and potentials), their specification can be simplified by creating a class hierarchy among them. Formally, a class S over (I_S, O_S, H_S) is a subclass of a class T over (I_T, O_T, H_T) if I_T ⊆ I_S, O_T ⊆ O_S and H_T ⊆ H_S.
In the extreme case where the OOBN consists of a single class having neither instantiations of other classes nor input and output nodes, it collapses to a standard BN.
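As a minimal sketch of this class structure (the names OOBNClass and is_subclass_of, and the example edge set, are ours and purely illustrative, since the figure itself is not reproduced here):

```python
from dataclasses import dataclass, field

# Illustrative encoding of an OOBN class T as a DAG over (I_T, H_T, O_T).

@dataclass
class OOBNClass:
    name: str
    inputs: set[str] = field(default_factory=set)    # I_T: references to nodes of other classes
    hidden: set[str] = field(default_factory=set)    # H_T: internal nodes and encapsulated instantiations
    outputs: set[str] = field(default_factory=set)   # O_T: nodes usable outside the instantiations
    edges: set[tuple[str, str]] = field(default_factory=set)  # directed/reference links inside the class

def is_subclass_of(s: OOBNClass, t: OOBNClass) -> bool:
    """S is a subclass of T if I_T is a subset of I_S, O_T of O_S and H_T of H_S."""
    return t.inputs <= s.inputs and t.outputs <= s.outputs and t.hidden <= s.hidden

# A class in the spirit of Figure 1(b): input X, hidden A and B, outputs C and D.
C = OOBNClass("C", inputs={"X"}, hidden={"A", "B"}, outputs={"C", "D"},
              edges={("X", "A"), ("A", "B"), ("A", "C"), ("B", "D")})
```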
Example 1. In Figure 1, assume that in the BN of 1(a) X_1 and X_2 have the same state space and that the conditional probability tables (CPTs) associated with all nodes labeled A_i, as well as the nodes labeled B_i, C_i and D_i, where i = {1, 2, 3}, are identical. Hence, we have three copies of a same structure. Thus, when modeling an OOBN, such a repetitive structure will be represented by a class 1(b), where the dashed node X is the input node (an artificial node having the same state space as X_1 and X_2), the shaded nodes C and D are output nodes, and A and B are the encapsulated nodes. Thus, we can represent the BN of Figure 1(a) using an OOBN model 1(c). The class C is instantiated three times and the nodes X_1, X_2, Y_1 and Y_2 are connected to the appropriate objects labeled I.1, I.2 and I.3.
4 A NEW APPROACH FOR
OOBN-ONTOLOGY
COOPERATION: 2OC
In this section, we present our two-way approach, which integrates the ontological knowledge into the OOBN learning process by morphing an ontology into a prior OOBN structure; the final structure derived from the learning process is then used to provide a set of possible extensions allowing the enrichment of the ontology used initially.
4.1 The Morphing Process
We associate ontology concepts with classes of the OOBN framework and concept properties with their sets of random variables (real nodes). Concepts that are connected by a subsumption relationship in the ontology are represented by a class hierarchy in the prior OOBN, and semantic relations, which we assume follow a causal orientation (Ben Ishak et al., 2011), are used to specify class interfaces and the organization of instantiations in the OOBN.
To carry out the morphing process, we assume that the ontology conceptual graph is a directed graph whose nodes are the concepts and whose edges are the relations (semantic and hierarchical ones). Our target is to accomplish the mapping of this structure into a prior OOBN structure while browsing each node once and only once. To this end, we adapt the generic Depth-First Search (DFS) algorithm for graph traversal. The idea behind Depth-First Search is to traverse a graph by exploring all the vertices reachable from a source vertex: if all the neighbors of a vertex have already been visited (in general, color markers are used to keep track), or if there are none, then the algorithm backtracks to the last vertex that had unvisited neighbors. Once all reachable vertices have been visited, the algorithm selects one of the remaining unvisited vertices and continues the traversal. It finishes when all vertices have been visited. The DFS traversal allows us to classify edges into four classes, which we use to determine the actions to apply to each encountered concept (a sketch of such a traversal is given after the list below):
Tree Edges: edges in the DFS search tree. They allow us to define actions on concepts encountered for the first time.
Back Edges: join a vertex to an ancestor already visited. They allow cycle detection; in our case these edges will never be encountered. As our edges respect a causal orientation, having a cycle of the form X_1 → X_2 → X_3 → X_1 would mean that X_1 is the cause of X_2, which is the cause of X_3, so this latter cannot be the cause of X_1 at the same instant t but rather at an instant t + ε. We restrict ourselves to ontologies that do not contain cycles, because such relationships invoke the dynamic aspect, which is not considered in this work.
Forward and Cross Edges: all other edges. They allow us to define actions on concepts that have already been visited, are reached again through another path, and thus have more than one parent.
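To illustrate the traversal only, here is a hedged Python sketch; the graph encoding, the callback names and the printed actions are ours, and they merely stand in for the concrete mapping rules detailed in (Ben Ishak et al., 2011):

```python
WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on the current DFS path / finished

def morph(graph: dict[str, list[str]]):
    """graph: concept -> list of successor concepts (semantic + hierarchical edges).
    Every concept must appear as a key."""
    color = {v: WHITE for v in graph}

    def on_tree_edge(u, v):         # concept v met for the first time
        print(f"tree edge {u}->{v}: create/instantiate a class for {v}")

    def on_cross_or_forward(u, v):  # concept v already visited via another path
        print(f"forward/cross edge {u}->{v}: {v} gets an additional parent")

    def visit(u):
        color[u] = GRAY
        for v in graph[u]:
            if color[v] == WHITE:
                on_tree_edge(u, v)
                visit(v)
            elif color[v] == GRAY:
                raise ValueError(f"back edge {u}->{v}: cycle, excluded by assumption")
            else:
                on_cross_or_forward(u, v)
        color[u] = BLACK

    for v in graph:                 # restart from unvisited vertices until all are covered
        if color[v] == WHITE:
            visit(v)

# Toy conceptual graph: cp1 -> cp2 -> cp4, cp1 -> cp3 -> cp4 (cp4 has two parents).
morph({"cp1": ["cp2", "cp3"], "cp2": ["cp4"], "cp3": ["cp4"], "cp4": []})
```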
A deep study of the similarities between ontologies and the OOBN framework, as well as a more detailed description of the morphing process, can be found in (Ben Ishak et al., 2011).
4.2 The Learning Process
Few works have been proposed in the literature to learn the structure of an OOBN (Bangsø et al., 2001) (Langseth and Nielsen, 2003). (Langseth and Nielsen, 2003) proposed the OO-SEM algorithm, which consists of two steps. First, they take advantage of the prior information available when learning in object oriented domains: an expert is asked for a partial specification of an OOBN by grouping nodes into instantiations and instantiations into classes. Given this prior model, the second step starts by learning the interfaces of the instantiations. Then, the structure inside each class is learned based on the candidate interfaces found previously. Thanks to the morphing process described above, we do not need expert elicitation, as our process allows us to take advantage of the semantic richness provided by ontologies to generate the prior OOBN structure. This latter will be used as a starting point for the second step, which is carried out as described in (Langseth and Nielsen, 2003).
4.3 The Change Detection Process
To obtain the final OOBN structure, we have gathered both the semantic richness of the ontology and observational data. Yet, in some cases, data may contradict the ontological knowledge, which leads us to distinguish two possible working assumptions:
A Total Confidence in the Ontology. Any contradiction encountered during the learning process is due to the data. The conflict must be managed while remaining consistent with the ontological knowledge. The enrichment process is restrained to the addition of new knowledge (relations, concepts, etc.) while preserving the already existing one; what was regarded as true remains true.
A Total Confidence in the Data. Any contradiction encountered during the learning process is due to the ontology. The conflict must be managed while remaining faithful to the data. In this case, changing the conceptualization of the ontology is allowed, not only by adding knowledge, but also by performing other possible changes, such as deleting, merging, etc. This means that the data will allow us to reach new truths that may call the former ones into question. In the following, we adopt this assumption.
As described above, the OO-SEM algorithm starts by learning the interface of each class and then learns the structure inside each class. We can benefit from these two steps in order to improve the ontology granularity. In fact, the first step allows us to detect the possible existence of new relations and/or concepts which might be added to the ontology at hand. The second step may affect the definition of the already existing concepts and/or relations.
4.3.1 Interface Learning vs Adding or Removing Relations and/or Concepts
Remove Relations. When we generated the OOBN, concepts related by a semantic relation were represented by instances of classes encapsulated in each other, and we expected the learning process to identify the variables that interact between the two instances. If no common interface is identified, then these two concepts should be independent, so their semantic relation has to be checked (see the sketch after Figure 2).
Figure 2: Enrichment process: an example of removing a relation (S_R is a semantic relation). Here we suppose that, after performing the learning process, we did not find nodes from C_cp1 that reference nodes from C_cp2. Thus, we can propose to delete the semantic relation which appears in the ontology.
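A hedged sketch of this check (the interface representation, a mapping from ordered pairs of class names to the nodes found to be shared, and the function name are ours):

```python
def propose_relation_removals(semantic_relations, learned_interfaces):
    """semantic_relations: set of (domain_concept, relation_name, range_concept) from S_R.
    learned_interfaces: maps a pair of class names to the set of nodes of the first
    class that the learning step found to be referenced by the second class.
    Returns the relations whose two concepts ended up with an empty interface."""
    suggestions = []
    for (cp_i, rel, cp_j) in semantic_relations:
        if not learned_interfaces.get((cp_i, cp_j), set()):
            suggestions.append((cp_i, rel, cp_j))   # no interaction found: flag for the expert
    return suggestions

# Toy usage mirroring Figure 2: cp1 and cp2 are related in the ontology,
# but the learning step found no shared interface between their classes.
print(propose_relation_removals({("cp1", "S_R", "cp2")}, {("cp1", "cp2"): set()}))
```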
Add Concepts/Relations. A class c_i communicates via its interface with a set S_c of classes. If S_c is a singleton, then we can simply propose to add a new relation between the concepts representing these classes in the underlying ontology. Otherwise, we can exploit this exchange between classes either by translating it into relations or into concepts allowing the factorization of some other ones already present in the ontology. If some classes in S_c share similar sets of nodes and have similar structures, then we can extract a super-concept over them. Otherwise, these exchanges may be translated into relations in the ontology (a sketch of this decision rule follows Figure 3).
Figure 3: Enrichment process: an example of adding concepts and relations (S_R1, S_R2 and S_R3 are semantic relations). Here we suppose that p_2.1 and p_4.2 have the same state space and that their respective classes C_cp2 and C_cp4 communicate with the same class C_cp1. Thus, on the ontological side, we can define a super-concept over cp_2 and cp_4 having p_super as a property which substitutes both p_2.1 and p_4.2. Note that C_cp3 also communicates with C_cp1, but as it does not share properties with the other classes, we simply add a new relation between cp_3 and cp_1 in the ontology.
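A minimal sketch of this decision rule; here the "similar sets of nodes" criterion is crudely approximated by identical interface signatures, and all names are illustrative:

```python
def propose_additions(class_name, interface_partners):
    """interface_partners: maps each class in S_c to the set of its nodes
    involved in the interface with `class_name`."""
    if len(interface_partners) == 1:
        partner = next(iter(interface_partners))
        return [("add_relation", class_name, partner)]

    suggestions = []
    # Group partners that expose the same interface signature: super-concept candidates.
    by_signature = {}
    for partner, nodes in interface_partners.items():
        by_signature.setdefault(frozenset(nodes), []).append(partner)
    for nodes, partners in by_signature.items():
        if len(partners) > 1:
            suggestions.append(("add_super_concept", tuple(partners), tuple(sorted(nodes))))
        else:
            suggestions.append(("add_relation", class_name, partners[0]))
    return suggestions

# Toy usage in the spirit of Figure 3: C_cp2 and C_cp4 expose the same signature, C_cp3 does not.
print(propose_additions("C_cp1", {"C_cp2": {"p"}, "C_cp4": {"p"}, "C_cp3": {"q"}}))
```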
4.3.2 Class Learning vs Concept Redefinition
As defined above, each class c in the OOBN is a DAG over its three sets of nodes I_c, H_c and O_c. Suppose that the result of the learning process is a disconnected graph (as in the example of Figure 4); this means that I_c, H_c and O_c are divided into multiple independent connected components (two in Figure 4). Thus, the nodes of each component are not really correlated with those of the other components. This can be translated on the ontological side by proposing to deconstruct the corresponding concept into more refined ones, where each new concept represents a component of the disconnected graph (see the sketch below).
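A sketch of the underlying check, computing the connected components of the undirected skeleton of the learned class (union-find; the names and the toy graph are ours):

```python
def connected_components(nodes, edges):
    """Components of the undirected skeleton of a class DAG.
    nodes: iterable of node names; edges: iterable of (u, v) directed links."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)           # union the two endpoints

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())

# Two independent components, as in the situation of Figure 4:
comps = connected_components({"a", "b", "c", "d"}, {("a", "b"), ("c", "d")})
if len(comps) > 1:
    print("propose splitting the concept into", len(comps), "refined concepts:", comps)
```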
The possible changes are then communicated to an expert via a warning system which detects the changes and allows the expert to state the actions to be taken. If he chooses to apply a change, he shall first name the discovered relations and/or concepts.
Figure 4: Enrichment process: an example of concept redefinition.
5 RELATED WORK
In recent years, a panoply of works has been proposed to combine PGMs and ontologies so that one can enrich the other.
One line of research aims to extend existing ontology languages, such as OWL¹, to be able to capture uncertainty in the knowledge domain. The proposed methods (Ding and Peng, 2004), (Yang and Calmet, 2005), (Costa and Laskey, 2006) use additional markups to represent probabilistic information attached to individual concepts and properties in OWL ontologies. Other works define transition actions in order to generate a PGM given an ontology, with the intention of extending ontology querying to handle uncertainty while keeping the ontology formalism intact (Bellandi and Turini, 2009).
On the other hand, some solutions propose the use of ontologies to help PGM construction. Some of them are designed for specific applications (Helsper and van der Gaag, 2002), (Zheng et al., 2008), while others give various solutions to handle this issue (Fenz et al., 2009), (Ben Messaoud et al., 2011).
However, all these solutions are limited to a restrained range of PGMs, usually BNs. So, they neglect some important aspects of ontologies, such as representing concepts having more than one property, non-taxonomic relations, etc. Moreover, the main idea behind these methods was to enhance ontology reasoning abilities to support uncertainty. However, we cannot provide a good basis for reasoning without a well defined ontology, one which takes into consideration the changes of its knowledge domain.
Thanks to the mapping process established between ontologies and OOBNs, our 2OC approach allows us to deal with ontologies without significant restrictions. We focused on concepts, their properties, and hierarchical as well as semantic relations, and we showed how these elements can be used to automatically generate a prior OOBN structure, which is then learned using observational data. This mixture of semantic data provided by the ontology and observational data presents a potential solution to discover new knowledge which is not yet expressed by the ontology. Thus, our idea was to translate the new relations discovered by the learning process into knowledge which may be useful to enrich the initial ontology.
¹ http://www.w3.org/TR/2004/REC-owl-features-20040210/
6 CONCLUSIONS AND FUTURE
WORKS
In this paper, we have proposed a new two-way approach, named OOBN-Ontology Cooperation (2OC), for OOBN structure learning and automatic ontology enrichment. The leading idea of our approach is to capitalize on analyzing the elements that are common to both tasks with the intention of improving their state-of-the-art methods. In fact, our work can be considered as an initiative aiming to set up new bridges between PGMs and ontologies. The originality of our method lies, first, in the use of the OOBN framework, which allows us to address an extended range of ontologies, and, second, in its bidirectional benefit, as it ensures a real cooperation, in both directions, between ontologies and OOBNs.
Nevertheless, the current version is subject to several improvements. As a first line of research, we aim to implement our method and test it on real world applications based on ontologies. Furthermore, in this work, our aim was to provide a warning system able to propose a set of possible changes to the ontology engineers; the discovered relations and/or concepts have to be named, so, as a possible research direction, we will be interested in natural language processing (NLP) methods to allow the automation of this process. Our last perspective concerns the use of another PGM, Probabilistic Relational Models (Getoor et al., 2007), whose characteristics are similar to those of OOBNs.
REFERENCES
Bangsø, O., Langseth, H., and Nielsen, T. D. (2001). Struc-
tural learning in object oriented domains. In Proceed-
ings of the 14th International Florida Artificial Intel-
ligence Research Society Conference, pages 340–344.
Bangsø, O. and Wuillemin, P.-H. (2000). Object oriented
bayesian networks: a framework for top-down spec-
ification of large bayesian networks with repetitive
structures. Technical report, Aalborg University.
Bellandi, A. and Turini, F. (2009). Extending ontology
queries with bayesian network reasoning. In Proceed-
ings of the 13th International Conference on Intelli-
gent Engineering Systems, pages 165–170, Barbados.
Ben Ishak, M., Leray, P., and Ben Amor, N. (2011).
Ontology-based generation of object oriented
bayesian networks. In Proceedings of the 8th
Bayesian Modelling Applications Workshop, pages
12–20, Barcelona, Spain.
Ben Messaoud, M., Leray, P., and Ben Amor, N. (2011).
Semcado: a serendipitous strategy for learning causal
bayesian networks using ontologies. In Proceedings
of the 11th European Conference on Symbolic and
Quantitative Approaches to Reasoning with Uncer-
tainty, pages 182–193, Belfast, Northern Ireland.
Costa, P. and Laskey, K. (2006). PR-OWL: A framework
for probabilistic ontologies. Frontiers In Artificial In-
telligence and Applications, 150:237–249.
Ding, Z. and Peng, Y. (2004). A probabilistic extension to
ontology language OWL. In Proceedings of the 37th
Hawaii International Conference On System Sciences.
Fenz, S., Tjoa, M., and Hudec, M. (2009). Ontology-based
generation of bayesian networks. In Proceedings of
the 3rd International Conference on Complex, Intelli-
gent and Software Intensive Systems, pages 712–717,
Fukuoka, Japan.
Getoor, L., Friedman, N., Koller, D., Pfeffer, A., and Taskar,
B. (2007). Probabilistic relational models. In Getoor,
L. and Taskar, B., editors, An Introduction to Statisti-
cal Relational Learning, pages 129–174. MIT Press.
Gruber, T. (1993). A translation approach to portable ontol-
ogy specifications. Knowledge Acquisition, 5(2):199–
220.
Helsper, E. M. and van der Gaag, L. C. (2002). Building
bayesian networks through ontologies. In Proceedings
of the 15th European Conference on Artificial Intelli-
gence, pages 680–684, Amsterdam.
Khattak, A. M., Latif, K., Lee, S., and Lee, Y.-K. (2009).
Ontology evolution: A survey and future challenges.
In The 2nd International Conference on u-and e- Ser-
vice, Science and Technology, pages 285–300, Jeju.
Koller, D. and Pfeffer, A. (1997). Object-oriented bayesian
networks. In Proceedings of the 13th Conference on
Uncertainty in Artificial Intelligence, pages 302–313,
Providence, Rhode Island, USA. Morgan Kaufmann.
Langseth, H. and Nielsen, T. D. (2003). Fusion of domain
knowledge with data for structural learning in object
oriented domains. Journal of Machine Learning Re-
search, 4:339–368.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems. Morgan Kaufmann, San Francisco.
Stojanovic, L., Maedche, A., Stojanovic, N., and Motik, B.
(2002). User-driven ontology evolution management.
In Proceedings of the 13th International Conference
on Knowledge Engineering and Knowledge Manage-
ment, pages 285–300, Siguenza, Spain.
Yang, Y. and Calmet, J. (2005). Ontobayes: An ontology-
driven uncertainty model. In The International Con-
ference on Intelligent Agents, Web Technologies and
Internet Commerce, pages 457–463, Vienna, Austria.
Zheng, H., Kang, B.-Y., and Kim, H.-G. (2008). An
ontology-based bayesian network approach for rep-
resenting uncertainty in clinical practice guidelines.
In ISWC International Workshops, URSW 2005-2007,
Revised Selected and Invited Papers, pages 161–173.