Measuring Design Complexity of Cultural Heritage Ontologies
Bilal Ben Mahria, Ilham Chaker and Azeddine Zahi
Dept. of Computer Science, Faculty of Science and Technology, BP.2202 Imouzzer Street, Fez, Morocco
Keywords: Cultural Heritage, Size of Vocabulary, Tree Impurity, Coupling, Maintenance, Ontology Evaluation.
Abstract: Ontologies are now widely used as a formalism for knowledge representation and are
considered the foundation of the Semantic Web. With their widespread usage, the question of how to
evaluate their complexity has become more pressing, especially in domains such as Cultural Heritage (CH),
where the number of ontologies is growing. In this paper, we present an analysis of advanced metrics for
measuring the design complexity of existing CH ontologies. The main goals
of this study are to (i) present advanced metrics, namely the size of vocabulary, the tree impurity, coupling,
the average number of paths per concept, and the average path length, in order to analyze the advanced complexity
features of the CH ontologies and their impact on the reuse and evolution of these ontologies; (ii) help
developers decide whether an ontology is so complex that it needs simplification or rebuilding;
(iii) make developers clearly aware of the impact of the size and scale of an ontology. To reach these goals,
a set of twenty CH ontologies was gathered from the web and their advanced complexity
metrics were measured and analyzed. The size of vocabulary, the average number of paths per concept, and the
average path length show that the studied CH ontologies are highly complex. In addition, the
analysis of the tree impurity and coupling indicates that these ontologies cannot be easily
maintained.
1 INTRODUCTION
An ontology has been defined as a formal,
explicit specification of a shared conceptualization,
where the conceptualization refers to
the abstraction of a domain of knowledge (Guarino &
Poli, 1993). This abstraction is increasingly used in
various fields such as data exchange, data integration
and, most prominently, the Semantic
Web (Maedche & Staab, 2001). This growing use
of ontologies has led to an increase in
the number of ontologies in existence, which in turn
has created the need to evaluate them.
Ontology evaluation is an important issue that
must be addressed in many situations. For instance,
during the development of an ontology,
evaluation is needed to guarantee that what is built
meets the application requirements. Generally,
ontology evaluation is defined as the process of
measuring the quality of an ontology with regard to a
set of criteria, in order to determine which ontology in a
collection would best suit a particular
purpose (Brank et al., 2005). In addition, an important
definition of ontology evaluation has been suggested
by (Gómez-Pérez, 2004) and later echoed by
(Vrandečić, 2009). In these works, the evaluation
process is divided into two major areas:
verification and validation. The former is concerned
with building an ontology correctly, by measuring
accuracy, completeness, conciseness, consistency
and similar metrics. The latter is about
building the correct ontology, by checking the quality
of the ontology design. The quality of the ontology design is
commonly referred to as the ontology complexity.
Generally speaking, measuring the complexity of
an ontology gives developers insight that helps
them better understand, reuse and integrate ontologies
and reduce maintenance requirements, and helps
users select the ontology that best meets their needs.
The complexity of an ontology increases as
the ontology grows in size, and as the ontology evolves,
the effort needed to manage this complexity and to
maintain the ontology increases as well. Therefore, as ontologies grow
in size and number, it is important to measure their
complexity quantitatively. It is well known that “you
cannot control what you cannot measure” (DeMarco,
1982). Quantitative measurement of complexity can
help ontology developers and maintainers better
Ben Mahria, B., Chaker, I. and Zahi, A.
Measuring Design Complexity of Cultural Heritage Ontologies.
DOI: 10.5220/0010016501330140
In Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) - Volume 2: KEOD, pages 133-140
ISBN: 978-989-758-474-9
Copyright © 2020 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
understand the current status of the ontology, better
evaluate its design and control its development
process. Nowadays, one of the active areas of
ontology development is the cultural heritage domain,
where a large number of ontologies are being
developed for memory organizations, which
include libraries, archives, and museums of different
kinds specializing in particular areas of CH, such as
archaeological museums, cultural history
museums, and science museums (Doerr, 2009;
Hyvönen, 2009).
In brief, Cultural Heritage (CH) refers to the
legacy of physical objects, environment, traditions,
and knowledge of a society that are inherited from the
past, maintained and developed further in the present,
and preserved (conserved) for the benefit of future
generations. The vital importance of preserving
cultural heritage has led to an
increased number of ontologies in this domain.
These ontologies can be grouped into six categories:
General Concept Ontologies, Actor Ontologies, Place
Ontologies, Time and Period Ontologies, Event
Ontologies and Domain Nomenclatures or
Terminologies (Hyvönen, 2012). In this context, the
evaluation of the existing CH ontologies becomes a
necessity.
Although a few studies have been conducted on the
assessment of this cultural content (Nafis et al., 2019;
Orme et al., 2006; Zhe et al., 2006), there are still
many issues that have not been sufficiently addressed.
In this regard, the main goals of this paper are to: (i)
present advanced metrics, namely the size of
vocabulary, the tree impurity, coupling, the average
number of paths per concept, and the average path length,
in order to discuss the advanced complexity features
of the CH ontologies and their impact on the reuse
and evolution of these ontologies; (ii) help
developers decide whether an ontology is so
complex that it needs simplification or
rebuilding; (iii) make developers clearly aware of the
impact of the size and scale of an ontology.
To the best of our knowledge, there is a shortage
of studies that focus on the analysis of the quality
of CH ontologies to support their reuse,
maintenance and evolution. This work
attempts to fill this gap by identifying and evaluating
existing CH ontologies on the web. A set of 20
ontologies of the CH domain was downloaded from the
web, and a set of quantitative quality metrics adopted
and combined from different works (Orme et al.,
2006; Ouyang et al., 2011; Tartir et al., 2010; Zhang
et al., 2010; Zhe et al., 2006) was applied to evaluate
these ontologies with respect to their complexity features. The
experimental results show that the majority of the CH
ontologies are highly complex and cannot be easily
maintained.
The remainder of this paper is organized as
follows. In Sect. 2, we present the related work, which
covers the main studies on the
assessment of cultural heritage ontologies. In
Sect. 3, we detail some challenges and limitations of
the cultural heritage domain. In Sect. 4, we outline
the common formal notation. In Sect. 5, we
describe the advanced metrics used to analyze the
complexity of the cultural heritage ontologies.
Sect. 6 is devoted to the experimental
study and its discussion. Finally, Sect. 7 concludes the
paper and suggests directions for future work.
2 RELATED WORKS
A considerable number of studies have been
conducted on measuring ontology complexity.
With regard to the CH domain, however, few
studies address the complexity
of cultural heritage ontologies (Nafis et al., 2019;
Orme et al., 2006; Zhe et al., 2006). (Nafis et al.,
2019) conducted a study to enable users to select suitable CH
ontologies when building applications that
integrate cultural heritage content. (Orme et al.,
2006) measure ontology complexity using a single
metric, coupling. Inspired by the principles
of object-oriented class diagrams (Nikiforova et al.,
2011), (Zhe et al., 2006) used three metrics, namely the
number of root classes, the number of leaf classes, and
the average depth of the inheritance tree, to measure
CH ontology complexity. However, these studies
suffer from at least one of the following limitations. First,
they confuse the validation of the ontology with its
verification (Nafis et al., 2019). Second, they rely
on primitive metrics (such as the number of classes,
properties, instances, root and leaf classes,
etc.) to study the design of the ontology (Nafis
et al., 2019; Zhe et al., 2006). As we argue in this
work, it is not meaningful to measure the design of an
ontology using only primitive metrics. Third, they treat
ontology complexity as a
one-dimensional construct based on class-level
metrics, whereas complexity cannot be
measured directly using single-level metrics (Nafis et
al., 2019; Orme et al., 2006; Zhe et al., 2006). Finally,
(Nafis et al., 2019) take into consideration the
extensional level (number of instances) of the
ontology to study complexity, whereas
complexity must be measured at the
intensional level of the ontology and the extensional
level must be ignored.
KEOD 2020 - 12th International Conference on Knowledge Engineering and Ontology Development
To the best of our knowledge, this work represents the
first study of the design complexity of
ontologies in the Cultural Heritage domain. This
evaluation is based on a set of advanced complexity
metrics.
3 CULTURAL HERITAGE (CH)
DOMAIN CHALLENGES
3.1 Cultural Heritage Domain
In a narrower sense, we may regard cultural
heritage as the things protected by memory
institutions such as museums, sites and monuments
records (“SMR”), archives and libraries. Their
international umbrella organizations are the
International Council of Museums (ICOM¹), the
International Federation of Library Associations
(IFLA²) and the International Council of Archives
(ICA³). They maintain their specific documentation
policies and standards. CH can be divided into three
subareas (Hyvönen, 2012):
Tangible Cultural Heritage consists of concrete
cultural objects, such as artifacts, works of art,
buildings, and books (Vecco, 2010).
Intangible Cultural Heritage includes phenomena
such as traditions, language, handicraft skills,
folklore, and knowledge (Vecco, 2010).
Natural Cultural Heritage consists of culturally
significant landscapes, biodiversity, and geodiversity
(Harrison, 2015).
3.2 CH Ontologies Types by Major
Domains
Major ontology types needed in CH applications can
be classified by their domain of discourse as
follows.
General Concept Ontologies. These ontologies
include general concepts, such as object types
(chair, painting, book, etc.) or materials (steel, wool,
wood, etc.). Concepts in keyword thesauri
typically fall in this category, excluding free
keywords, such as place and person names (Hyvönen,
2012).
Actor Ontologies. These ontologies encompass
individual persons, organizations, and
groups. In libraries, actor ontologies are called
authority files (Hyvönen, 2012).
¹ http://www.icom.org
² http://www.ifla.org
Place Ontologies. These ontologies contain lists of
individual places. In land surveying, place
ontologies are called gazetteers (Hyvönen, 2012).
Time and Period Ontologies. Time ontologies
define the way in which time is represented,
and may list particular periods of time for shared
reference, such as “18th century,” “Iron
Age,” “Almohad Period,” etc. (Hyvönen, 2012).
Event Ontologies. Events are the semantic key that
associates actors, objects, places, and time
together. Event ontologies are repositories that list
references to individual events, such
as “Battle of Rio Salado” or “Independence of
Morocco,” so that they can be referred to in different
metadata records for interoperability (Hyvönen,
2012).
Domain Nomenclatures or Terminologies. Various
areas use particular nomenclatures, which roughly
correspond to the free keywords of thesauri. For example,
there are name lists and taxonomies for plants and
animals, minerals, chemical compounds, diseases,
medicines, trademarks, etc. (Hyvönen, 2012).
3.3 CH Domain Challenges
CH collection data has many specific characteristic
features, such as the following (Koch et al., 2019).
Multi-format. The contents are provided in different
forms, such as text documents, images,
audio tracks, videos, collection items, and learning
objects.
Multi-topical. The contents are attached to various
topics, such as art, history, artifacts, and traditions.
Multi-lingual. The content is available in different
languages.
Multi-cultural. The content is linked and explained
in terms of different cultures, such as
religions or national traditions in the West and East.
Multi-targeted. The contents are often addressed to
both laymen and experts, young and old.
The fundamental problem in dealing with CH
data is to make the content mutually interoperable so
that it can be searched, linked, and presented in a
harmonized way across the boundaries of datasets
and data silos. The major reason for
interoperability problems in CH content publishing is
the multi-organizational nature in which CH content
is collected, maintained, and published. The content
is provided by different organizations, each with its own
established standards and best
practices, as well as by media organizations and cultural
associations. Ontologies provide a well-suited
mechanism to overcome these limitations (Hyvönen,
2012).
³ http://www.ica.org
4 FORMAL NOTATION AND
THE GRAPH-CENTRIC
REPRESENTATION OF
ONTOLOGIES
4.1 Graph-Centric Representation of
Ontologies.
In order to present the ontology complexity metrics,
we adopt a graph-centric view of OWL
ontologies (Zhe et al., 2006). More precisely, an
ontology can be seen as a directed labelled graph
$G = \langle N, P, E \rangle$, where $N$ is a set of nodes representing classes
and individuals; $P$ is a set of nodes representing
properties; and $E \subseteq N \times P \times N$ is a set of edges representing
property instances and other relationships between
nodes in the graph $G$. $N$ includes
both $N_n$ (named classes and individuals) and
$N_a$ (anonymous classes and individuals). $P$ contains
$P_u$ (user-defined properties) and $P_o$ (OWL/RDFS
properties such as rdfs:subClassOf and
owl:disjointWith).
The inheritance hierarchy of an ontology can be
described as $G' = \langle N', P', E' \rangle$, where $N'$ is the set of
nodes representing classes, $P'$ is the RDF property
rdfs:subClassOf, and $E'$ is the set of edges
representing the inheritance relationship
(rdfs:subClassOf) among classes (Zhang et al., 2010).
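As an illustration only, the graph-centric view can be sketched in a few lines of Python. The toy classes, the ex:keptIn property and the helper name inheritance_hierarchy are assumptions made for this sketch and do not come from the studied ontologies. Each edge is stored as a (subject, property, object) triple, and the inheritance hierarchy is obtained by keeping only the rdfs:subClassOf edges.

```python
# Hypothetical toy ontology: every edge is a (subject, property, object)
# triple, i.e. an element of N x P x N. All names are illustrative only.
SUBCLASS_OF = "rdfs:subClassOf"

edges = {
    ("Artifact", SUBCLASS_OF, "Thing"),
    ("Painting", SUBCLASS_OF, "Artifact"),
    ("Museum", SUBCLASS_OF, "Thing"),
    ("Painting", "ex:keptIn", "Museum"),   # user-defined property
    ("monaLisa", "rdf:type", "Painting"),  # individual, not part of the hierarchy
}


def inheritance_hierarchy(edges):
    """Return the nodes and edges that take part in rdfs:subClassOf links."""
    subclass_edges = {(s, o) for (s, p, o) in edges if p == SUBCLASS_OF}
    class_nodes = {node for edge in subclass_edges for node in edge}
    return class_nodes, subclass_edges


class_nodes, subclass_edges = inheritance_hierarchy(edges)
```

In this sketch a class appears in the hierarchy only if it takes part in at least one rdfs:subClassOf edge; a fuller implementation would also have to decide how to treat isolated classes.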
4.2 Common Formal Notation
We use the following formal notation for
the terms needed to discuss the
complexity metrics. Lowercase letters identify
the notation related to concepts and relations, while
capital letters identify the notation
related to the ontology and to some metrics (Zhe et al.,
2006).
$C = \{c_1, c_2, c_3, \ldots, c_m\}$: the set of $m$ classes
defined in the ontology.
$R_i = \{r_1, r_2, r_3, \ldots, r_n\}$: the set of relations each
class has.
$P_i = \{p_1, p_2, p_3, \ldots, p_k\}$: the set of paths each
class has. Each path has its own length;
the path length is defined as the number of relations
on the path.
$pl_i = \{pl_{i,1}, pl_{i,2}, \ldots, pl_{i,p_i}\}$: the set of
path lengths of class $c_i$, where $p_i$ is the number of paths of $c_i$.
The path length of a
class expresses the semantic distance between the
class and the most general class. The set of path
lengths of all classes in the ontology is therefore
$PL = \{pl_{1,1}, \ldots, pl_{1,p_1}, \ldots, pl_{m,1}, \ldots, pl_{m,p_m}\}$.
5 ADVANCED COMPLEXITY
METRICS OF ONTOLOGY
The advanced complexity metrics used in this work
include the size of vocabulary (SOV), the tree impurity (TIP),
the average number of paths per concept, the average
path length of the ontology, and coupling (Zhang et al.,
2010).
5.1 The Size of Vocabulary (SOV)
SOV measures the size of the vocabulary using
primitive metrics such as the number of classes. Given a
graph $G = \langle N, P, E \rangle$ of an ontology, $SOV$ is defined
as the sum of the number of named classes and named
individuals ($N_n$) and the number of user-defined properties ($P_u$):
$$SOV = |N_n| + |P_u| \qquad (1)$$
A higher $SOV$ implies that the ontology is large
and would require considerable time and effort to build
and maintain.
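As a small illustration, SOV can be computed by counting the named entities and the user-defined properties. The entity and property names below are invented for this sketch, and the function name size_of_vocabulary is hypothetical.

```python
def size_of_vocabulary(named_entities, user_properties):
    """SOV: named classes and individuals plus user-defined properties."""
    return len(set(named_entities)) + len(set(user_properties))


# Hypothetical vocabulary of a tiny ontology.
named_entities = {"Artifact", "Painting", "Museum", "monaLisa"}
user_properties = {"ex:keptIn", "ex:createdBy"}

sov = size_of_vocabulary(named_entities, user_properties)
```

For this toy vocabulary the result is 4 + 2 = 6; anonymous classes and built-in OWL/RDFS properties are deliberately left out of the count.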
5.2 Tree Impurity (TIP)
This metric measures how far an ontology
inheritance hierarchy $G' = \langle N', P', E' \rangle$ deviates from
a tree (Zhang et al., 2010). It is defined as:
$$TIP = |E'| - |N'| + 1 \qquad (2)$$
where $|E'|$ is the number of rdfs:subClassOf edges
and $|N'|$ is the number of nodes (including both named
and anonymous) in the ontology's inheritance
hierarchy. The greater the TIP, the more the
ontology's inheritance hierarchy deviates from a pure
tree structure, and the greater the complexity of the
ontology. $TIP = 0$ means that the inheritance
hierarchy is a tree.
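The following sketch illustrates the metric on two toy hierarchies. It assumes that tree impurity is computed as the number of rdfs:subClassOf edges minus the number of nodes plus one, which is zero exactly when a connected hierarchy is a tree; the class names and the function name tree_impurity are hypothetical.

```python
def tree_impurity(subclass_edges):
    """TIP = |E'| - |N'| + 1 for a non-empty, connected inheritance hierarchy."""
    nodes = {node for edge in subclass_edges for node in edge}
    return len(subclass_edges) - len(nodes) + 1


# Hypothetical (subclass, superclass) pairs forming a pure tree.
tree_edges = {
    ("Artifact", "Thing"),
    ("Document", "Thing"),
    ("Manuscript", "Artifact"),
}
# Adding a second parent for "Manuscript" introduces multiple inheritance.
dag_edges = tree_edges | {("Manuscript", "Document")}

tip_tree = tree_impurity(tree_edges)
tip_dag = tree_impurity(dag_edges)
```

The pure tree yields 3 - 4 + 1 = 0, while the extra inheritance edge raises the value to 4 - 4 + 1 = 1, so each additional parent beyond the first adds one to the impurity.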
5.3 Average Number of Paths per
Concept
The average number of paths per concept ($\rho$)
indicates the average connectivity degree of a concept
to the root concept in the ontology inheritance
hierarchy (Lourdusamy & John, 2018). It is defined as
the ratio of the total number of paths to the total number
of classes ($m$):
$$\rho = \frac{\sum_{i=1}^{m} p_i}{m} \qquad (3)$$
where $p_i$ is the number of paths of concept $c_i$.
For any ontology, $\rho$ must be greater than or equal
to 1 (each concept must have a parent, except for the
most general concept). If $\rho = 1$, the ontology is a tree
(each concept has a single parent, and thus a single
path to the most general concept). A
higher $\rho$ indicates that concepts have
multiple parents, and thus multiple paths to the most
general concept, so that changes in a class would have a
large impact on its subclasses.
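A minimal sketch of this computation is given below. It counts paths with memoization instead of enumerating them, and it assumes that the most general concept contributes exactly one (trivial) path, so that a pure tree yields a value of 1. The hierarchies and the function name average_paths_per_concept are hypothetical.

```python
from functools import lru_cache


def average_paths_per_concept(parents, root):
    """Total number of paths to `root` divided by the number of classes."""

    @lru_cache(maxsize=None)
    def count_paths(cls):
        # Assumption: the root itself counts as one trivial path.
        if cls == root:
            return 1
        return sum(count_paths(parent) for parent in parents.get(cls, ()))

    classes = set(parents) | {root}
    for direct_parents in parents.values():
        classes.update(direct_parents)
    return sum(count_paths(cls) for cls in classes) / len(classes)


# Hypothetical hierarchies: each class maps to its direct superclasses.
dag = {"Artifact": ["Thing"], "Document": ["Thing"],
       "Manuscript": ["Artifact", "Document"]}
tree = {"Artifact": ["Thing"], "Document": ["Thing"],
        "Manuscript": ["Artifact"]}

rho_dag = average_paths_per_concept(dag, "Thing")
rho_tree = average_paths_per_concept(tree, "Thing")
```

In the first hierarchy "Manuscript" has two paths, giving (1 + 1 + 1 + 2) / 4 = 1.25, whereas the tree gives exactly 1.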
5.4 The Average Path Length
The average path length ($\Lambda$) indicates the average
number of concepts in a path in the ontology (Zhe et
al., 2006). It is defined as:
$$\Lambda = \frac{\sum_{i=1}^{m} \sum_{k=1}^{p_i} pl_{i,k}}{\sum_{i=1}^{m} p_i} \qquad (4)$$
This metric is the ratio of the sum
of the path lengths ($pl_{i,k}$) of each of the $m$ concepts
in the ontology to the sum of the numbers of paths
($p_i$) of the concepts. A larger $\Lambda$
indicates that there are many inheritance
relationships in the ontology; as a consequence, the
management and manipulation of concepts in such an
ontology can be a complex task.
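The ratio can be illustrated with a short Python sketch that takes the path lengths of every class as input. The class names and values are invented, the function name average_path_length is hypothetical, and the sketch assumes that the most general class contributes a single path of length 0; leaving it out would change the result.

```python
def average_path_length(path_lengths):
    """Sum of all path lengths divided by the total number of paths."""
    total_length = sum(sum(lengths) for lengths in path_lengths.values())
    total_paths = sum(len(lengths) for lengths in path_lengths.values())
    return total_length / total_paths


# Hypothetical path lengths per class (number of relations on each path).
path_lengths = {
    "Thing": [0],          # assumption: the root has one path of length 0
    "Artifact": [1],
    "Document": [1],
    "Manuscript": [2, 2],  # two paths because of multiple inheritance
}

avg_length = average_path_length(path_lengths)
```

For this toy input the sum of path lengths is 6 and there are 5 paths, so the average path length is 1.2.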
5.5 Coupling
Coupling reflects the number of external classes from
imported ontologies that are referenced in the internal
(local) ontology. Similar to coupling metrics for software
modules, the coupling of an ontology
measures the relatedness of the local ontology to
other existing ontologies or vocabularies that are used
for building it (Ouyang et al., 2011). It is
defined as:
$$Coup(O) = \frac{RefO}{NECO} \qquad (5)$$
where $RefO$ is the number of external classes that are
referenced and $NECO$ is the number of
external classes. The stronger the coupling of an ontology,
the more difficult it is to understand and maintain, and the more
complex the systems that use it.
6 EXPERIMENTS SETUP
A set of experiments was carried out
in order to study the complexity of the selected well-known
Cultural Heritage ontologies. Detailed
information on the dataset is summarized in Table
1. As a proof of concept, the advanced complexity
metrics are computed using the Java OWL API (Horridge
& Bechhofer, 2011). All
experiments were conducted
on a personal computer running Windows 10, with an Intel
Core i7 2.70 GHz processor and 16 GB of RAM.
6.1 Datasets
The dataset is composed of 20 ontologies of the CH
domain. Each ontology in the dataset is assigned an
index $O_i$, $1 \le i \le 20$, to facilitate its reference in
the discussion. Table 1 shows the list of ontologies in
the dataset with their names and web links. The XML
files are web documents that include the RDF/OWL
files of the corresponding ontologies.
Table 1: The studied CH ontologies.
Index      Ontology    Category           Web Link
$O_1$      FRBR        Actor              https://vocab.org/frbr/core
$O_2$      Hico        Event              http://hico.sourceforge.net/
$O_3$      Bio         Actor              https://vocab.org/bio/
$O_4$      Cito        Event              https://w3id.org/spar/cito/
$O_5$      Pro         Event              https://w3id.org/spar/pro/
$O_6$      bibo        Actor              http://bibliontology.com/
$O_7$      Fabio       Actor              https://lov.linkeddata.es/dataset
$O_8$      Cidoc       Event+Actor+Place  http://www.cidoc-crm.org/
$O_9$      Cultur      Event              https://lov.linkeddata.es/dataset/
$O_{10}$   CulturalOn  Event              https://lov.linkeddata.es/dataset/
$O_{11}$   Event       Event              https://lov.linkeddata.es/dataset/
$O_{12}$   SEAS        Event              https://lov.linkeddata.es/dataset/l
$O_{13}$   SEM         Event              https://lov.linkeddata.es/dataset/l
$O_{14}$   Tp          Place              https://lov.linkeddata.es/dataset/l
$O_{15}$   DOLCE       Event              http://www.ontologydesignpatter
$O_{16}$   GVP         Event+Place        https://lov.linkeddata.es/dataset/l
$O_{17}$   Ctlog       Event              https://lov.linkeddata.es/dataset/l
$O_{18}$   Cdesc       Event+Place+Actor  https://lov.linkeddata.es/dataset/l
$O_{19}$   Drammar     Event+Actor+Place  https://lov.linkeddata.es/dataset/l
$O_{20}$   ddesc       Event+Place+Actor  https://lov.linkeddata.es/dataset/l
6.2 Primitive Metrics
In order to calculate the advanced complexity metrics
for all the ontologies in the dataset, it was
necessary to determine the basic characteristics
of these ontologies, such as the number of classes,
properties (datatype properties and object properties)
and instances. Overall, Figure 1 shows that the
majority of the ontologies selected for this study have
high values for the primitive metrics.
6.3 Experimental Result and
Discussions
The main goal of this work is to analyze the advanced
complexity features of the CH ontologies and their
impact on the ontology evolution and reuse. In this
context, each one of the advanced complexity metrics
is calculated and discussed in the following sections.
Figure 1: The primitive metrics of the studied CH
ontologies.
6.3.1 Size of the Vocabulary (SOV)
Figure 2 presents the results of the SOV
measurement for all ontologies in the dataset. The
SOV ranges from 11 to 655, showing different
amounts of vocabulary used. The majority of
ontologies have a SOV between 50 and 700, followed
by those with a SOV between 20 and 40. These results
indicate that it would be beneficial for semantic web
developers in the CH domain to consider reusing
the large ontologies (a bibliographic ontology with
SOV = 374, an Event ontology with SOV = 454, two Place
ontologies with SOV = 430 and 372, and a
General ontology with SOV = 655) rather than trying
to build new related ontologies de novo. The SOV of these
ontologies also indicates that they would require a large
amount of time and effort to rebuild and
maintain (Zhang et al., 2010).
Figure 2: The Size of Vocabulary of the studied ontologies.
6.3.2 Average Path Length ($\Lambda$) and Average
Number of Paths per Concept ($\rho$)
Figure 3: The average path length and the average number of paths per
concept.
Figure 3 presents a joint analysis of the average path
length of the ontology and the average number of paths
per concept. For each metric, the ontologies are
grouped into two ranges. The two ranges for $\Lambda$ are
$0 \le \Lambda \le 10$ and $10 < \Lambda \le 50$. Figure 3(a) shows that
the majority of ontologies have $\Lambda$ in the second range,
while the others fall in the first range. This indicates
that most of the ontologies have a high $\Lambda$. A greater
$\Lambda$ shows that classes reside deeper in the
inheritance hierarchy and reuse more information
from their ancestors, as in $O_{18}$ and others. A high $\Lambda$
also means that classes are more difficult to maintain,
as they are likely to be affected by changes in any of their
ancestors. Similarly, the two ranges for $\rho$
are $\rho = 1$ and $\rho > 1$. From the values
of $\rho$ (Figure 3(b)), one can confirm that most
of the ontologies in the dataset have multiple paths from
the root class to a given class, which indicates that
nearly all ontologies rely on a network (graph)
model rather than a hierarchical (tree) model
(Baliyan & Kumar, 2016). Conversely,
ontologies with a small $\rho$ ($\rho = 1$) indicate that changes
in a class would have less impact on its
subclasses (Zhe et al., 2006).
6.3.3 Tree Impurity (TIP)
Figure 4 presents the results of computing TIP for all the
ontologies in the dataset. A
large number of ontologies have a TIP between
100 and 2500, followed by those with a TIP below 40.
This empirical result shows that most of the
ontologies adopt multiple inheritance (TIP >> 0) and
that their inheritance hierarchy deviates heavily from a
pure tree structure (for example, $O_7$ = 1377, $O_8$ = 2295
and another ontology with TIP = 2446). This indicates that these
ontologies cannot be easily maintained, except for the few
ontologies with low TIP values (42, 21, 20 and 29).
Figure 4: The Tree Impurity of the studied ontologies.
6.3.4 Coupling
Considering the results presented in Figure 5, it is
clear that the coupling of eight of the
ontologies is
greater than 0.5, which indicates that these
ontologies are related to other existing
vocabularies and contain a high number of external
classes and references to external classes. For
instance, the coupling of CIDOC-CRM is 0.88;
this value indicates that $O_8$ has a strong coupling.
Ontologies with a strong coupling are
more difficult to understand and maintain (Orme
et al., 2006).
Figure 5: The coupling measure of the studied ontologies.
Broadly speaking, by using these metrics, we find that
the large size of vocabulary, the high average number
of paths per concept ($\rho$) and the high average path length ($\Lambda$)
indicate that the CH ontologies in the dataset are highly
complex. It is therefore advisable to consider
reusing and sharing these ontologies rather than
trying to build similar ontologies from scratch. By
means of the TIP metric, the ontology engineer can
check whether the design of the ontology follows good
classification principles. Through the coupling
metric, the ontology engineer can check the
relatedness of the local ontology to external
ontologies. The analysis of TIP and
coupling revealed that the majority of the studied CH
ontologies can be difficult to maintain.
One may assume that a large number of
classes, properties, and axioms is a sound basis for
studying ontology complexity, that is,
the larger the primitive metrics, the more
complex the ontology. However, we argue that
it is very difficult to measure ontology complexity
with primitive metrics alone. Take the Fabio ontology as an
example. It is one of the largest ontologies, with 261
classes. Another ontology, CIDOC-CRM (169
classes), contains fewer classes than the
Fabio ontology. However, the empirical results show
that CIDOC-CRM has a larger TIP than the Fabio
ontology (CIDOC-CRM TIP = 2295, Fabio TIP
= 1377). In addition, the coupling metric shows that
CIDOC-CRM has a stronger coupling (Coupling =
0.87) than the Fabio ontology (Coupling = 0.37). In
other words, primitive metrics alone are not sufficient to
measure ontology complexity and to
achieve a more complete understanding of these
ontologies.
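The comparison above can be restated as a small Python sketch that ranks the two ontologies by each metric. The figures are the ones reported in the text; the dictionary layout and the helper name top_by are assumptions made for illustration.

```python
# Values reported in the discussion for the two ontologies.
reported = {
    "Fabio": {"classes": 261, "tip": 1377, "coupling": 0.37},
    "CIDOC-CRM": {"classes": 169, "tip": 2295, "coupling": 0.87},
}


def top_by(metric):
    """Name of the ontology with the highest value for `metric`."""
    return max(reported, key=lambda name: reported[name][metric])


largest = top_by("classes")
most_impure = top_by("tip")
most_coupled = top_by("coupling")
```

The ranking by number of classes puts Fabio first, while the rankings by TIP and by coupling both put CIDOC-CRM first, which is the disagreement between primitive and advanced metrics that the paragraph describes.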
7 CONCLUSION
In this paper, we have provided an analysis of some
advanced complexity metrics of 20 cultural heritage
ontologies. These metrics encompass the size of
vocabulary (SOV), the tree impurity (TIP), coupling,
the average path length ($\Lambda$), and the average number of paths
per concept ($\rho$). This empirical evaluation shows that
the provided metrics can differentiate ontologies with
distinct degrees of complexity. The metrics could
serve as “indicators” of ontology complexity,
helping ontologists to understand the development
status, gain an overall picture of ontology complexity,
and identify potentially problematic areas. The
evaluation results show that the majority of these
ontologies have a large SOV, a high average path
length and a high average number of paths per concept.
These findings indicate that the CH ontologies in the
dataset are highly complex. In this context, it is better
to consider reusing and sharing these ontologies
in the CH domain rather than trying to build similar
ontologies from scratch. Furthermore, the analysis of
TIP and coupling reveals that the studied CH
ontologies cannot be easily maintained. In the future,
we plan to develop a system for distinguishing
ontologies based on their level of complexity. We will
then further study the correlation between
ontology validation (complexity) and ontology
verification (correctness).
REFERENCES
Baliyan, N., & Kumar, S. (2016). A behavioral metrics suite
for modular ontologies. Proceedings of the Second
International Conference on Information and
Communication Technology for Competitive
Strategies, 1–4.
Brank, J., Grobelnik, M., & Mladenic, D. (2005). A survey
of ontology evaluation techniques. Proceedings of the
conference on data mining and data warehouses
(SiKDD 2005), 166–170.
DeMarco, T. (1982). Controlling software projects:
Management, measurement & estimation (Vol. 1133).
Yourdon Press New York.
Doerr, M. (2009). Ontologies for cultural heritage. In
Handbook on ontologies (p. 463–486). Springer.
Gómez-Pérez, A. (2004). Ontology evaluation. In
Handbook on ontologies (p. 251–273). Springer.
Guarino, N., & Poli, R. (1993). Toward principles for the
design of ontologies used for knowledge sharing. In
Formal Ontology in Conceptual Analysis and
Knowledge Representation, Kluwer Academic
Publishers, in press. Substantial revision of paper
presented at the International Workshop on Formal
Ontology.
Harrison, R. (2015). Beyond “natural” and “cultural”
heritage: Toward an ontological politics of heritage in
the age of Anthropocene. Heritage & Society, 8(1),
24–42.
Horridge, M., & Bechhofer, S. (2011). The OWL API: A Java
API for OWL ontologies. Semantic Web, 2(1), 11–21.
Hyvönen, E. (2009). Semantic portals for cultural heritage.
In Handbook on ontologies (p. 757–778). Springer.
Hyvönen, E. (2012). Publishing and using cultural heritage
linked data on the semantic web (Vol. 3). Morgan &
Claypool Publishers.
Koch, I., Freitas, N., Ribeiro, C., Lopes, C. T., & da Silva,
J. R. (2019). Knowledge Graph Implementation of
Archival Descriptions Through CIDOC-CRM.
International Conference on Theory and Practice of
Digital Libraries, 99–106.
Lourdusamy, R., & John, A. (2018). A review on metrics
for ontology evaluation. 2018 2nd International
Conference on Inventive Systems and Control (ICISC),
1415–1421.
Maedche, A., & Staab, S. (2001). Ontology learning for the
semantic web. IEEE Intelligent systems, 16(2), 72–79.
Nafis, F., Yahyaouy, A., & Aghoutane, B. (2019).
Ontologies for the classification of cultural heritage
data. 2019 International Conference on Wireless
Technologies, Embedded and Intelligent Systems
(WITS), 1–7.
Nikiforova, O., Sejans, J., & Cernickins, A. (2011). Role of
UML class diagram in object-oriented software
development. Applied Computer Systems, 44(1), 65–74.
Orme, A. M., Tao, H., & Etzkorn, L. H. (2006). Coupling
metrics for ontology-based system. IEEE software,
23(2), 102–108.
Ouyang, L., Zou, B., Qu, M., & Zhang, C. (2011). A method
of ontology evaluation based on coverage, cohesion and
coupling. 2011 Eighth International Conference on
Fuzzy Systems and Knowledge Discovery (FSKD), 4,
2451–2455.
Tartir, S., Arpinar, I. B., & Sheth, A. P. (2010). Ontological
evaluation and validation. In Theory and applications
of ontology: Computer applications (p. 115–130).
Springer.
Vecco, M. (2010). A definition of cultural heritage: From
the tangible to the intangible. Journal of Cultural
Heritage, 11(3), 321–324.
Vrandečić, D. (2009). Ontology evaluation. In Handbook
on ontologies (p. 293–313). Springer.
Zhang, H., Li, Y.-F., & Tan, H. B. K. (2010). Measuring
design complexity of semantic web ontologies. Journal
of Systems and Software, 83(5), 803–814.
Zhe, Y., Zhang, D., & Chuan, Y. E. (2006). Evaluation
metrics for ontology complexity and evolution analysis.
2006 IEEE International Conference on e-Business
Engineering (ICEBE’06), 162–170.