Authors: Imadeddine Mountasser (1); Brahim Ouhbi (1) and Bouchra Frikh (2)
Affiliations: (1) ENSAM and Moulay Ismaïl University, Morocco; (2) ESTF and Sidi Mohamed Ben Abdellah University, Morocco
Keyword(s):
Knowledge-based Systems, Big Data Integration, Parallel Large-Scale Ontology Partitioning, Markov Clustering, Distributed Architecture.
Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Data Engineering; Enterprise Information Systems; Information Systems Analysis and Specification; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Ontologies and the Semantic Web; Ontology Engineering; Ontology Matching and Alignment; Symbolic Systems
Abstract:
Huge amounts of data are currently generated by distributed, heterogeneous sources to create and share information across several domains. Data scientists therefore need appropriate and efficient management strategies to cope with the heterogeneity and interoperability issues of these data sources. Ontologies, as a schema-less graph model, and ontology matching, as an enabler of dynamic, real-time, large-scale data integration, are used to design and develop advanced management mechanisms. However, given the large-scale context, we adopt ontology partitioning strategies, which split ontologies into a set of disjoint partitions, as a crucial step to reduce the computational complexity and improve the performance of the ontology matching process. To this end, this paper proposes a novel approach for large-scale ontology partitioning through a parallel Markov-based clustering strategy using the Spark framework. Spark offers the ability to run in-memory computations, which provides faster and more expressive partitioning and increases the speed of the matching system. The results obtained by our strategy on real-world ontologies demonstrate significant performance, which makes it suitable for incorporation into our large-scale ontology matching system.
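To make the idea of Markov-based clustering concrete, the following is a minimal single-machine sketch of the generic Markov Cluster (MCL) iteration on a small undirected graph. It is not the parallel Spark implementation described in the abstract, whose details are not given here. The function names, the `inflation` default of 2.0, the added self-loops, and the toy graph are all assumptions chosen for illustration. An ontology would first have to be mapped to such a graph (for example, concepts as nodes and relations as edges), which is also an assumption.

```python
# Minimal, sequential sketch of generic Markov Clustering (MCL).
# Assumption: this is NOT the paper's parallel Spark method; all names
# and parameter values below are hypothetical and chosen for illustration.

def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Dense square matrix product (the 'expansion' step uses m * m)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def markov_cluster(edges, n, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph with n nodes labelled 0..n-1."""
    # Build the adjacency matrix; self-loops are a common MCL convention.
    m = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        m[u][v] = m[v][u] = 1.0
    for i in range(n):
        m[i][i] = 1.0
    m = normalize_columns(m)

    for _ in range(max_iter):
        expanded = matmul(m, m)                       # expansion: random-walk step
        inflated = normalize_columns(                 # inflation: sharpen strong flows
            [[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:                               # stop once the matrix is stable
            break

    # Each row with non-negligible entries lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Toy graph: two triangles connected by a single bridge edge (2, 3).
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(markov_cluster(example_edges, 6))
```

The dense matrix product in the expansion step dominates the cost, which is why a distributed, in-memory framework is a natural fit for large ontologies. Note that generic MCL can occasionally return overlapping clusters, so a partitioning strategy that requires strictly disjoint partitions would need an extra assignment step; how the paper handles this is not stated in the abstract.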