RDF Data Clustering based on Resource and Predicate Embeddings

Siham Eddamiri, El Moukhtar Zemmouri, Asmaa Benghabrit

2018

Abstract

With the increasing amount of Linked Data on the Web over the past decade, there is a growing desire in the machine learning community to bring this type of data into the fold. However, while both Linked Data and Machine Learning have seen explosive growth in popularity, the literature has paid relatively little attention to combining the two. The best way to bring these two fields together is to focus on RDF data. After a thorough overview of the machine learning pipeline on RDF data, the paper presents an unsupervised feature extraction technique named Walks and two language modeling approaches, namely Word2vec and Doc2vec. To adapt the RDF graph to the clustering mechanism, we first apply the Walks technique to obtain sequences of entities and combine it with the Word2vec approach. However, applying the Doc2vec approach to a set of walks gives better results on two different datasets.
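To make the Walks idea concrete, here is a minimal sketch of extracting walk sequences from a small RDF graph. It is an illustration under stated assumptions, not the authors' implementation: the toy triples, the `ex:` names, the function names, and the uniform random choice of outgoing edges are all hypothetical.

```python
import random
from collections import defaultdict

# Toy RDF graph as (subject, predicate, object) triples.
# All resource and predicate names are made up for illustration.
TRIPLES = [
    ("ex:alice", "ex:worksAt", "ex:acme"),
    ("ex:bob", "ex:worksAt", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:paris"),
    ("ex:alice", "ex:knows", "ex:bob"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
    return adjacency


def random_walks(triples, root, depth, num_walks, seed=0):
    """Return walks of the form [root, p1, o1, p2, o2, ...].

    Each walk follows at most `depth` edges and stops early at a
    resource with no outgoing edges. Uniform edge sampling is an
    assumption of this sketch.
    """
    rng = random.Random(seed)
    adjacency = build_adjacency(triples)
    walks = []
    for _ in range(num_walks):
        walk = [root]
        node = root
        for _ in range(depth):
            edges = adjacency.get(node)
            if not edges:
                break
            predicate, node = rng.choice(edges)
            walk.extend([predicate, node])
        walks.append(walk)
    return walks


walks = random_walks(TRIPLES, "ex:alice", depth=2, num_walks=5)
for walk in walks:
    print(" -> ".join(walk))
```

Each walk alternates resources and predicates, so it can be treated as a sentence of tokens for a Word2vec-style model, or the walks of one resource can be grouped as a document for a Doc2vec-style model. The resulting vectors could then be passed to a clustering algorithm.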

Paper Citation


in Harvard Style

Eddamiri S., Zemmouri E. and Benghabrit A. (2018). RDF Data Clustering based on Resource and Predicate Embeddings. In Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR; ISBN 978-989-758-330-8, SciTePress, pages 367-373. DOI: 10.5220/0007228903670373


in Bibtex Style

@conference{kdir18,
author={Siham Eddamiri and El Moukhtar Zemmouri and Asmaa Benghabrit},
title={RDF Data Clustering based on Resource and Predicate Embeddings},
booktitle={Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR},
year={2018},
pages={367-373},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007228903670373},
isbn={978-989-758-330-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR
TI - RDF Data Clustering based on Resource and Predicate Embeddings
SN - 978-989-758-330-8
AU - Eddamiri S.
AU - Zemmouri E.
AU - Benghabrit A.
PY - 2018
SP - 367
EP - 373
DO - 10.5220/0007228903670373
PB - SciTePress