Machine Learning within a Graph Database: A Case Study on Link Prediction for Scholarly Data

Sepideh Sobhgol, Gabriel Durand, Lutz Rauchhaupt, Gunter Saake

2021

Abstract

When data management and ML tools are combined, a common problem is that ML frameworks may require moving the data out of its traditional storage (i.e., databases) for model building. In such scenarios, it could be more effective to adopt in-database statistical functionality (Cohen et al., 2009). Such functionality has received attention for relational databases, but for graph database systems there are insufficient studies to guide users, either by clarifying the roles of the database or by identifying the pain points that require attention. In this paper we make an early feasibility study of such processing for the graph domain, prototyping an in-database, ML-driven case study on link prediction on a state-of-the-art graph database (Neo4j). We identify a general series of steps and a common-sense approach for database support. We find limited differences between the processing setups in most steps, suggesting a need for further evaluation. We identify bulk feature calculation as the most time-consuming task at both the model building and inference stages, and hence define it as a focus area for improving how graph databases support ML workloads.
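To make "bulk feature calculation" for link prediction concrete, the following is a minimal sketch that computes a few commonly used topological features for every unlinked node pair in a tiny graph. It is an illustration only: the abstract does not state which features the authors use, so the choice of common neighbors, Jaccard coefficient, and preferential attachment is an assumption. The function names and the toy co-authorship edges are hypothetical, and the sketch runs in plain Python outside of any database rather than inside Neo4j.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Topological features for one candidate link (assumed feature set)."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "preferential_attachment": len(nu) * len(nv),
    }


def bulk_features(adj, pairs):
    """Compute features for many candidate pairs in one pass."""
    return {(u, v): pair_features(adj, u, v) for u, v in pairs}


# Hypothetical toy co-authorship graph (not data from the paper).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adj = build_adjacency(edges)

# Candidate links: all node pairs that are not yet connected.
candidates = [
    (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
]
features = bulk_features(adj, candidates)

for pair, feats in features.items():
    print(pair, feats)
```

Even in this toy form, the number of candidate pairs grows quadratically with the number of nodes, which hints at why computing such features in bulk can dominate both model building and inference.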


Paper Citation


in Harvard Style

Sobhgol S., Durand G., Rauchhaupt L. and Saake G. (2021). Machine Learning within a Graph Database: A Case Study on Link Prediction for Scholarly Data. In Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-509-8, pages 159-166. DOI: 10.5220/0010381901590166


in Bibtex Style

@conference{iceis21,
author={Sepideh Sobhgol and Gabriel Durand and Lutz Rauchhaupt and Gunter Saake},
title={Machine Learning within a Graph Database: A Case Study on Link Prediction for Scholarly Data},
booktitle={Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2021},
pages={159-166},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010381901590166},
isbn={978-989-758-509-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Machine Learning within a Graph Database: A Case Study on Link Prediction for Scholarly Data
SN - 978-989-758-509-8
AU - Sobhgol S.
AU - Durand G.
AU - Rauchhaupt L.
AU - Saake G.
PY - 2021
SP - 159
EP - 166
DO - 10.5220/0010381901590166