An Automated and Distributed Machine Learning Framework for Telecommunications Risk Management

Luís Ferreira, André Pilastri, Carlos Martins, Pedro Santos, Paulo Cortez

Abstract

Automation and scalability are currently two of the main challenges of Machine Learning (ML). This paper proposes an automated and distributed ML framework that automatically trains a supervised learning model and produces predictions independently of the dataset and with minimal human input. The framework was designed for the domain of telecommunications risk management, which often requires supervised learning models that can be quickly updated by non-ML-experts and trained on vast amounts of data. Thus, the architecture assumes a distributed environment, in order to deal with big data, and uses Automated Machine Learning (AutoML) to select and tune the ML models. The framework includes several modules: task detection (to detect whether the task is classification or regression), data preprocessing, feature selection, model training, and deployment. In this paper, we detail the model training module. To select the computational technologies to be used in this module, we first analyzed the capabilities of an initial set of five modern AutoML tools: Auto-Keras, Auto-Sklearn, Auto-Weka, H2O AutoML, and TransmogrifAI. Then, we benchmarked the only two of these tools that address distributed ML (H2O AutoML and TransmogrifAI). Several comparison experiments were conducted using three real-world datasets from the telecommunications domain (churn, event forecasting, and fraud detection), allowing us to measure the computational effort and predictive capability of the AutoML tools.

Paper Citation


in Harvard Style

Ferreira L., Pilastri A., Martins C., Santos P. and Cortez P. (2020). An Automated and Distributed Machine Learning Framework for Telecommunications Risk Management. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-395-7, pages 99-107. DOI: 10.5220/0008952800990107


in Bibtex Style

@conference{icaart20,
author={Luís Ferreira and André Pilastri and Carlos Martins and Pedro Santos and Paulo Cortez},
title={An Automated and Distributed Machine Learning Framework for Telecommunications Risk Management},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2020},
pages={99-107},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008952800990107},
isbn={978-989-758-395-7},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - An Automated and Distributed Machine Learning Framework for Telecommunications Risk Management
SN - 978-989-758-395-7
AU - Ferreira L.
AU - Pilastri A.
AU - Martins C.
AU - Santos P.
AU - Cortez P.
PY - 2020
SP - 99
EP - 107
DO - 10.5220/0008952800990107