Distributed Optimization of Classifier Committee Hyperparameters

Sanzhar Aubakirov, Paulo Trigo, Darhan Ahmed-Zaki

2018

Abstract

In this paper, we propose an optimization workflow to predict classifier accuracy by exploring the space formed by different data features and the configurations of the classification algorithms. The overall process is described in the context of the text classification problem. We take three main features that affect text classification and therefore the accuracy of classifiers. The first feature concerns the words that make up the input text; here we use the N-gram concept with different values of N. The second feature concerns the adoption of textual pre-processing steps such as stop-word filtering and stemming. The third feature concerns the hyperparameters of the classification algorithms. We take the well-known classifiers K-Nearest Neighbors (KNN) and Naive Bayes (NB), where K (from KNN) and the a priori probabilities (from NB) are hyperparameters that influence accuracy. As a result, we explore the feature space (the correlation among textual and classifier aspects) and present an approximation model that is able to predict classifier accuracy.
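The kind of exploration described in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' distributed workflow or their approximation model: it only enumerates a small grid over the N-gram size, a stop-word filtering flag, and K for a toy KNN classifier, and records the accuracy of each configuration. The stop-word list, the Jaccard similarity measure, and all function names (`ngrams`, `knn_predict`, `explore`) are assumptions made for illustration; stemming and the NB a priori probabilities are left out.

```python
from collections import Counter
from itertools import product

# Hypothetical, deliberately tiny stop-word list (an assumption, not from the paper).
STOP_WORDS = {"the", "a", "is", "of", "and"}


def ngrams(text, n, drop_stop_words):
    """Return the set of word N-grams of a text, optionally after stop-word filtering."""
    tokens = text.lower().split()
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(train, features, k):
    """Majority vote among the k training items most similar to `features`."""
    ranked = sorted(train, key=lambda item: jaccard(item[0], features), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_docs, test_docs, n, drop_stop_words, k):
    """Fraction of test documents classified correctly under one configuration."""
    train = [(ngrams(text, n, drop_stop_words), label) for text, label in train_docs]
    hits = sum(
        knn_predict(train, ngrams(text, n, drop_stop_words), k) == label
        for text, label in test_docs
    )
    return hits / len(test_docs)


def explore(train_docs, test_docs, n_values, stop_flags, k_values):
    """Evaluate every (N, stop-word flag, K) combination and return its accuracy."""
    return {
        (n, flag, k): accuracy(train_docs, test_docs, n, flag, k)
        for n, flag, k in product(n_values, stop_flags, k_values)
    }
```

Each grid point is evaluated independently of the others, so the loop in `explore` is the part that a distributed setup could spread across workers. The resulting mapping from configuration to accuracy is the sort of data an approximation model would be fitted to.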


Paper Citation


in Harvard Style

Aubakirov S., Trigo P. and Ahmed-Zaki D. (2018). Distributed Optimization of Classifier Committee Hyperparameters. In Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-318-6, pages 171-179. DOI: 10.5220/0006884101710179


in Bibtex Style

@conference{data18,
author={Sanzhar Aubakirov and Paulo Trigo and Darhan Ahmed-Zaki},
title={Distributed Optimization of Classifier Committee Hyperparameters},
booktitle={Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2018},
pages={171-179},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006884101710179},
isbn={978-989-758-318-6},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Distributed Optimization of Classifier Committee Hyperparameters
SN - 978-989-758-318-6
AU - Aubakirov S.
AU - Trigo P.
AU - Ahmed-Zaki D.
PY - 2018
SP - 171
EP - 179
DO - 10.5220/0006884101710179