Authors:
Giuliano Armano and Emanuele Tamponi
Affiliation:
University of Cagliari, Italy
Keyword(s):
Supervised Learning, Correlation, Metrics, Performance, Encoding Techniques, Classification, Prediction.
Related Ontology Subjects/Areas/Topics:
Applications; Bioinformatics and Systems Biology; Classification; Feature Selection and Extraction; Pattern Recognition; Software Engineering; Theory and Methods
Abstract:
The performance of a classification system depends on various aspects, including encoding techniques. In fact, encoding techniques play a primary role in the process of tuning a classifier/predictor, as the choice of encoder may greatly affect its performance. At present, evaluating the impact of an encoding technique on a classification system typically requires training the system and testing it with a performance metric deemed relevant (e.g., precision, recall, or the Matthews correlation coefficient). For this reason, assessing a single encoding technique is a time-consuming activity, and it introduces additional degrees of freedom (e.g., parameters of the training algorithm) that may be unrelated to the encoding technique being assessed. In this paper, we propose a family of methods to measure the performance of encoding techniques used in classification tasks, based on the correlation between the encoded input data and the corresponding output. The proposed approach provides correlation-based metrics, devised with the primary goal of focusing on the encoding technique while leaving other, unrelated aspects aside. Notably, the proposed technique saves a great deal of computational time, as it needs only a tiny fraction of the time required by standard methods.
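Illustrative sketch (not the authors' metric): the abstract does not define the proposed correlation-based measures, so the snippet below only shows the general idea of scoring an encoding without training a classifier. It assumes binary labels and uses the mean absolute Pearson correlation between each encoded feature and the label. The names pearson and encoding_score, and the toy data, are hypothetical.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def encoding_score(encoded: Sequence[Sequence[float]], labels: Sequence[float]) -> float:
    """Assumed stand-in score: mean |Pearson r| between each encoded feature and the labels."""
    n_features = len(encoded[0])
    columns = [[row[j] for row in encoded] for j in range(n_features)]
    return sum(abs(pearson(col, labels)) for col in columns) / n_features


# Toy data: the same four samples under two hypothetical encoders.
labels = [0, 0, 1, 1]
encoding_a = [[0.0], [0.1], [0.9], [1.0]]  # feature tracks the label
encoding_b = [[1.0], [0.0], [1.0], [0.0]]  # feature ignores the label

score_a = encoding_score(encoding_a, labels)
score_b = encoding_score(encoding_b, labels)
```

Under this stand-in score, encoding_a ranks above encoding_b, and no training algorithm or hyperparameter is involved in the comparison.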