Authors:
Sophie Bekerman¹; Eric Chen¹; Lily Lin² and George D. Montañez¹

Affiliations:
¹ AMISTAD Lab, Dept. of Computer Science, Harvey Mudd College, Claremont, CA, U.S.A.
² Department of Math and Computer Science, Biola University, La Mirada, CA, U.S.A.
Keyword(s):
Inductive Bias, Algorithmic Bias, Vectorization, Algorithmic Search Framework.
Abstract:
We develop a method to measure and compare the inductive bias of classification algorithms by vectorizing aspects of their behavior. For each algorithm, we compute a vectorized representation of its bias, known as the inductive orientation vector, which captures the algorithm's probability distribution over all possible hypotheses for a classification task. We cluster and plot the algorithms' inductive orientation vectors to visually characterize their relationships. Because algorithm behavior is influenced by the training dataset, we construct a Benchmark Data Suite (BDS) matrix that aggregates algorithms' pairwise distances across many datasets, allowing for more robust comparisons. We identify many relationships supported by existing literature, such as those between k-Nearest Neighbor and Random Forests and among tree-based algorithms, and evaluate the strength of those known connections, showing the potential of this geometric approach to investigate black-box machine learning algorithms.
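The pipeline described in the abstract can be sketched in miniature: estimate each algorithm's inductive orientation vector as an empirical distribution over the 2^n labelings of a small holdout set (induced by resampled training sets), then compare algorithms by the distances between their vectors. The toy dataset, the two stand-in classifiers, and the estimation parameters below are illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# Toy binary dataset: two Gaussian blobs (stand-in for a real benchmark task).
X = np.vstack([rng.normal(-1, 1, (30, 2)), rng.normal(1, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

# Small holdout set; the hypothesis space is all 2^n labelings of its points.
holdout = np.vstack([rng.normal(-1, 1, (3, 2)), rng.normal(1, 1, (3, 2))])
n = len(holdout)
labelings = list(product([0, 1], repeat=n))  # enumerated hypothesis space

def one_nn(X_tr, y_tr, X_te):
    """1-nearest-neighbor prediction."""
    d = ((X_te[:, None, :] - X_tr[None, :, :]) ** 2).sum(-1)
    return y_tr[d.argmin(axis=1)]

def majority(X_tr, y_tr, X_te):
    """Predict the majority training label everywhere."""
    label = int(round(y_tr.mean()))
    return np.full(len(X_te), label)

def orientation_vector(algo, trials=200, m=20):
    """Estimate the distribution over holdout labelings via resampled training sets."""
    counts = np.zeros(len(labelings))
    for _ in range(trials):
        idx = rng.choice(len(X), size=m, replace=False)
        pred = tuple(int(v) for v in algo(X[idx], y[idx], holdout))
        counts[labelings.index(pred)] += 1
    return counts / trials

algos = {"1-NN": one_nn, "majority": majority}
vecs = {name: orientation_vector(f) for name, f in algos.items()}

# Pairwise Euclidean distances between orientation vectors; on many datasets
# these distances would populate one entry of a BDS-style matrix.
names = list(vecs)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        d = np.linalg.norm(vecs[names[i]] - vecs[names[j]])
        print(f"{names[i]} vs {names[j]}: {d:.3f}")
```

Repeating this distance computation over a suite of datasets and then clustering the resulting matrix is the robust-comparison step the abstract describes.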