The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees

Jonathan Bodine, Dorit S. Hochbaum

2020

Abstract

Decision trees are a widely used method for classification, both on their own and as the building blocks of many ensemble learning methods. The Max-Cut decision tree introduces two modifications to a standard baseline classification decision tree, specifically CART Gini. The first is an alternative splitting metric, Maximum Cut, which is based on maximizing the distance between all pairs of observations that belong to different classes and fall on opposite sides of the threshold value. The second is to select the decision feature from a linear combination of the input features, constructed using Principal Component Analysis (PCA) locally at each node. Our experiments show that this node-based, localized PCA combined with the new splitting metric can dramatically improve classification accuracy while also significantly decreasing computational time compared to the baseline decision tree. The gains are largest on data sets with higher dimensionality or more classes. On the CIFAR-100 data set, for example, the modifications yielded a 49% improvement in accuracy relative to CART Gini while reducing CPU time by 94% for comparable implementations. These modifications are expected to substantially advance the capabilities of decision trees on difficult classification tasks.
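The two modifications described in the abstract can be illustrated with a minimal sketch. It is a direct, brute-force reading of the abstract's wording and not the authors' implementation: the function names (`max_cut_score`, `best_threshold`, `leading_component`, `pca_node_split`) are hypothetical, only the leading principal component is considered (an assumption; the abstract does not say how many components are evaluated), and the quadratic pair enumeration makes no attempt to match the running times reported in the paper.

```python
import math

def max_cut_score(values, labels, threshold):
    """Sum of |a - b| over pairs on opposite sides of the threshold
    whose class labels differ (brute force, quadratic in node size)."""
    left = [(v, c) for v, c in zip(values, labels) if v <= threshold]
    right = [(v, c) for v, c in zip(values, labels) if v > threshold]
    return sum(abs(a - b) for a, ca in left for b, cb in right if ca != cb)

def best_threshold(values, labels):
    """Evaluate midpoints between consecutive distinct values and keep the
    one with the largest Max-Cut score. Returns (None, 0.0) if no split exists."""
    distinct = sorted(set(values))
    best_t, best_s = None, 0.0
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2.0
        s = max_cut_score(values, labels, t)
        if best_t is None or s > best_s:
            best_t, best_s = t, s
    return best_t, best_s

def leading_component(rows, iterations=100):
    """Approximate the first principal component of the node's rows by power
    iteration on their covariance matrix. Assumption: the fixed start vector
    is not orthogonal to the leading eigenvector."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all rows identical: no direction of variance
            break
        v = [x / norm for x in w]
    return mean, v

def pca_node_split(rows, labels):
    """Project the node's data onto its local leading component, then choose
    the threshold on that projection by Max-Cut score."""
    mean, v = leading_component(rows)
    projected = [sum((r[j] - mean[j]) * v[j] for j in range(len(v)))
                 for r in rows]
    threshold, score = best_threshold(projected, labels)
    return v, threshold, score
```

For the one-dimensional values `[0, 1, 10, 11]` with labels `[0, 0, 1, 1]`, `best_threshold` selects the midpoint 5.5 with a score of 40, since the four cross-class pairs straddling it have distances 10, 11, 9 and 10. Because the principal component is recomputed from only the rows that reach a node, the projection direction can differ from node to node, which is the "local" aspect the abstract refers to.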


Paper Citation


in Harvard Style

Bodine J. and Hochbaum D. (2020). The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees. In Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) - Volume 1: KDIR; ISBN 978-989-758-474-9, SciTePress, pages 59-70. DOI: 10.5220/0010107400590070


in Bibtex Style

@conference{kdir20,
author={Jonathan Bodine and Dorit S. Hochbaum},
title={The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees},
booktitle={Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) - Volume 1: KDIR},
year={2020},
pages={59-70},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010107400590070},
isbn={978-989-758-474-9},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020) - Volume 1: KDIR
TI - The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees
SN - 978-989-758-474-9
AU - Bodine J.
AU - Hochbaum D.
PY - 2020
SP - 59
EP - 70
DO - 10.5220/0010107400590070
PB - SciTePress