# COMPARISON OF GREEDY ALGORITHMS FOR DECISION TREE CONSTRUCTION

### Abdulaziz Alkhalid, Igor Chikalov, Mikhail Moshkov

#### Abstract

The paper compares heuristics used by greedy algorithms for constructing decision trees. It considers the exact learning problem with all attributes discrete, which assumes that the decision table contains no contradictions. The reference decision tables are based on 24 data sets from the UCI Machine Learning Repository (Frank and Asuncion, 2010). The complexity of decision trees is estimated relative to several cost functions: depth, average depth, and number of nodes. The costs of trees built by greedy algorithms are compared with the exact minimums calculated by an algorithm based on dynamic programming. For each cost function, the results identify a set of potentially good heuristics for minimizing it.
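To make the setting concrete, the following is a minimal sketch of a greedy construction and the three cost functions named in the abstract. It is not the authors' implementation. The entropy-based impurity heuristic, the tuple-based tree encoding, the function names, and the toy table are all assumptions chosen for illustration. The abstract does not say which heuristics the paper evaluates, and the dynamic-programming algorithm for exact minimums is not shown here.

```python
from collections import Counter, defaultdict
from math import log2

# A decision table is a list of (attribute_values, decision) pairs.
# It is assumed to be free of contradictions: equal rows carry equal decisions.

def entropy(decisions):
    n = len(decisions)
    return -sum(c / n * log2(c / n) for c in Counter(decisions).values())

def split(rows, attr):
    """Partition rows by the value of one attribute."""
    parts = defaultdict(list)
    for values, decision in rows:
        parts[values[attr]].append((values, decision))
    return parts

def greedy_tree(rows):
    """Build a tree top-down, picking one attribute per step (illustrative heuristic)."""
    decisions = [d for _, d in rows]
    if len(set(decisions)) == 1:
        return ("leaf", decisions[0])
    n_attrs = len(rows[0][0])
    # Only attributes that actually separate the current rows are candidates.
    candidates = [a for a in range(n_attrs) if len({v[a] for v, _ in rows}) > 1]

    def impurity(attr):
        # Weighted entropy of the partition induced by attr (assumed heuristic).
        parts = split(rows, attr).values()
        return sum(len(p) / len(rows) * entropy([d for _, d in p]) for p in parts)

    best = min(candidates, key=impurity)
    children = {val: greedy_tree(part) for val, part in split(rows, best).items()}
    return ("node", best, children)

def depth(tree):
    """Maximum number of attribute tests on a root-to-leaf path."""
    if tree[0] == "leaf":
        return 0
    return 1 + max(depth(child) for child in tree[2].values())

def num_nodes(tree):
    """Total number of nodes, counting both internal nodes and leaves."""
    if tree[0] == "leaf":
        return 1
    return 1 + sum(num_nodes(child) for child in tree[2].values())

def path_length(tree, values):
    steps = 0
    while tree[0] == "node":
        tree = tree[2][values[tree[1]]]
        steps += 1
    return steps

def average_depth(tree, rows):
    """Mean path length over table rows, assuming each row is equally likely."""
    return sum(path_length(tree, v) for v, _ in rows) / len(rows)

# Toy table: two binary attributes, decision = logical AND of the two.
table = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
tree = greedy_tree(table)
print(depth(tree), num_nodes(tree), average_depth(tree, table))
```

Swapping the `impurity` function for another scoring rule is enough to compare heuristics in this sketch. Each resulting tree can then be scored with `depth`, `average_depth`, and `num_nodes`.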

#### References

- Ahlswede, R. and Wegener, I. (1979). Suchprobleme. Teubner Verlag, Stuttgart.
- Alekhnovich, M., Braverman, M., Feldman, V., Klivans, A. R., and Pitassi, T. (2004). Learnability and automatizability. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, pages 621-630, Washington, DC, USA. IEEE Computer Society.
- Breiman, L. et al. (1984). Classification and Regression Trees. Chapman & Hall, New York.
- Chakaravarthy, V. T., Pandit, V., Roy, S., Awasthi, P., and Mohania, M. (2007). Decision trees for entity identification: approximation algorithms and hardness results. In Proceedings of the Twenty-Sixth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS '07, pages 53-62, New York, NY, USA. ACM.
- Frank, A. and Asuncion, A. (2010). UCI machine learning repository.
- Garey, M. R. (1972). Optimal binary identification procedures. SIAM Journal on Applied Mathematics, 23(2):173-186.
- Heeringa, B. and Adler, M. (2005). Approximating optimal binary decision trees. Technical Report 05-25, University of Massachusetts, Amherst.
- Hyafil, L. and Rivest, R. (1976). Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5:15-17.
- Martelli, A. and Montanari, U. (1978). Optimizing decision trees through heuristically guided search. Commun. ACM, 21:1025-1039.
- Moret, B. M. E., Thomason, M. G., and Gonzalez, R. C. (1980). The activity of a variable and its relation to decision trees. ACM Trans. Program. Lang. Syst., 2:580-595.
- Moshkov, M. J. (2005). Time complexity of decision trees. T. Rough Sets, 3400:244-459.
- Moshkov, M. J. (2010). Greedy algorithm with weights for decision tree construction. Fundam. Inform., 104(3):285-292.
- Moshkov, M. J. and Chikalov, I. V. (2003). Consecutive optimization of decision trees concerning various complexity measures. Fundamenta Informaticae, 61(2):87-96.
- Pattipati, K. R. and Dontamsetty, M. (1992). On a generalized test sequencing problem. IEEE Transactions on Systems, Man, and Cybernetics, 22(2):392-396.
- Quinlan, J. R. (1986). Induction of decision trees. Mach. Learn., 1:81-106.
- Quinlan, J. R. (1993). C4.5: Programs for Machine Learning (Morgan Kaufmann Series in Machine Learning). Morgan Kaufmann, 1st edition.

#### Paper Citation

#### in Harvard Style

Alkhalid A., Chikalov I. and Moshkov M. (2011). **COMPARISON OF GREEDY ALGORITHMS FOR DECISION TREE CONSTRUCTION**. In *Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)*, ISBN 978-989-8425-79-9, pages 430-435. DOI: 10.5220/0003654904380443

#### in Bibtex Style

@conference{kdir11,

author={Abdulaziz Alkhalid and Igor Chikalov and Mikhail Moshkov},

title={COMPARISON OF GREEDY ALGORITHMS FOR DECISION TREE CONSTRUCTION},

booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},

year={2011},

pages={430-435},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0003654904380443},

isbn={978-989-8425-79-9},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)

TI - COMPARISON OF GREEDY ALGORITHMS FOR DECISION TREE CONSTRUCTION

SN - 978-989-8425-79-9

AU - Alkhalid A.

AU - Chikalov I.

AU - Moshkov M.

PY - 2011

SP - 430

EP - 435

DO - 10.5220/0003654904380443