An Analysis of the Impact of Diversity on Stacking Supervised Classifiers

Mariele Lanes, Paula F. Schiavo, Sidnei F. Pereira Jr., Eduardo N. Borges, Renata Galante

2017

Abstract

Due to the growth of research in the pattern recognition area, the limits of the techniques used for the classification task are increasingly tested. Thus, it is clear that specialized and properly configured classifiers are quite effective. However, it is not a trivial task to choose the most appropriate classifier to deal with a particular problem and to configure it properly. In addition, there is no single algorithm that optimally solves all prediction problems. Thus, in order to improve the result of the classification process, some techniques combine the knowledge acquired by individual learning algorithms, aiming to discover new patterns not yet identified. Among these techniques is the stacking strategy. This strategy consists of combining the outputs of base classifiers, induced by several learning algorithms on the same dataset, by means of another classifier called the meta-classifier. This paper aims to verify the relation between classifier diversity and the quality of stacking. We have performed extensive experiments whose results show the impact of multiple diversity measures on the gain of stacking.
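The two ideas at the heart of the abstract, a stacking ensemble whose meta-classifier combines the outputs of base classifiers induced by different learning algorithms on the same dataset, and a pairwise diversity measure computed over those base classifiers, can be illustrated concretely. The following is a minimal Python sketch assuming scikit-learn's StackingClassifier; the particular dataset, base learners, and meta-classifier are illustrative assumptions, not the configuration evaluated in the paper.

# A minimal sketch of stacking plus a pairwise diversity measure.
# The dataset, base learners, and meta-classifier are illustrative
# choices, not the exact setup used in the paper's experiments.
from itertools import combinations

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# Base classifiers induced by different algorithms on the same dataset.
base = [
    ("tree", DecisionTreeClassifier(random_state=42)),
    ("nb", GaussianNB()),
    ("svm", SVC(probability=True, random_state=42)),
    ("rf", RandomForestClassifier(random_state=42)),
]

# The meta-classifier is trained on the base classifiers' outputs.
stack = StackingClassifier(estimators=base,
                           final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_tr, y_tr)
print("stacking accuracy:", stack.score(X_te, y_te))

# Pairwise disagreement, a classical diversity measure: the fraction of
# test instances on which exactly one of the two classifiers is correct,
# i.e. their correctness "oracles" differ.
oracles = {name: clf.fit(X_tr, y_tr).predict(X_te) == y_te
           for name, clf in base}
for (a, oa), (b, ob) in combinations(oracles.items(), 2):
    print(f"disagreement({a}, {b}) = {np.mean(oa != ob):.3f}")

Higher disagreement indicates a more diverse pair of base classifiers; the question the paper investigates is how such diversity measures relate to the accuracy gain of the stacked model over its individual base classifiers.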



Paper Citation


in Harvard Style

Lanes M., Schiavo P. F., Pereira Jr. S. F., Borges E. N. and Galante R. (2017). An Analysis of the Impact of Diversity on Stacking Supervised Classifiers. In Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-247-9, pages 233-240. DOI: 10.5220/0006291202330240


in Bibtex Style

@conference{iceis17,
author={Mariele Lanes and Paula F. Schiavo and Sidnei F. Pereira Jr. and Eduardo N. Borges and Renata Galante},
title={An Analysis of the Impact of Diversity on Stacking Supervised Classifiers},
booktitle={Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2017},
pages={233-240},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006291202330240},
isbn={978-989-758-247-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - An Analysis of the Impact of Diversity on Stacking Supervised Classifiers
SN - 978-989-758-247-9
AU - Lanes M.
AU - Schiavo P. F.
AU - Pereira Jr. S. F.
AU - Borges E. N.
AU - Galante R.
PY - 2017
SP - 233
EP - 240
DO - 10.5220/0006291202330240