Adabook and Multibook - Adaptive Boosting with Chance Correction

David M. W. Powers

2013

Abstract

There has been considerable interest in boosting and bagging, including the combination of the adaptive techniques of AdaBoost with the random selection with replacement techniques of Bagging. At the same time, the way we evaluate classifiers has been revisited, with chance-corrected measures such as Kappa, Informedness, Correlation or ROC AUC being advocated. This raises the question of whether learning algorithms can do better by optimizing an appropriate chance-corrected measure. Indeed, a weak learner can optimize Accuracy to the detriment of the more realistic chance-corrected measures, and when this happens the booster can give up too early. This phenomenon is known to occur with conventional Accuracy-based AdaBoost, and the MultiBoost algorithm was developed to overcome such problems using restart techniques based on bagging. This paper thus complements the theoretical work showing the necessity of chance-corrected measures for evaluation with empirical work showing how a chance-corrected measure can improve boosting. We show that the early surrender problem also occurs in MultiBoost in multiclass situations, so that chance-corrected AdaBook and Multibook can beat standard MultiBoost or AdaBoost, and we further identify which chance-corrected measures to use when.
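The gap between Accuracy and chance-corrected measures can be illustrated with a small binary example. The sketch below is not taken from the paper: the `Confusion` class, the function names and the counts are hypothetical, and it uses the standard binary definitions of Informedness (recall + inverse recall - 1) and Cohen's Kappa rather than any multiclass formulation the paper may employ.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion counts (hypothetical helper, not from the paper)."""
    tp: int  # positives predicted positive
    fn: int  # positives predicted negative
    fp: int  # negatives predicted positive
    tn: int  # negatives predicted negative

    @property
    def n(self) -> int:
        return self.tp + self.fn + self.fp + self.tn


def accuracy(c: Confusion) -> float:
    """Fraction of correct predictions, with no correction for chance."""
    return (c.tp + c.tn) / c.n


def informedness(c: Confusion) -> float:
    """Binary Informedness: true positive rate + true negative rate - 1."""
    tpr = c.tp / (c.tp + c.fn)
    tnr = c.tn / (c.tn + c.fp)
    return tpr + tnr - 1.0


def cohen_kappa(c: Confusion) -> float:
    """Cohen's Kappa: observed agreement corrected by expected agreement."""
    observed = (c.tp + c.tn) / c.n
    expected = (
        (c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn)
    ) / (c.n * c.n)
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Assumed data: 10 positives, 90 negatives.
    always_negative = Confusion(tp=0, fn=10, fp=0, tn=90)  # ignores the input
    informed = Confusion(tp=8, fn=2, fp=20, tn=70)         # uses the input
    for name, c in [("always negative", always_negative), ("informed", informed)]:
        print(
            f"{name}: accuracy={accuracy(c):.3f} "
            f"informedness={informedness(c):.3f} kappa={cohen_kappa(c):.3f}"
        )
```

With these assumed counts, the classifier that always predicts the majority class has the higher Accuracy (0.9 against 0.78) but zero Informedness and zero Kappa, while the second classifier scores lower on Accuracy yet is clearly better than chance. A learner selected on Accuracy alone would prefer the first.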



Paper Citation


in Harvard Style

Powers, D. M. W. (2013). Adabook and Multibook - Adaptive Boosting with Chance Correction. In Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-989-8565-70-9, pages 349-359. DOI: 10.5220/0004416303490359


in Bibtex Style

@conference{icinco13,
author={David M. W. Powers},
title={Adabook and Multibook - Adaptive Boosting with Chance Correction},
booktitle={Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2013},
pages={349-359},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004416303490359},
isbn={978-989-8565-70-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - Adabook and Multibook - Adaptive Boosting with Chance Correction
SN - 978-989-8565-70-9
AU - Powers, David M. W.
PY - 2013
SP - 349
EP - 359
DO - 10.5220/0004416303490359