Applying Ensemble-based Online Learning Techniques on Crime Forecasting

Anderson José de Souza, André Pinz Borges, Heitor Murilo Gomes, Jean Paul Barddal, Fabrício Enembreck

2015

Abstract

Traditional prediction algorithms assume that the underlying concept is stationary, i.e., that no changes will occur during deployment that would render the learned model obsolete. In many real-world scenarios, however, changes in the data distribution, known as concept drifts, are expected to occur due to variations in the hidden context, e.g., new government regulations, climatic changes, or adversary adaptation. In this paper, we analyze the problem of predicting the types of victims most susceptible to crimes that occurred in a large Brazilian city. Criminals are expected to change the types of victims they target to counter police methods, and vice versa. The challenge is therefore to obtain a model capable of adapting rapidly to the victim types criminals currently prefer, so that police resources can be allocated accordingly. For this type of problem, the most appropriate learning models come from data stream mining, since learning algorithms from this domain assume that concept drifts may occur over time and are designed to adapt to them. We apply ensemble-based data stream methods, since they provide good accuracy and the ability to adapt to concept drifts. Results show that these ensemble-based algorithms (Leveraging Bagging, SFNClassifier, ADWIN Bagging, and Online Bagging) reach feasible accuracy for this task.
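To make the ensemble idea concrete, the following is a minimal sketch of Online Bagging, the simplest of the four methods named above, together with a test-then-train (prequential) evaluation loop. It is not the authors' implementation: the base learner (`CountingNaiveBayes`), the helper names, and the assumption of binary categorical features are hypothetical choices made for illustration only. Each incoming instance is presented to every ensemble member k times, with k drawn from a Poisson(1) distribution, which approximates bootstrap sampling on a stream.

```python
import math
import random
from collections import Counter, defaultdict


class CountingNaiveBayes:
    """Incremental naive Bayes over categorical features (hypothetical base learner)."""

    def __init__(self):
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (index, value) -> per-class counts

    def learn(self, x, y, weight=1):
        self.class_counts[y] += weight
        for i, v in enumerate(x):
            self.feature_counts[(i, v)][y] += weight

    def predict(self, x):
        if not self.class_counts:
            return None  # nothing learned yet
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c, n in self.class_counts.items():
            score = math.log(n / total)
            for i, v in enumerate(x):
                # Add-one smoothing; the "+ 2" assumes binary features.
                score += math.log((self.feature_counts[(i, v)][c] + 1) / (n + 2))
            if score > best_score:
                best, best_score = c, score
        return best


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class OnlineBagging:
    """Each member sees every instance k ~ Poisson(1) times; prediction is a majority vote."""

    def __init__(self, n_members=10, seed=0):
        self.rng = random.Random(seed)
        self.members = [CountingNaiveBayes() for _ in range(n_members)]

    def learn(self, x, y):
        for member in self.members:
            k = poisson(1.0, self.rng)
            if k > 0:
                member.learn(x, y, weight=k)

    def predict(self, x):
        votes = Counter(
            p for p in (m.predict(x) for m in self.members) if p is not None
        )
        return votes.most_common(1)[0][0] if votes else None


def prequential_accuracy(model, stream):
    """Test-then-train: predict each instance first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        if model.predict(x) == y:
            correct += 1
        total += 1
        model.learn(x, y)
    return correct / total if total else 0.0
```

Plain Online Bagging, as sketched here, has no explicit drift handling; ADWIN Bagging extends the same scheme with a change detector that triggers replacement of the worst-performing member.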



Paper Citation


in Harvard Style

de Souza A., Pinz Borges A., Gomes H., Barddal J. and Enembreck F. (2015). Applying Ensemble-based Online Learning Techniques on Crime Forecasting. In Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-096-3, pages 17-24. DOI: 10.5220/0005335700170024


in Bibtex Style

@conference{iceis15,
author={Anderson José de Souza and André Pinz Borges and Heitor Murilo Gomes and Jean Paul Barddal and Fabrício Enembreck},
title={Applying Ensemble-based Online Learning Techniques on Crime Forecasting},
booktitle={Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2015},
pages={17-24},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005335700170024},
isbn={978-989-758-096-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Applying Ensemble-based Online Learning Techniques on Crime Forecasting
SN - 978-989-758-096-3
AU - de Souza A.
AU - Pinz Borges A.
AU - Gomes H.
AU - Barddal J.
AU - Enembreck F.
PY - 2015
SP - 17
EP - 24
DO - 10.5220/0005335700170024