Analyzing Distributions of Emails and Commits from OSS Contributors through Mining Software Repositories - An Exploratory Study

Mário Farias, Renato Novais, Paulo Ortins, Methanias Colaço, Manoel Mendonça

2015

Abstract

Context: Distributed software development is a modern practice in the software industry, especially in the Open Source Software (OSS) community. In this context, developers are normally distributed around the world, and most of them work for free with little or no coordination. Understanding developers' practices on these projects may help communities manage them successfully. Goal: We mined two repositories of the Apache Httpd project to gather information about its developers' behavior. Method: We developed an approach that cross-references data gathered from the mailing list and the source code repository through mining techniques. The approach uses software visualization to analyze the mined data. We conducted an experimental evaluation of the approach to assess behavioral patterns in the OSS development community. Results: Our results show behavior patterns of Apache developers. In addition, we deepen the analysis of the Preferred Representational System of four top developers presented by Colaço et al. (2010). Conclusion: Using data mining and software visualization to analyze data from different sources can reveal important properties of development processes.
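
The abstract does not detail how the two sources are crossed, so the following is only a minimal sketch of one plausible reading: count each contributor's emails and commits per hour of day and join the two distributions. The function names, the (author, timestamp) record shape, and the sample data are all hypothetical, and the sketch assumes author identities have already been unified across the mailing list and the repository.

```python
from collections import Counter, defaultdict
from datetime import datetime


def hourly_distribution(events):
    """Map each author to a Counter of event counts per hour of day (0-23)."""
    dist = defaultdict(Counter)
    for author, timestamp in events:
        dist[author][timestamp.hour] += 1
    return dist


def cross_sources(emails, commits):
    """Join per-author hourly counts from both sources.

    Returns {author: {hour: (n_emails, n_commits)}} for every author
    seen in either source. Assumes author names are already matched.
    """
    email_dist = hourly_distribution(emails)
    commit_dist = hourly_distribution(commits)
    crossed = {}
    for author in set(email_dist) | set(commit_dist):
        hours = set(email_dist[author]) | set(commit_dist[author])
        crossed[author] = {
            h: (email_dist[author][h], commit_dist[author][h])
            for h in sorted(hours)
        }
    return crossed


# Hypothetical sample records, not data from the study.
emails = [
    ("alice", datetime(2010, 1, 4, 9, 15)),
    ("alice", datetime(2010, 1, 4, 9, 50)),
    ("bob", datetime(2010, 1, 5, 22, 5)),
]
commits = [
    ("alice", datetime(2010, 1, 4, 10, 30)),
    ("bob", datetime(2010, 1, 5, 22, 40)),
]
crossed = cross_sources(emails, commits)
```

A table of this shape could then feed a visualization that places email and commit activity side by side for each contributor.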

References

  1. Canfora, G., Cerulo, L., Cimitile, M., and Di Penta, M. (2011). Social interactions around cross-system bug fixings: The case of freebsd and openbsd. In MSR, pages 143-152.
  2. Colaço, M., Mendonça, M., Paulo Henrique, M. F., and Corumba, D. (2012). A neurolinguistic method for identifying oss developers' context-specific preferred representational systems. Pages 112-121.
  3. Colaço, M., Mendonça, M., Farias, M., and Henrique, P. (2010). Oss developers context-specific preferred representational systems: A initial neurolinguistic text analysis of the apache mailing list. In MSR, pages 126-129.
  4. D'Ambros, M., Lanza, M., and Robbes, R. (2010). Commit 2.0. In WW2SE, pages 14-19. ACM.
  5. Eyolfson, J., Tan, L., and Lam, P. (2011). Do time of day and developer experience affect commit bugginess? In Proceedings of the 8th Working Conference on Mining Software Repositories, MSR, pages 153-162.
  6. Farias, M. A. F., Ortins, P., Novais, R., Colaço, M. J., and Mendonça, M. (2014). Recovering valuable information behaviour from oss contributors: An exploratory study. In SEKE, pages 474-478.
  7. Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P. (1996). The kdd process for extracting useful knowledge from volumes of data. Commun. ACM, 39(11):27-34.
  8. Gill, A. J. and Oberlander, J. (2003). Perception of e-mail personality at zero-acquaintance: Extraversion takes care of itself; neuroticism is a worry.
  9. Heller, B., Marschner, E., Rosenfeld, E., and Heer, J. (2011). Visualizing collaboration and influence in the open-source software community. In MSR, pages 223-226.
  10. Lanza, M. and Ducasse, S. (2003). Polymetric views - a lightweight visual approach to reverse engineering. IEEE TSE, 29(9):782-795.
  11. Lanza, M., Marinescu, R., and Ducasse, S. (2005). Object-Oriented Metrics in Practice.
  12. Licorish, S. A. and MacDonell, S. G. (2014). Combining text mining and visualization techniques to study teams' behavioral processes. In MUD, pages 16-20.
  13. Mazza, R. (2009). Introduction to Information Visualization.
  14. Müller, C., Reina, G., Burch, M., and Weiskopf, D. (2010). Subversion statistics sifter. In ICAVC, pages 447-457. Springer-Verlag.
  15. Murgia, A., Tourani, P., Adams, B., and Ortu, M. (2014). Do developers feel emotions? an exploratory analysis of emotions in software artifacts. In MSR, pages 262-271. ACM.
  16. NETCRAFT (2013). Web Server Survey. NetCraft Website. http://news.netcraft.com/archives/2013/06/06/june-2013-web-server-survey-3.html/.
  17. Novais, R., Nunes, C., Garcia, A., and Mendonça, M. (2013a). Sourceminer evolution: A tool for supporting feature evolution comprehension. In ICSM, pages 508-511.
  18. Novais, R. L., Torres, A., Mendes, T. S., Mendonça, M., and Zazworka, N. (2013b). Software evolution visualization: A systematic mapping study. IST, 55(11):1860-1883.
  19. Pattison, D. S., Bird, C. A., and Devanbu, P. T. (2008). Talk and work: A preliminary report. In MSR, pages 113-116. ACM.
  20. Rigby, P. C. and Hassan, A. E. (2007). What can oss mailing lists tell us? a preliminary psychometric text analysis of the apache developer mailing list. In MSR. IEEE Computer Society.
  21. Sjoberg, D., Yamashita, A., Anda, B., Mockus, A., and Dyba, T. (2013). Quantifying the effect of code smells on maintenance effort. TSE, 39(8):1144-1156.
  22. Witte, R., Li, Q., Zhang, Y., and Rilling, J. (2008). Text mining and software engineering: an integrated source code and document analysis approach. Soft. IET, 2(1):3-16.
  23. Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell, B., and Wesslén, A. (2012). Experimentation in Software Engineering: An Introduction. Springer.


Paper Citation


in Harvard Style

Farias M., Novais R., Ortins P., Colaço M. and Mendonça M. (2015). Analyzing Distributions of Emails and Commits from OSS Contributors through Mining Software Repositories - An Exploratory Study. In Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-758-097-0, pages 303-310. DOI: 10.5220/0005368603030310


in Bibtex Style

@conference{iceis15,
author={Mário Farias and Renato Novais and Paulo Ortins and Methanias Colaço and Manoel Mendonça},
title={Analyzing Distributions of Emails and Commits from OSS Contributors through Mining Software Repositories - An Exploratory Study},
booktitle={Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2015},
pages={303-310},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005368603030310},
isbn={978-989-758-097-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - Analyzing Distributions of Emails and Commits from OSS Contributors through Mining Software Repositories - An Exploratory Study
SN - 978-989-758-097-0
AU - Farias M.
AU - Novais R.
AU - Ortins P.
AU - Colaço M.
AU - Mendonça M.
PY - 2015
SP - 303
EP - 310
DO - 10.5220/0005368603030310