A Supervised Generative Topic Model to Predict Bug-fixing Time on Open Source Software Projects

Pasquale Ardimento, Nicola Boffoli

2022

Abstract

During software maintenance, an accurate prediction of bug-fixing time can help software managers allocate resources and time more effectively. In this work, each bug report is associated with a response variable (its bug-fixing time), external to its words, that we are interested in predicting. To analyze collections of bug reports, we used supervised Latent Dirichlet Allocation (sLDA), which extends LDA by connecting a response variable to each bug report and whose goal is to infer latent topics that are predictive of the response. The bug reports and the responses are modeled jointly to find the latent topics that best predict the response for future unlabeled bug reports. With a fitted model in hand, we can infer the topic structure of an unlabeled bug report and then form a prediction of its response. Two variants of the bag-of-words (BoW) model are used as baseline discriminative algorithms, and an unsupervised LDA is also considered. To evaluate the proposed approach, the defect tracking dataset of LiveCode, a well-known and large dataset, was used. Results show that sLDA improves recall of the predicted bug-fixing times compared to other BoW single-topic or multi-topic supervised algorithms.
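
For orientation, the joint model described in the abstract can be sketched with the standard sLDA generative process for a single bug report of N words and K topics. The symbols below (theta, z_n, w_n, beta, eta, sigma^2) are assumptions introduced here for illustration and do not appear in the abstract. The Gaussian response is the one used in the standard sLDA formulation; the paper may model bug-fixing time differently, for example as a categorical class.

```latex
\begin{align*}
\theta \mid \alpha &\sim \mathrm{Dirichlet}(\alpha)
  && \text{topic proportions of the bug report} \\
z_n \mid \theta &\sim \mathrm{Multinomial}(\theta), \quad n = 1, \dots, N
  && \text{topic assignment of word } n \\
w_n \mid z_n, \beta_{1:K} &\sim \mathrm{Multinomial}(\beta_{z_n})
  && \text{word drawn from the assigned topic} \\
y \mid z_{1:N}, \eta, \sigma^2 &\sim \mathcal{N}\!\left(\eta^\top \bar{z},\, \sigma^2\right),
  \quad \bar{z} = \frac{1}{N} \sum_{n=1}^{N} z_n
  && \text{response (assumed Gaussian here)} \\
\hat{y} &= \mathbb{E}\left[\, y \mid w_{1:N} \,\right] \approx \eta^\top \, \mathbb{E}_q\!\left[\bar{z}\right]
  && \text{prediction for an unlabeled report}
\end{align*}
```

The last line corresponds to the prediction step in the abstract: the topic structure of an unlabeled report is inferred first (here through an approximate posterior q), and the response is then predicted from the expected topic frequencies.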

Paper Citation


in Harvard Style

Ardimento P. and Boffoli N. (2022). A Supervised Generative Topic Model to Predict Bug-fixing Time on Open Source Software Projects. In Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE, ISBN 978-989-758-568-5, pages 233-240. DOI: 10.5220/0011113100003176


in Bibtex Style

@conference{enase22,
author={Pasquale Ardimento and Nicola Boffoli},
title={A Supervised Generative Topic Model to Predict Bug-fixing Time on Open Source Software Projects},
booktitle={Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE},
year={2022},
pages={233-240},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011113100003176},
isbn={978-989-758-568-5},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE
TI - A Supervised Generative Topic Model to Predict Bug-fixing Time on Open Source Software Projects
SN - 978-989-758-568-5
AU - Ardimento P.
AU - Boffoli N.
PY - 2022
SP - 233
EP - 240
DO - 10.5220/0011113100003176