Paper

Authors: Peerachai Banyongrakkul and Suronapee Phoomvuthisarn

Affiliation: Department of Statistics, Chulalongkorn University, Bangkok, Thailand

Keyword(s): Pull-Based Development, Pull Request, GitHub, Deep Learning, Multi-Output Learning, Classification.

Abstract: GitHub’s pull-based development model is widely used by software development teams to manage software complexity. Contributors create pull requests to merge changes into the main codebase, and integrators review these requests to maintain quality and stability. However, a high volume of pull requests can overwhelm integrators, causing feedback delays. Previous studies have built predictive models using traditional machine learning techniques on tabular data, but these may lose meaningful information. Additionally, acceptance and latency predictions alone may not be sufficient for integrators, because reopened pull requests add maintenance costs and burden already-busy developers. This paper proposes a novel multi-output deep learning-based approach that predicts the acceptance, latency, and reopening of pull requests at an early stage, handling multiple data sources, including tabular and textual data. Our approach also applies SMOTE and VAE techniques to address the highly imbalanced nature of pull request reopening. We evaluate our approach on 143,886 pull requests from 54 open-source projects across four well-known programming languages. The experimental results show that our approach significantly outperforms the randomized baseline. Moreover, compared with the existing approach, our approach improves accuracy by 8.68%, precision by 1.01%, recall by 11.49%, and F1-score by 6.77% in acceptance prediction; MMAE by 6.07% in latency prediction; and balanced accuracy by 9.43%, AUC by 9.37%, and TPR by 30.07% in reopening prediction.

CC BY-NC-ND 4.0

Paper citation in several formats:
Banyongrakkul, P. and Phoomvuthisarn, S. (2023). Multi-Output Learning for Predicting Evaluation and Reopening of GitHub Pull Requests on Open-Source Projects. In Proceedings of the 18th International Conference on Software Technologies - ICSOFT; ISBN 978-989-758-665-1; ISSN 2184-2833, SciTePress, pages 163-174. DOI: 10.5220/0012125200003538

@conference{icsoft23,
author={Peerachai Banyongrakkul and Suronapee Phoomvuthisarn},
title={Multi-Output Learning for Predicting Evaluation and Reopening of GitHub Pull Requests on Open-Source Projects},
booktitle={Proceedings of the 18th International Conference on Software Technologies - ICSOFT},
year={2023},
pages={163-174},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012125200003538},
isbn={978-989-758-665-1},
issn={2184-2833},
}

TY - CONF

JO - Proceedings of the 18th International Conference on Software Technologies - ICSOFT
TI - Multi-Output Learning for Predicting Evaluation and Reopening of GitHub Pull Requests on Open-Source Projects
SN - 978-989-758-665-1
IS - 2184-2833
AU - Banyongrakkul, P.
AU - Phoomvuthisarn, S.
PY - 2023
SP - 163
EP - 174
DO - 10.5220/0012125200003538
PB - SciTePress