Authors:
Ahmed-Reda Rhazi (1); Oumayma Banouar (1); Fadel Toure (2) and Said Raghay (1)
Affiliations:
(1) Laboratory of Applied Mathematics and Computer Science, Faculty of Sciences and Techniques, Cadi Ayyad University, Marrakesh, Morocco
(2) Laboratory IT, Statistics, and Transdisciplinary Smart Technologies, University of Quebec, Trois-Rivières, Quebec, Canada
Keyword(s):
Software Defect Prediction, Recommender Systems, Software Metrics, Classes Similarities, Classes Recommendation for Unit Tests.
Abstract:
Defect prediction is an important step in the software development life cycle. Projects involving thousands of classes require unit tests for a significant number of those classes, and writing them is costly and time-consuming. Prior research in this area has tried to predict defect-prone classes in order to allocate testing effort to the relevant components. Algorithms such as neural networks and ensemble learning have been used to classify project classes. Recommender systems (RS) rely on similarities to give users customized recommendations in domains such as social media and e-commerce. This paper explores the use of recommender systems for software defect prediction. Using a dataset of 14 open source systems containing 5883 Java classes, we compared the performance of content-based RS approaches, applied to software defect prediction with software metrics as features, against classic classification algorithms such as SVM, KNN, and ensemble learning algorithms. For the content-based approach, similarities between software classes are computed first with the standard software metrics and then with components extracted by PCA (principal component analysis). Finally, by aggregating the top-N most similar classes, the approach predicts whether the current class is defect-prone or not. The comparison is made using accuracy, precision, and F1 score. The results show that the recommender systems approach can be a viable alternative to traditional machine learning methods for classifying and predicting defect-prone software classes.
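The content-based step described in the abstract (compute similarities between classes from their metrics, then aggregate the top-N most similar classes) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the similarity measure or the aggregation rule, so cosine similarity and a majority vote are assumptions, and the function names, the three-metric vectors, and the toy data are all hypothetical. The PCA variant is omitted; in practice the metrics would probably also be scaled before computing similarities.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length metric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def predict_defect_prone(target, labeled, top_n=3):
    """Majority vote over the top_n labeled classes most similar to target.

    `labeled` is a list of (metric_vector, is_defective) pairs.
    Aggregating by majority vote is an assumption of this sketch.
    """
    ranked = sorted(
        labeled,
        key=lambda item: cosine_similarity(target, item[0]),
        reverse=True,
    )
    votes = [is_defective for _, is_defective in ranked[:top_n]]
    # Defect-prone only if strictly more than half of the neighbours are.
    return sum(votes) * 2 > len(votes)


# Hypothetical toy data: three made-up metric values per class.
labeled = [
    ([120.0, 10.0, 2.0], False),
    ([110.0, 9.0, 3.0], False),
    ([300.0, 45.0, 30.0], True),
    ([280.0, 50.0, 25.0], True),
    ([90.0, 8.0, 1.0], False),
]

if __name__ == "__main__":
    print(predict_defect_prone([310.0, 48.0, 28.0], labeled, top_n=3))
    print(predict_defect_prone([100.0, 9.0, 2.0], labeled, top_n=3))
```

With this rule, an odd `top_n` avoids ties; an even value would resolve ties toward "not defect-prone", which is a design choice of the sketch rather than something stated in the paper.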