BloatLibD: Detecting Bloat Libraries in Java Applications

Agrim Dewan, Poojith Rao, Balwinder Sodhi, Ritu Kapur

Abstract

Third-party libraries (TPLs) provide ready-made implementations of various software functionalities and are frequently used in software development. However, as software development progresses through various iterations, there often remains an unused set of TPLs referenced in the application’s distributable. These unused TPLs become a prominent source of software bloat and are responsible for excessive consumption of resources, such as CPU cycles, memory, and mobile devices’ battery. Thus, the identification of such bloat-TPLs is essential. We present a rapid, storage-efficient, obfuscation-resilient method to detect bloat-TPLs. The novel aspects of our approach are: i) computing a vector representation of a .class file using a model that we call Jar2Vec, which is trained using the Paragraph Vector Algorithm; ii) converting each .class file to a normalized form via semantics-preserving transformations before using it to train the Jar2Vec models; and iii) a Bloated Library Detector (BloatLibD), developed and tested with 27 different Jar2Vec models. These models were trained using different parameters and >30000 .class files taken from >100 different Java libraries available at MavenCentral.com. BloatLibD achieves an accuracy of 99% with an F1 score of 0.968 and outperforms the existing tools, viz., LibScout, LiteRadar, and LibD, with accuracy improvements of 74.5%, 30.33%, and 14.1%, respectively. Compared with LibD, BloatLibD achieves a response time improvement of 61.37% and a storage reduction of 87.93%. Our program artifacts are available at https://bit.ly/2WFALXf.


Paper Citation


in Harvard Style

Dewan A., Rao P., Sodhi B. and Kapur R. (2021). BloatLibD: Detecting Bloat Libraries in Java Applications. In Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE, ISBN 978-989-758-508-1, pages 126-137. DOI: 10.5220/0010459401260137


in Bibtex Style

@conference{enase21,
author={Agrim Dewan and Poojith Rao and Balwinder Sodhi and Ritu Kapur},
title={BloatLibD: Detecting Bloat Libraries in Java Applications},
booktitle={Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE},
year={2021},
pages={126-137},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010459401260137},
isbn={978-989-758-508-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE
TI - BloatLibD: Detecting Bloat Libraries in Java Applications
SN - 978-989-758-508-1
AU - Dewan A.
AU - Rao P.
AU - Sodhi B.
AU - Kapur R.
PY - 2021
SP - 126
EP - 137
DO - 10.5220/0010459401260137