6 CONCLUSION AND FUTURE WORKS
When developing Android mobile applications, it is essential to adopt
security-focused practices from the early stages and throughout the
development cycle, and developers benefit from automated tool support
in doing so. One way to help app developers identify source code
vulnerabilities is to apply AI methods. This study presents a dataset
called LVDAndro, which contains over 20 million distinct source code
samples, labelled based on CWE-IDs, for identifying Android source code
vulnerabilities. Machine learning models trained on the dataset
predicted vulnerabilities with 94% accuracy in both binary and
multi-class classification, with F1-Scores of 0.94 and 0.93,
respectively. The dataset is available on GitHub, and work is underway
to expand it and increase the sample sizes so that deeper learning
models can be trained. Adding more scanners may further increase model
accuracy.
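For readers who want to reproduce the kind of figures quoted above, the following is a minimal sketch of how accuracy and an F1-Score can be computed from true and predicted labels. It is not the evaluation code used in this study. The toy labels and the function names are hypothetical, and the use of a support-weighted average for the multi-class F1-Score is an assumption, since this section does not state which averaging was applied.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def f1_per_class(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[t] += 1  # t was the true label but was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighted by true-label counts (assumed averaging)."""
    support = Counter(y_true)
    per_class = f1_per_class(y_true, y_pred)
    return sum(per_class[c] * n for c, n in support.items()) / len(y_true)

# Hypothetical toy labels: a CWE-ID for vulnerable samples, "none" otherwise.
y_true = ["CWE-89", "CWE-89", "CWE-532", "none", "none", "none"]
y_pred = ["CWE-89", "none", "CWE-532", "none", "none", "CWE-89"]

print(f"accuracy    = {accuracy(y_true, y_pred):.2f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.2f}")
```

For the binary case, the same functions apply after mapping every CWE-ID to a single "vulnerable" label and reading off the F1-Score of that class.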
SECRYPT 2023 - 20th International Conference on Security and Cryptography