An Asynchronous Federated Learning Approach for a Security Source Code Scanner

Sabrina Kall, Slim Trabelsi

Abstract

Hard-coded tokens and secrets leaked through source code published on open-source platforms such as GitHub are a pervasive security threat and a time-consuming problem to mitigate. Scanners that identify leaks can speed up prevention and damage control, but such tools tend to have low precision. Attempts to improve them with machine learning have been hampered by a lack of training data, because the information the models need to learn from is by nature meant to be kept secret by its owners. This problem can be addressed with federated learning, a machine learning paradigm that allows models to be trained on local data without requiring its owners to share it. After local training, the personal models can be merged into a combined model that has learned from all available data and can be used by the scanner. To optimize local machine learning models to better identify leaks in code, we propose an asynchronous federated learning system that combines personalization techniques for local models with merging and benchmarking algorithms for the global model. We propose to test this new approach on leaks collected from the code-sharing platform GitHub. This use case demonstrates the impact of our proposed approach on the accuracy of the local models employed by the code scanners, balancing federation and personalization to handle datasets that are often highly diverse and unique.
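The merge-then-benchmark idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the weighted-average merge, the `mix` parameter, the acceptance rule, and the names `merge_weights` and `async_update` are all assumptions made for illustration. Each local model is represented as a flat list of weights, and an update is folded into the global model as soon as it arrives, but only if the merged candidate does not score worse on a benchmark.

```python
from typing import Callable, List, Sequence

Weights = List[float]


def merge_weights(global_w: Sequence[float], local_w: Sequence[float], mix: float) -> Weights:
    """Blend one local model into the global model by weighted averaging (assumed rule)."""
    if len(global_w) != len(local_w):
        raise ValueError("weight vectors must have the same length")
    return [(1.0 - mix) * g + mix * l for g, l in zip(global_w, local_w)]


def async_update(
    global_w: Sequence[float],
    local_w: Sequence[float],
    benchmark: Callable[[Sequence[float]], float],
    mix: float = 0.5,
) -> Weights:
    """Handle a single local update without waiting for other clients.

    The merged candidate replaces the global model only if its benchmark
    score is at least as high as the current global model's score
    (a hypothetical acceptance rule; higher score means better).
    """
    candidate = merge_weights(global_w, local_w, mix)
    if benchmark(candidate) >= benchmark(global_w):
        return candidate
    return list(global_w)
```

In this sketch a client whose update degrades the benchmark score is simply ignored for the global model, while it remains free to keep its own personalized local model.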


Paper Citation


in Harvard Style

Kall S. and Trabelsi S. (2021). An Asynchronous Federated Learning Approach for a Security Source Code Scanner. In Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-491-6, pages 572-579. DOI: 10.5220/0010300305720579


in Bibtex Style

@conference{icissp21,
author={Sabrina Kall and Slim Trabelsi},
title={An Asynchronous Federated Learning Approach for a Security Source Code Scanner},
booktitle={Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2021},
pages={572--579},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010300305720579},
isbn={978-989-758-491-6},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - An Asynchronous Federated Learning Approach for a Security Source Code Scanner
SN - 978-989-758-491-6
AU - Kall S.
AU - Trabelsi S.
PY - 2021
SP - 572
EP - 579
DO - 10.5220/0010300305720579