Authors:
Innocent Mbona and Jan Eloff
Affiliation:
Department of Computer Science, University of Pretoria, Lynwood Road, Pretoria, South Africa
Keyword(s):
Cyber Security, Real-World Data, Synthetic Data, Zero-Day Attack, Network Intrusion Detection Systems.
Abstract:
Discovering Cyber security threats is becoming increasingly complex, if not impossible! Recent advances in artificial intelligence (AI) can be leveraged for the intelligent discovery of Cyber security threats. AI and machine learning (ML) models depend on the availability of relevant data. ML-based Cyber security solutions should be trained and tested on real-world attack data so that they produce trusted results. The problem is that most organisations do not have access to usable, relevant, and reliable real-world data. This problem is exacerbated when training ML models to discover novel attacks, such as zero-day attacks. Furthermore, the availability of Cyber security data sets is negatively affected by privacy laws and regulations. The solution proposed in this paper is CySecML, a methodological approach that guides organisations in developing Cyber security ML solutions. CySecML provides guidance for obtaining or generating synthetic data, checking data quality, and identifying features that optimise ML models. Network Intrusion Detection Systems (NIDS) were employed to illustrate the convergence of Cyber security and AI concepts.