Parallel Micro-Batching and Scalable Inferencing for ML-Based Malicious Traffic Detection

Achmad Basuki, Widhi Yahya, Dzaki R. Malik, Rizal Setya Perdana, Kasyful Amron, Achmad Husni Thamrin, Andrey Ferriyan and Muhammad Niswar

Universitas Brawijaya, Indonesia

Keio University, Japan

Universitas Hasanuddin, Indonesia


Keywords:

NIDS, Machine Learning, High-speed Network, Passive Optical Tapping, Scalability.

Abstract:

Network intrusion detection systems based on machine learning (ML-IDS) face signiﬁcant challenges in high-

speed network environments, such as gigabit-scale packet capture and real-time inference under dynamic

trafﬁc conditions. Efﬁciently handling these challenges is critical to maintaining accurate and timely detection

without overwhelming the system resources. This paper presents a scalable ML-IDS architecture featuring a

novel parallel micro-batching inference framework integrated with passive optical tapping for non-intrusive

traffic monitoring. The proposed inference architecture is critical to achieving a balance between high classification accuracy and computational efficiency. Experimental results demonstrate a 265% throughput improvement over traditional sequential processing while maintaining decision times of about 5 ms, even under variable

trafﬁc loads. Furthermore, the architecture supports horizontal scaling to accommodate growing network

demands, ensuring sustained low-latency detection performance. These contributions establish a robust foun-

dation for deploying ML-IDS in high-speed network environments.

1 INTRODUCTION

The rapid advancement of computer networking tech-

nologies, including digital twins, has enabled the sim-

ulation and optimization of complex systems across

industrial, residential, banking, and educational sec-

tors—leading to a surge in network users and grow-

ing attack surfaces. As a result, cyber threats have be-

come increasingly sophisticated, posing serious risks

to individuals, organizations, and society by compro-

mising the core tenets of information security: Con-

ﬁdentiality, Integrity, and Availability (CIA), with

consequences ranging from ﬁnancial and reputational

damage to threats to critical infrastructure and human

safety (Thakkar and Lohiya, 2023). In response, insti-

tutions such as universities and industries are enhanc-

ing their network security to provide safer Internet

access by managing both controlled factors (e.g., hu-

man behavior, access regulation through formal pro-

cedures, and policy enforcement) and uncontrolled

ones, such as anomalous trafﬁc and malware propa-

gation. To address these challenges, organizations in-

creasingly rely on security technologies like ﬁrewalls,

Intrusion Detection Systems (IDS), and Intrusion Pre-

vention Systems (IPS), which play a critical role in

identifying and mitigating threats to strengthen over-

all network resilience (Admass et al., 2024).

Network Intrusion Detection Systems (NIDS) are

essential for detecting and responding to anomalous

trafﬁc using two primary approaches: signature-based

and behavioral (anomaly-based) detection. While

signature-based methods rely on predeﬁned rules to

identify known threats, they fail to detect novel at-

tacks until updates are applied, creating security

gaps. In contrast, behavioral detection leverages ar-

tiﬁcial intelligence (AI) and machine learning (ML)

to model normal network behavior and identify de-

viations, enabling real-time detection of previously

unseen threats. This adaptability makes ML-based

NIDS critical in combating evolving cyber threats.

Recent studies show that integrating advanced ML

techniques signiﬁcantly improves detection accuracy

and reduces false positives, reinforcing their role in

modern cybersecurity (Abdulganiyu et al., 2023).

ML-based anomaly detection has emerged as a

promising approach for enhancing network security,

Basuki, A., Yahya, W., Malik, D. R., Perdana, R. S., Amron, K., Thamrin, A. H., Ferriyan, A. and Niswar, M.

Parallel Micro-Batching and Scalable Inferencing for ML-Based Malicious Trafﬁc Detection.

DOI: 10.5220/0014275200004928

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 1st International Conference on Research and Innovations in Information and Engineering Technology (RITECH 2025), pages 157-163

ISBN: 978-989-758-784-9


offering the ability to learn complex patterns, adapt

to evolving threats, and outperform traditional rule-

based systems that rely on static signatures. While

numerous studies (Dina and Manivannan, 2021; Ab-

delmoumin et al., 2022; Sahani et al., 2023; Yang

and Shami, 2022; Gite et al., 2023; Yu et al., 2024;

Gomez et al., 2023; Zeng et al., 2024; Theofanous

et al., 2024; Kondaiah et al., 2024; Siyyal et al., 2022;

Zang et al., 2024; Ucci et al., 2021; Berbecaru and

Petraglia, 2023) have explored ML-based Intrusion

Detection Systems (IDS), most evaluate performance

in simulation environments, limiting real-world ap-

plicability. Even those that use real testbeds (Siyyal

et al., 2022; Zang et al., 2024; Ucci et al., 2021;

Berbecaru and Petraglia, 2023) often overlook critical

challenges in high-throughput networks—particularly

computational overload and excessive inference la-

tency—leading to delayed detection and reduced ef-

fectiveness under realistic trafﬁc loads.

To address these limitations, we propose a novel

ML-based NIDS architecture designed for low-

latency, high-throughput operation. The system is

divided into two optimized components: (1) traf-

ﬁc capture and preprocessing, and (2) model infer-

ence. For the ﬁrst, we employ passive optical tap-

ping to capture trafﬁc without disrupting production

networks, combined with the Argus tool for real-time

feature extraction. This ensures timely, lightweight

data preparation with minimal overhead. For infer-

ence, we introduce a parallel micro-batching archi-

tecture that improves detection throughput and scal-

ability, enabling rapid, accurate threat identiﬁcation

in high-speed environments. By co-optimizing both

stages for minimal decision time, our design enhances

system responsiveness and practical deployability in

operational networks.

In summary, the contributions of this paper are as follows. First, the system integrates passive optical tapping with Argus for real-time feature extraction, so that the detection model receives timely and relevant input without disrupting production traffic. Second, we introduce a parallel micro-batching architecture for inference that processes streaming traffic rapidly under demanding conditions. Together, these contributions enhance the practicality, responsiveness, and accuracy of intrusion detection in operational network environments.

2 RELATED WORK

In the realm of simulated environments for Ma-

chine Learning-based Intrusion Detection Systems

(IDS), several studies have provided valuable

insights. Traditional approaches employ hybrid

signature-anomaly detection (Dina and Manivannan,

2021), ensemble methods combining PCA with

deep learning (Abdelmoumin et al., 2022), and

diverse ML techniques including SVMs, decision

trees, and neural networks (Sahani et al., 2023;

Yang and Shami, 2022). Recent advances explore

specialized architectures such as graph-based sys-

tems (Yu et al., 2024), unsupervised clustering with

HDBSCAN (Gomez et al., 2023), and ensemble

models integrating deep learning with self-attention

mechanisms (Kondaiah et al., 2024). However,

these simulation-based studies focus primarily on

maximizing detection accuracy while neglecting

critical real-world constraints: inference latency,

trafﬁc capture efﬁciency, and end-to-end detection

time optimization.

Real testbed implementations reveal additional

limitations. Studies using traditional packet capture

methods (Wireshark, tcpdump) (Siyyal et al., 2022;

Kondaiah et al., 2024) face scalability challenges at

high network speeds, while computationally intensive

approaches like NLP-inspired deep learning (Zang

et al., 2024) introduce prohibitive processing delays.

Existing monitoring solutions integrate multiple tools

(Suricata, Zeek) (Berbecaru and Petraglia, 2023) but

suffer from deployment complexity and lack opti-

mization for high-throughput scenarios. Notably, no

prior work addresses the joint optimization of passive

trafﬁc capture, ML inference efﬁciency, and detection

time minimization specifically for 10 Gbps networks. Our approach combines passive optical tapping, which adds no latency to the production link, with a two-tier architecture that optimizes both network-level filtering and ML-based detection, explicitly targeting sub-second response times in high-speed operational environments.

3 SCALABLE ML-BASED IDS

FOR HIGH-SPEED

NETWORKS

In this section, we explain the overall design of our proposed solution and describe it in detail, with the primary goal of achieving real-time detection with high throughput.

RITECH 2025 - The International Conference on Research and Innovations in Information and Engineering Technology


3.1 Design Issues

3.1.1 Gigabit Scale Capturing System

In high-speed network environments, such as enter-

prise or campus network backbones, the sheer volume

of trafﬁc necessitates a robust capturing system. The

system must collect all trafﬁc with zero packet loss to

ensure comprehensive monitoring. Subsequently, live

preprocessing of this high-bandwidth data stream is

required to reduce its size and complexity. Finally,

the preprocessing must efﬁciently extract only the

features relevant to the subsequent inference model,

presenting a signiﬁcant challenge given the real-time

constraints. Meeting these requirements simultane-

ously is crucial for effective intrusion detection.

3.1.2 Near-Real-Time Inferencing System

In high-speed networks, the data stream captured is

typically processed through sequential inferencing,

where each network ﬂow is analyzed in isolation.

While this serial processing approach is conceptually

straightforward, it introduces substantial delays. The

cumulative effect of processing each ﬂow individu-

ally results in signiﬁcantly increased response times,

which hinders the timely detection and mitigation of

security threats. Such delays can have severe conse-

quences, as they provide attackers with the opportu-

nity to exploit vulnerabilities and inﬂict considerable

damage before the intrusion detection system can re-

spond effectively.

3.2 Proposed IDS System

3.2.1 Passive Capture Techniques with Live

Preprocessing for High-Speed Networks

To address gigabit-scale data capture, we employ an LC-to-LC optical tap (OP-TAP) as shown in Fig. 1a, enabling seamless integration into high-speed enterprise or campus networks. The device uses dual live connections (LIVE1 and LIVE2) to passively capture both transmitted (TX) and received (RX) traffic without introducing computational overhead or disrupting production networks, ensuring lossless, real-time monitoring. Captured traffic is forwarded to a dedicated server (the Netmon server) for live preprocessing, enhancing the efficiency of downstream analysis and intrusion detection as shown in Fig. 1b. This setup forms the second tier of our two-tier architecture, where the first tier (firewall) performs initial threat filtering, reducing load on the detection system. The second tier, comprising the OP-TAP and the Netmon server, handles traffic capture, preprocessing, and real-time analysis as a unified, low-latency pipeline. This separation of concerns ensures scalability and responsiveness in high-throughput environments, meeting the demanding requirements of modern network security.

Figure 1: Network capture system implementation. (a) Two-tier intrusion detection system architecture leveraging firewall and OP-TAP. (b) Capturing with passive OP-TAP. Diagram labels: Netmon Server, LAN, Router, Firewall, Production Network Link, Internet, NAT, Captured TX Network Traffic, Captured RX Network Traffic.

3.2.2 Parallel Micro-Batching and Scalable

Inferencing Architecture

Fig. 2 illustrates the system architecture, which enhances throughput by processing flows in micro-batches rather than one flow at a time, reducing the per-flow processing cost. The workflow comprises two main components: a micro-batching module and an inferencing module, with each inferencing instance connected to a machine learning model for classifying traffic as benign or malicious. The micro-batching module receives incoming data via multiple threads, aggregates it into fixed-size batches, and publishes them to a message queue under a specific topic. The inferencing module subscribes to this topic, polls the queue, and forwards each batch to the ML model for prediction. This decoupled, parallel design enables efficient load balancing, low-latency processing, and scalable real-time inference.
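The batching and polling steps described above can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not the authors' implementation: an in-process `queue.Queue` stands in for the message queue, `classify_batch` is a placeholder for the ML model, and the names `BATCH_SIZE`, `NUM_WORKERS`, and the `rate` threshold are hypothetical.

```python
import queue
import threading

BATCH_SIZE = 10   # flows per micro-batch (hypothetical value)
NUM_WORKERS = 4   # parallel inferencing workers (hypothetical value)
STOP = None       # sentinel that tells a worker to exit


def make_batches(flows, batch_size):
    """Group an iterable of flow records into fixed-size lists."""
    batch = []
    for flow in flows:
        batch.append(flow)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def classify_batch(batch):
    """Placeholder model: flags flows above a made-up rate threshold."""
    return ["malicious" if flow["rate"] > 1000 else "benign" for flow in batch]


def worker(batch_queue, results, lock):
    """Poll the queue and classify each batch until the sentinel arrives."""
    while True:
        batch = batch_queue.get()
        if batch is STOP:
            break
        labels = classify_batch(batch)
        with lock:
            results.extend(zip((flow["id"] for flow in batch), labels))


def run_pipeline(flows, batch_size=BATCH_SIZE, num_workers=NUM_WORKERS):
    """Publish micro-batches to a shared queue consumed by parallel workers."""
    batch_queue = queue.Queue()
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(batch_queue, results, lock))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for batch in make_batches(flows, batch_size):
        batch_queue.put(batch)
    for _ in threads:
        batch_queue.put(STOP)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return dict(results)
```

Because the producer and the workers share only the queue, the worker count can be changed without touching the batching logic, which is the decoupling the architecture relies on. Threads are used here only to show the structure; a real deployment would more likely run the workers as separate processes or servers.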


Figure 2: Architecture of the proposed method.

4 IMPLEMENTATION

4.1 Implementation of Gigabit Scale

Capturing System

To implement the gigabit-scale capture system, we

deployed the setup shown in Fig. 1b. An ICX-7850 switch connects via fiber to a WDM Optical Tap 80/20, which passively splits the optical signal: 80% of the light continues to the firewall and 20% is diverted to the traffic capture server, so both receive the complete traffic stream. This enables non-intrusive monitoring with no impact on production performance. The capture

server (Netmon) is an HP ProLiant DL380 Gen10

equipped with an Intel Xeon Silver 4214R (12-core),

32 GB RAM, NVIDIA RTX A2000 12 GB GPU, and

an Intel Ethernet Converged Network Adapter X710-

DA4 (quad-port 10GbE SFP+). Two ports are used

to capture TX and RX streams from the tap, while

the system runs Ubuntu 22.04 and leverages Argus

(https://openargus.org/) for ﬂow merging and prepro-

cessing. Using the -i dup:interface-name ﬂag,

Argus aggregates bidirectional trafﬁc into a uniﬁed

ﬂow and extracts features in real time. This conﬁg-

uration ensures lossless capture, high throughput, and

minimal overhead—aligning with our design goals of

passive, scalable, and low-latency preprocessing in

10 Gbps environments.
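As a rough illustration of the preprocessing output, the sketch below parses comma-separated flow records into dictionaries ready for batching. It assumes the flow records have already been exported as text with a header row. The column names (`dur`, `rate`, `dport`, `state`) mirror the features named later in the paper, but the exact layout, the sample values, and the helper `parse_flow_records` are hypothetical.

```python
import csv
import io

NUMERIC_FIELDS = ("dur", "rate")  # columns converted to float (assumed names)


def parse_flow_records(text):
    """Parse comma-separated flow records (with a header row) into dicts."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = dict(row)
        for name in NUMERIC_FIELDS:
            # Treat an empty numeric field as 0.0 rather than failing.
            record[name] = float(row[name]) if row[name] else 0.0
        records.append(record)
    return records


# Illustrative sample only; not real captured traffic.
SAMPLE = "dur,rate,dport,state\n0.25,40.0,443,CON\n1.50,,53,INT\n"
```

Calling `parse_flow_records(SAMPLE)` yields two records. The second has its missing `rate` filled with `0.0`, which is one possible policy for incomplete fields; the paper does not state how such records are handled.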

4.2 Implementation of Near-Real-Time

Inferencing System

The near-real-time inferencing system leverages

open-source technologies to achieve high through-

put and low latency. ZeroMQ (https://zeromq.org/)

enables asynchronous, non-blocking communication

between stream ingestion workers and the micro-

batching module, supporting scalable data transfer

in high-speed networks. Redis (https://redis.io/) acts

as an in-memory FIFO queue, temporarily storing

micro-batched data with minimal delay, thereby pre-

venting processing bottlenecks. Inference results are

persisted in DuckDB (https://duckdb.org/), an embed-

ded analytical database that supports efﬁcient query-

ing and lightweight data management for downstream

analysis. Together, these components form a streamlined, scalable pipeline that implements the parallel micro-batching architecture (Fig. 2), significantly reducing response times and enhancing the system's detection capabilities.
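The queue-then-persist flow can be sketched without external services. To keep the example self-contained, a `collections.deque` is swapped in for the Redis FIFO queue and an in-memory `sqlite3` database for DuckDB. These are stand-ins chosen for illustration, not the components the paper uses, and the table layout and function names are hypothetical.

```python
import json
import sqlite3
from collections import deque

fifo = deque()  # stand-in for the Redis-backed FIFO queue


def publish(batch):
    """Serialize a micro-batch and append it to the tail of the queue."""
    fifo.append(json.dumps(batch))


def consume():
    """Pop the oldest micro-batch from the head, or return None if empty."""
    return json.loads(fifo.popleft()) if fifo else None


def open_store():
    """Create an in-memory results table (stand-in for the DuckDB store)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (flow_id INTEGER, label TEXT)")
    return conn


def store_results(conn, batch, labels):
    """Persist one (flow_id, label) row per classified flow."""
    conn.executemany(
        "INSERT INTO results (flow_id, label) VALUES (?, ?)",
        [(flow["id"], label) for flow, label in zip(batch, labels)],
    )
    conn.commit()
```

Batches leave the queue in the order they were published, so results are stored in arrival order when a single consumer is used. With several consumers, ordering across batches is no longer guaranteed.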

5 RESULTS AND DISCUSSION

This section evaluates the effectiveness of our scal-

able ML-based IDS with a focus on optimizing infer-

encing time without compromising detection perfor-

mance.

5.1 Survey of Machine Learning

Models and Feature Selection for

Optimized Inferencing

The evaluation leverages multiple network traf-

ﬁc datasets—NF-CSE-CIC-IDS2018-v2, NF-ToN-

IoT-v2, CIC-IDS2017, NF-UNSW-NB15-v2, and

HIKARI 2021—to enhance model generalization

across diverse trafﬁc and attack patterns. The com-

bined dataset is split 80%/20% for training and test-

ing, ensuring robust performance assessment.
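An 80%/20% split of this kind can be sketched as follows. The paper does not say whether the split is shuffled or stratified, so the shuffling, the fixed `seed`, and the function name `split_train_test` are assumptions made for illustration.

```python
import random


def split_train_test(records, test_fraction=0.2, seed=42):
    """Shuffle records reproducibly and split them into train and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

For 100 records this returns 80 training and 20 test records, with every record appearing in exactly one of the two lists.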

Table 1 summarizes the performance of various

ML models across key metrics. Gradient boosting

methods—XGBoost (XGB), LightGBM (LGBM),


and CatBoost (CatB)—achieve near-perfect or per-

fect scores, with XGB attaining 1.00 across all met-

rics, attributed to its ability to model complex feature

interactions via sequential tree learning. In contrast,

simpler models such as Naive Bayes (NVB) and Lin-

ear Discriminant Analysis (LDA) underperform, with

NVB reaching only 78% accuracy.
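For reference, the four metrics reported in Table 1 follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the counts in the usage note are invented for illustration and are not taken from the paper's experiments.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

With 80 true positives, 20 false positives, 90 true negatives, and 10 false negatives, accuracy is 0.85 and precision is 0.80, while recall is 80/90. This shows how a model can trade precision for recall, as NVB does in Table 1.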

Given these results, XGBoost (XGB) is selected

for deployment due to its perfect accuracy and favor-

able inference characteristics: (1) native handling of

categorical features, (2) effective tree pruning that re-

duces model size, and (3) strong parallelization sup-

port—well aligned with our micro-batching architec-

ture. These attributes enable high accuracy with min-

imal inference latency in high-speed networks. For

feature selection, we leverage importance weights

from training, as shown in Fig. 3, retaining only

the most impactful features (e.g., Rate, Dur, Dport,

State). This approach reduces dimensionality, stream-

lines preprocessing, and enhances model efﬁciency

for near-real-time detection.
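Selecting features by importance weight amounts to ranking the weights and keeping the top entries. The sketch below assumes the weights are already available as a name-to-score mapping (for an XGBoost model, this is the kind of mapping `Booster.get_score(importance_type="weight")` returns). The numeric weights and the extra feature names `Sport` and `Proto` are hypothetical.

```python
def select_top_features(importances, k):
    """Return the k feature names with the highest importance weight."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def project(record, features):
    """Keep only the selected features of a flow record, in ranked order."""
    return [record[name] for name in features]


# Hypothetical weights for illustration; not the values behind Fig. 3.
EXAMPLE_WEIGHTS = {
    "Dur": 240.0,
    "Sport": 15.0,
    "Rate": 310.0,
    "State": 120.0,
    "Proto": 8.0,
    "Dport": 180.0,
}
```

Here `select_top_features(EXAMPLE_WEIGHTS, 4)` keeps `Rate`, `Dur`, `Dport`, and `State`, and `project` then reduces each flow record to those four values before inference.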

Table 1: Model Survey Results.

Model   Accuracy   Precision   Recall   F1 Score
NVB     0.78       0.76        0.82     0.79
LR      0.92       0.91        0.93     0.92
LDA     0.88       0.87        0.89     0.88
SVM     0.93       0.92        0.94     0.93
XGB     1.00       1.00        1.00     1.00
LGBM    0.99       0.99        0.99     0.99
CatB    0.99       0.99        0.99     0.99
AdaB    0.98       0.97        0.98     0.97

Figure 3: Feature Importance (Weight).

5.2 Performance Evaluation of

Inferencing and Capturing System

This subsection presents experimental results on in-

ferencing latency and capturing time. It compares the

performance of the proposed parallel micro-batching

inferencing system against traditional sequential in-

ferencing, showcasing the improvements in process-

ing speed and system responsiveness enabled by our

architecture. As illustrated in Figure 4, the live in-

ferencing process provides real-time network perfor-

mance metrics alongside AI-based classiﬁcation re-

sults, demonstrating the system’s capability to ana-

lyze trafﬁc ﬂows and detect anomalies efﬁciently.

Figure 4: Live inferencing process and performance sum-

mary.

Figs. 5a and 5b analyze the trade-off between throughput (flows/s) and decision time (ms) across batch size and worker count configurations, labeled NbMw for a batch size of N and M workers. The 10b4w setup achieves the best balance, delivering 730 flows/s at a decision time of just 5 ms, a 265% throughput gain over the baseline 1b1w (200 flows/s). Larger configurations (e.g., 20b3w, 50b3w) further increase throughput (up to 1300 flows/s, +450%), but at the cost of elevated decision times (tens of ms), indicating diminishing returns due to Netmon server limitations.
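The 265% figure follows directly from the two throughput values quoted for 10b4w and 1b1w. A small sketch of the arithmetic, with hypothetical function names:

```python
def throughput_gain_percent(baseline, measured):
    """Relative throughput gain over the baseline, in percent."""
    return 100 * (measured - baseline) / baseline


def speedup(baseline, measured):
    """Throughput expressed as a multiple of the baseline."""
    return measured / baseline


gain = throughput_gain_percent(200, 730)  # 265.0
ratio = speedup(200, 730)                 # 3.65 times the baseline rate
```

A 265% gain therefore corresponds to 3.65 times the sequential throughput, since the gain counts only the increase above the baseline.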

This highlights a key design insight: while single-

server performance is strong, scalability is bounded

by resource saturation. To address this, our archi-

tecture supports distributed deployment with multi-

ple inferencing servers—a path for maintaining low

latency under growing workloads. This forward-

looking scalability is a core strength, enabling adap-

tation to larger, more complex networks. Future work

will explore multi-server conﬁgurations to optimize

the throughput-latency trade-off and fully realize the

system’s scalable potential.

6 CONCLUSIONS

The proposed architecture advances high-speed net-

work intrusion detection through three key contri-

butions. First, the parallel micro-batching inferenc-

ing system achieves up to 730 ﬂows/s—a 265% im-

provement over sequential processing—while main-

taining a low decision time of 5 ms, effectively bal-

ancing throughput and latency. Second, XGBoost


(a) Average detecting time per conﬁguration. (b) Average throughput per conﬁguration.

Figure 5: Performance comparison of detecting time and throughput.

delivers perfect accuracy, precision, recall, and F1-

score, enabling high-conﬁdence detection without

compromising inference speed. Third, the architec-

ture is inherently scalable, supporting distributed de-

ployment across multiple inferencing servers to sus-

tain performance under growing network loads. To-

gether, these results demonstrate a practical, high-

performance NIDS solution for real-time, large-scale

environments.

Future work will explore multi-server scaling

and distributed processing frameworks to further en-

hance throughput. We will also integrate MLOps

practices for automated model monitoring, retrain-

ing, and deployment, ensuring long-term reliabil-

ity. Additionally, federated learning will be inves-

tigated to enable privacy-preserving, collaborative

model training across distributed network nodes, im-

proving adaptability and robustness in heterogeneous

environments.

ACKNOWLEDGEMENTS

This work is part of the project “Real-time Mali-

cious TLS Trafﬁc Detection using a Machine Learn-

ing Classiﬁer,” supported by Keio University and the

APNIC Foundation under the Community Based Re-

search (CBR) scheme (2023–2025). The authors used

AI tools solely for language editing, including proof-

reading for grammar, syntax, and clarity. All research

design, methods, implementation, and results were in-

dependently developed by the authors.

REFERENCES

Abdelmoumin, G., Rawat, D. B., and Rahman, A. (2022).

On the Performance of Machine Learning Models for

Anomaly-Based Intelligent Intrusion Detection Sys-

tems for the Internet of Things. IEEE Internet of

Things Journal, 9(6):4280–4290.

Abdulganiyu, O. H., Ait Tchakoucht, T., and Saheed, Y. K.

(2023). A Systematic Literature Review for Network

Intrusion Detection System (IDS). International Journal of Information Security, 22(5):1125–1162.

Admass, W. S., Munaye, Y. Y., and Diro, A. A. (2024). Cy-

ber security: State of the art, challenges and future di-

rections. Cyber Security and Applications, 2:100031.

Berbecaru, D. G. and Petraglia, G. (2023). TLS-Monitor:

A Monitor for TLS Attacks. In 2023 IEEE 20th

Consumer Communications & Networking Confer-

ence (CCNC), pages 1–6.

Dina, A. S. and Manivannan, D. (2021). Intrusion detection

based on Machine Learning techniques in computer

networks. Internet of Things, 16:100462.

Gite, P., Chouhan, K., Murali Krishna, K., Kumar Nayak,

C., Soni, M., and Shrivastava, A. (2023). ML Based

Intrusion Detection Scheme for various types of at-

tacks in a WSN using C4.5 and CART classiﬁers. Ma-

terials Today: Proceedings, 80:3769–3776.

Gomez, G., Kotzias, P., Dell’Amico, M., Bilge, L., and Ca-

ballero, J. (2023). Unsupervised Detection and Clus-

tering of Malicious TLS Flows. Security and Commu-

nication Networks, 2023(1):3676692.

Kondaiah, C., Pais, A. R., and Rao, R. S. (2024). Enhanced

Malicious Trafﬁc Detection in Encrypted Communi-

cation Using TLS Features and A Multi-Class Classi-

ﬁer Ensemble. Journal of Network and Systems Man-

agement, 32(4):76.

Sahani, N., Zhu, R., Cho, J.-H., and Liu, C.-C. (2023). Ma-

chine Learning-based Intrusion Detection for Smart

Grid Computing: A Survey. ACM Trans. Cyber-Phys.

Syst., 7(2).

Siyyal, S. A., Khuawar, F. Y., Saba, E., Memon, A. L., and

Shaikh, M. R. (2022). Analyzing ML-Based IDS Over

Real-Trafﬁc. International Journal of Innovations in

Science & Technology, 4(3):621–640.

Thakkar, A. and Lohiya, R. (2023). A Review on

Challenges and Future Research Directions for Ma-

chine Learning-Based Intrusion Detection System.

Archives of Computational Methods in Engineering,

30(7):4245–4269.

Theofanous, A., Papadogiannaki, E., Shevtsov, A., and

Ioannidis, S. (2024). Fingerprinting the Shadows: Un-


masking Malicious Servers with Machine Learning-

Powered TLS Analysis. In Proceedings of the ACM

Web Conference 2024, WWW ’24, page 1933–1944,

New York, NY, USA. Association for Computing Ma-

chinery.

Ucci, D., Sobrero, F., Bisio, F., and Zorzino, M. (2021).

Near-Real-Time Anomaly Detection in Encrypted

Trafﬁc using Machine Learning Techniques. In 2021

IEEE Symposium Series on Computational Intelli-

gence (SSCI), pages 01–08.

Yang, L. and Shami, A. (2022). IDS-ML: An open source

code for Intrusion Detection System development us-

ing Machine Learning. Software Impacts, 14:100446.

Yu, L., Tao, J., Xu, Y., Sun, W., and Wang, Z. (2024).

TLS fingerprint for encrypted malicious traffic detection with attributed graph kernel. Computer Networks,

247:110475.

Zang, X., Wang, T., Zhang, X., Gong, J., Gao, P., and

Zhang, G. (2024). Encrypted Malicious Trafﬁc Detec-

tion Based on Natural Language Processing and Deep

Learning. Computer Networks, 250:110598.

Zeng, Z., Xun, P., Peng, W., and Zhao, B. (2024). Toward

identifying malicious encrypted trafﬁc with a causal-

ity detection system. Journal of Information Security

and Applications, 80:103644.
