rare and detectable by significant increments, which
may not apply to datasets with subtle anomalies or
highly skewed distributions.
5 CONCLUSIONS
This paper presents a multivariate autotuning method
that dynamically optimises the five hyperparameters
of the IF algorithm (S, T, F, D, Th), validated on a real
dataset of a water distribution network in 2024. The
proposed procedure allows the model to self-adjust
while achieving a better F1-score and a shorter
processing time than the default values that are
traditionally configured.
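As a rough illustration of what a comparison against the default values involves, the sketch below runs a plain random search over the five hyperparameters and keeps the configuration with the best F1-score, preferring the faster one on ties. It is not the autotuning procedure of this paper: random search is swapped in for simplicity, the names `f1_score`, `random_search` and `SPACE` and the value grids are assumptions made for illustration, and `evaluate` stands for any user-supplied routine that fits an IF model with a given configuration and returns one 0/1 prediction per sample.

```python
import random
import time


def f1_score(y_true, y_pred):
    """F1-score for binary labels, where 1 marks an anomaly."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_search(evaluate, space, y_true, default, n_trials=50, seed=0):
    """Return (best_config, best_f1, best_seconds), starting from `default`.

    `evaluate(config)` is a hypothetical user-supplied callable that fits an
    IF model with `config` and returns one 0/1 prediction per sample.
    """
    rng = random.Random(seed)

    def timed(config):
        start = time.perf_counter()
        y_pred = evaluate(config)
        return f1_score(y_true, y_pred), time.perf_counter() - start

    best = dict(default)
    best_f1, best_secs = timed(best)
    for _ in range(n_trials):
        candidate = {name: rng.choice(values) for name, values in space.items()}
        f1, secs = timed(candidate)
        # Prefer higher F1; on equal F1 prefer the faster configuration.
        if f1 > best_f1 or (f1 == best_f1 and secs < best_secs):
            best, best_f1, best_secs = candidate, f1, secs
    return best, best_f1, best_secs


# Hypothetical search space: the value grids are illustrative only.
SPACE = {
    "S": [64, 128, 256, 512],
    "T": [25, 50, 100, 200],
    "F": [1, 2, 3],
    "D": [4, 8, 12],
    "Th": [0.5, 0.6, 0.7],
}
```

Because the search starts from the default configuration, the returned F1-score can never be lower than the default one on the evaluation data; whether it generalises depends on how `evaluate` splits the data.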
As an automatic procedure, it can be launched
periodically, ensuring that the IF model will be
adjusted to the data conditions at that instant. The
method provides a robust, adaptive and scalable
solution for real-time systems. Above all, it removes
the dependence on the skill and knowledge of the
expert fitting the model, providing a fully automatic
and objective methodology. Compared
to approaches such as Extended IF or Deep IF, it
stands out for its simplicity and simultaneous
optimisation of all hyperparameters. It does not
require the introduction of new parameters, as CIIF
does, which makes the process more complex, nor
does it add high computational complexity, as
LSTM-based approaches do, since self-adjustment can
be performed while IF performs detection. Finally,
unlike HIF or SCiForest, the proposed method allows
the hyperparameters to be adjusted continuously,
which helps the algorithm maintain its accuracy as
the data change.
In the short term, future research will address the
sensitivity of the method to noisy or incomplete data,
incorporating advanced data quality assurance
techniques. In addition, we will validate this
procedure on other critical infrastructure datasets,
such as power grids, to extend its applicability and to
study whether the hyperparameters behave similarly
across domains. In the medium
term, the aim is to incorporate techniques that detect
the ideal moment to recalculate the hyperparameters,
for example by detecting a degradation in accuracy
due to changes in data trends.
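One simple way such a recalculation trigger could look is sketched below: a rolling F1-score over the most recent labelled observations is compared with a baseline, and retuning is advised when it falls below a fraction of that baseline. This is only an assumption about how the idea might be realised, not a result of this paper; the class name `RetuneTrigger`, the window length and the tolerance are hypothetical, and the sketch presumes that ground-truth labels eventually become available for each observation.

```python
from collections import deque


class RetuneTrigger:
    """Advise retuning when the rolling F1 drops below tolerance * baseline."""

    def __init__(self, baseline_f1, window=200, tolerance=0.9):
        self.baseline_f1 = baseline_f1
        self.tolerance = tolerance
        # Keeps only the latest `window` (y_true, y_pred) pairs.
        self.pairs = deque(maxlen=window)

    def rolling_f1(self):
        """F1 over the current window, or None if it contains no anomalies."""
        tp = sum(1 for t, p in self.pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in self.pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in self.pairs if t == 1 and p == 0)
        if tp + fp + fn == 0:
            return None  # nothing true or predicted anomalous: F1 undefined
        return 2 * tp / (2 * tp + fp + fn)

    def update(self, y_true, y_pred):
        """Add one labelled observation; return True if retuning is advised."""
        self.pairs.append((y_true, y_pred))
        if len(self.pairs) < self.pairs.maxlen:
            return False  # window not yet full
        f1 = self.rolling_f1()
        if f1 is None:
            return False
        return f1 < self.tolerance * self.baseline_f1
```

A caller would feed each observation to `update` as its label arrives and relaunch the autotuning procedure whenever it returns `True`. Windows without any anomaly are skipped on purpose, since anomalies are rare and an undefined F1-score should not trigger a retune.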
ACKNOWLEDGEMENTS
This work has been supported by Project PID2023-
152566OB-I00 "Preventive Maintenance of digital
infrastructures through the application of Artificial
Intelligence for the diagnosis and prediction of
anomalies (PreMAI)" funded by
MICIU/AEI/10.13039/501100011033 and by ERDF, EU; project
UAIND22-01B - "Adaptive control of urban supply
systems" of the Vice-rectorate for Research of the
University of Alicante; and a project co-financed by the
Valencian Institute for Competitiveness and
Innovation (IVACE+i) and eligible for co-financing
by the European Union (Exp. INNTA3/2022/3).
REFERENCES
Bischl, B., Binder, M., Lang, M., Pielok, T., Richter, J.,
Coors, S., ... & Lindauer, M. (2023). Hyperparameter
optimization: Foundations, algorithms, best practices,
and open challenges. Wiley Interdisciplinary Reviews:
Data Mining and Knowledge Discovery, 13(2), e1484.
Dhouib, H., Wilms, A., & Boes, P. (2023). Distribution and
volume based scoring for Isolation Forests. arXiv
preprint arXiv:2309.11450.
Hannák, G., Horváth, G., Kádár, A., & Szalai, M. D. (2023).
"Bilateral-Weighted Online Adaptive Isolation Forest
for Anomaly Detection in Streaming Data." Statistical
Analysis and Data Mining: The ASA Data Science
Journal, 16(3), 215-223.
Hariri, S., Kind, M. C., & Brunner, R. J. (2019). "Extended
Isolation Forest." IEEE Transactions on Knowledge
and Data Engineering, 33(4), 1479-1489.
Karczmarek, P., Kiersztyn, A., Pedrycz, W., & Al, E.
(2020). "K-Means-Based Isolation Forest."
Knowledge-Based Systems, 195, 105659.
Lee, C. H., Lu, X., Lin, X., Tao, H., Xue, Y., & Wu, C.
(2020, April). Anomaly detection of storage battery
based on isolation forest and hyperparameter tuning. In
Proceedings of the 2020 5th International Conference on
Mathematics and Artificial Intelligence (pp. 229-233).
Liu, F. T., Ting, K. M., & Zhou, Z. H. (2008). "Isolation
Forest." In 2008 Eighth IEEE International Conference
on Data Mining (pp. 413-422). IEEE.
Liu, F. T., Ting, K. M., & Zhou, Z. H. (2010). "On
Detecting Clustered Anomalies Using SCiForest." In
Joint European Conference on Machine Learning and
Knowledge Discovery in Databases (pp. 274-290). Springer.
Nalini, M., Yamini, B., Ambhika, C., & Siva Subramanian,
R. (2024). "Enhancing Early Attack Detection: Novel
Hybrid Density-Based Isolation Forest for Improved
Anomaly Detection." International Journal of Machine
Learning and Cybernetics, 1-19.
Priyanto, C. Y., & Purnomo, H. D. (2021, September).
Combination of isolation forest and LSTM autoencoder
for anomaly detection. In 2021 2nd International
Conference on Innovative and Creative Information
Technology (ICITech) (pp. 35-38). IEEE.
Xu, H., Pang, G., Wang, Y., & Wang, Y. (2023). "Deep
Isolation Forest for Anomaly Detection." IEEE
Transactions on Knowledge and Data Engineering,
35(12), 12591-12604.