Explainable Outlier Detection Using Feature Ranking for k-Nearest Neighbors, Gaussian Mixture Model and Autoencoders

Lucas Krenmayr, Markus Goldstein

2023

Abstract

Outlier detection is the process of identifying individual data points that deviate markedly from the majority of the data. Typical applications include intrusion detection and fraud detection. In contrast to the well-known classification tasks in machine learning, outlier detection commonly relies on unsupervised learning techniques applied to unlabeled data. Recent algorithms focus mainly on detecting outliers but do not provide any insight into what caused the outlierness. This paper therefore presents two model-dependent approaches that provide explainability in multivariate outlier detection through feature ranking. The approaches are based on the k-nearest neighbors and Gaussian Mixture Model algorithms. In addition, they are compared to an existing method based on an autoencoder neural network. For a qualitative evaluation, and to illustrate the strengths and weaknesses of each method, they are applied to one synthetically generated and two real-world data sets. The results show that all methods can identify the most relevant features in synthetic and real-world data. The explainability is also found to depend on the model used: the Gaussian Mixture Model is strongest at explaining outliers that violate feature correlations, whereas the k-nearest neighbors and autoencoder approaches are more general and suitable for data that does not follow a Gaussian distribution.
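
To make the idea of feature ranking for a distance-based detector concrete, the following is a minimal sketch and not the authors' method, which the abstract does not specify. It assumes the outlier score is the mean Euclidean distance to the k nearest neighbors and that a feature's relevance is its share of the mean squared difference to those neighbors. The function name `knn_feature_ranking`, the parameter `k=5`, and the synthetic data are all hypothetical choices made for illustration.

```python
import math
import random


def knn_feature_ranking(data, query, k=5):
    """Return (score, contributions, ranking) for `query` against `data`.

    Illustrative assumption: a feature's relevance is its share of the mean
    squared difference between the query and its k nearest neighbors.
    `data` must be non-empty and should not contain the query point itself.
    """
    # Per-feature squared differences between the query and every data point.
    sq_diffs = [[(x - q) ** 2 for x, q in zip(row, query)] for row in data]
    # Keep the k points with the smallest squared Euclidean distance.
    nearest = sorted(sq_diffs, key=sum)[:k]
    # Outlier score: mean Euclidean distance to the k nearest neighbors.
    score = sum(math.sqrt(sum(row)) for row in nearest) / len(nearest)
    # Mean squared difference per feature, normalized so the shares sum to 1.
    n_features = len(query)
    raw = [sum(row[j] for row in nearest) / len(nearest) for j in range(n_features)]
    total = sum(raw)
    contributions = [r / total if total > 0 else 0.0 for r in raw]
    # Feature indices ordered from most to least relevant.
    ranking = sorted(range(n_features), key=lambda j: contributions[j], reverse=True)
    return score, contributions, ranking


# Synthetic example: standard-normal data in three features and a query point
# that deviates only in the feature at index 1.
random.seed(0)
data = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
outlier = [0.0, 8.0, 0.0]
score, contributions, ranking = knn_feature_ranking(data, outlier, k=5)
print("score:", round(score, 2))
print("feature ranking (most relevant first):", ranking)
```

Because the query point deviates only in the second feature, that feature should account for most of the distance and appear first in the ranking. A per-feature decomposition of this kind cannot capture outliers that only violate correlations between features, which is the case the abstract attributes to the Gaussian Mixture Model approach.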

Paper Citation


in Harvard Style

Krenmayr L. and Goldstein M. (2023). Explainable Outlier Detection Using Feature Ranking for k-Nearest Neighbors, Gaussian Mixture Model and Autoencoders. In Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-626-2, pages 245-253. DOI: 10.5220/0011631900003411


in Bibtex Style

@conference{icpram23,
author={Lucas Krenmayr and Markus Goldstein},
title={Explainable Outlier Detection Using Feature Ranking for k-Nearest Neighbors, Gaussian Mixture Model and Autoencoders},
booktitle={Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2023},
pages={245-253},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011631900003411},
isbn={978-989-758-626-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Explainable Outlier Detection Using Feature Ranking for k-Nearest Neighbors, Gaussian Mixture Model and Autoencoders
SN - 978-989-758-626-2
AU - Krenmayr L.
AU - Goldstein M.
PY - 2023
SP - 245
EP - 253
DO - 10.5220/0011631900003411