Active Directory Kerberoasting Attack: Detection using Machine

Learning Techniques

Luk

s Kotlaba, Simona Buchoveck

a and R

obert L

orencz

Department of Information Security, Faculty of Information Technology, Czech Technical University in Prague,

Czech Republic

Keywords:

MS Active Directory, Machine Learning, Kerberoasting, Attack Detection, Cybersecurity.

Abstract:

Active Directory is a prevalent technology used for managing identities in modern enterprises. As a variety

of attacks exist against Active Directory environment, its security monitoring is crucial. This paper focuses

on detection of one particular attack - Kerberoasting. The purpose of this attack is to gain access to service

accounts’ credentials without the need for elevated access rights. The attack is nowadays typically detected

using traditional ”signature-based” detection approaches. Those, however, often result in a high number of

false alerts. In this paper, we adopt machine learning techniques, particularly several anomaly detection al-

gorithms, for detection of Kerberoasting. The algorithms are evaluated on data from a real Active Directory

environment and compared to the traditional detection approach, with a focus on reducing the number of false

alerts.

1 INTRODUCTION

Active Directory (AD) is a directory service based

on Lightweight Directory Access Protocol (LDAP).

It stores information about objects on a domain net-

work, such as users, groups, computers, and many

others, together with their attributes in a hierarchi-

cal structure. It is typically used for a broad range

of identity-related services in a domain; basic exam-

ples are authentication, authorization, and accounting

(AAA) services for users and computers. (Desmond

et al., 2013)

Although AD is a proprietary service developed

by Microsoft, it represents an essential part of enter-

prise networks, integrating both Windows and non-

Windows devices. Nowadays, the usage of AD tech-

nology spans even further, to hybrid and public cloud

environments with products such as Azure AD or

AWS Directory Service.

Understanding what data AD stores, and consid-

ering its signiﬁcance, it is not surprising that it repre-

sents an interesting target for cyber attackers. Once

adversaries get access to AD, the consequences can

be fatal, often resulting in full domain compromise.

Our research focuses on Kerberoasting attack. In

this attack, adversaries target service accounts with

the aim to crack their passwords. If successful, the ob-

tained credentials may enable privilege escalation and

further lateral movement to other domain accounts.

In response, organizations put a lot of effort into

securing their AD environments and developing coun-

termeasures. Besides preventive actions in the do-

main administration, real-time security monitoring

and attack detection are the essential defense mech-

anisms against AD threats.

Traditional detection of AD attacks, including

Kerberoasting, is based on a rule-based analysis of

relevant data. Detection rules contain speciﬁc con-

ditions that are checked against Windows logs col-

lected from machines over the network. If the log data

matches the deﬁned conditions, the rule is triggered,

and a security alert is generated.

However, in practice, it turns out that rule-based

detection has its limitations and is not always sufﬁ-

cient. Firstly, static rules may have high false alarm

ratio (FAR). The rules should have the ability to catch

malicious activity even if the attackers use legitimate

computers and user accounts to perform their actions.

On the other hand, such rules may trigger on regu-

larly occurring events with these accounts, e.g., login

failures due to wrong passwords or actions carried by

administrators. A high number of false alerts may re-

sult in a real alert being overlooked or ignored.

Rule-based detection may also have other down-

sides. Since the rules are deﬁned statically, they of-

ten contain various manually deﬁned thresholds, con-

376

Kotlaba, L., Buchovecká, S. and Lórencz, R.

Active Directory Kerberoasting Attack: Detection using Machine Learning Techniques.

DOI: 10.5220/0010202803760383

In Proceedings of the 7th International Conference on Information Systems Security and Privacy (ICISSP 2021), pages 376-383

ISBN: 978-989-758-491-6

stants, or signatures. These are difﬁcult to determine

in the creation process, but on the other hand, rela-

tively easy for an attacker to overcome.

Adversaries may modify their attack procedure to

evade detection, such as limit the number of attempts

to ﬁt under a numeric threshold used in the detection

rule, change various values in the attack tools to avoid

signature detection, or spread the attack over a more

extended time period to limit the rate of attempts per

time unit.

In our previous research (Kotlaba et al., 2020), we

developed a set of signature-based rules that can be

used to detect Kerberoasting. We found out that the

FAR of the designed rules may vary, especially in di-

verse environments with many users. Non-standard

approaches that used honeypots or PowerShell mon-

itoring for detection, performed better compared to

the rules based only on the number of service ticket

requests. However, those carry on implementation

overhead and therefore may not be suitable for usage

in every environment.

In this paper, we propose applying machine learn-

ing (ML) techniques for detecting Kerberoasting. ML

may improve existing signature-based methods, help

to overcome their limitations, improve detection accu-

racy by reducing FAR, and further enhance detection

capabilities.

Multiple anomaly detection algorithms are dis-

cussed, especially in terms of their suitability and efﬁ-

ciency for detection of Kerberoasting. Several conﬁg-

urations of ML models are compared. The evaluation

is done using log data originating from an AD envi-

ronment of a real organization, which is diverse in its

structure and contains hundreds of daily users.

The paper is structured as follows: Section 2 con-

tains background information on Kerberoasting at-

tack, followed by a discussion about the possibilities

of its detection. Further, details of the used ML al-

gorithms are provided. Section 3 is dedicated to the

existing research related to the topic of AD attack de-

tection, as well as applications of ML techniques in

cybersecurity. Our proposed detection approach is

presented in Section 4, together with a summary of

achieved results. Section 5 concludes the paper.

2 BACKGROUND

2.1 Kerberoasting Attack

Kerberoasting attack was ﬁrst presented by Tim

Medin to attack the credentials of a remote ser-

vice without sending any trafﬁc to the service itself.

(Medin, 2014)

In Mitre ATT&CK framework, Kerberoasting is

listed as a sub-technique of Steal or Forge Kerberos

Tickets technique, which falls under Credential Ac-

cess tactics. The attack differs from the other two sub-

techniques (Golden and Silver Ticket) in the level of

permissions required. Kerberoasting can be executed

without the need for a local administrator account or

an account having higher permissions in the domain.

A valid domain account or the ability to sniff trafﬁc

within a domain is enough for an attacker to carry on

Kerberoasting. (The Mitre Corporation, 2020)

Kerberoasting takes advantage of how service ac-

counts leverage Kerberos authentication. Kerberos

is the default protocol used within an AD domain.

It allows users to access services on the network by

merely using tickets (instead of passwords), generated

for every session and used over a short time period.

Users can access remote services by simply re-

questing a service ticket from a domain controller

(DC), which serves as the key distribution center

(KDC) in the AD implementation of Kerberos. When

clients request service tickets for given services from

a DC, they use unique identiﬁers called service prin-

cipal names (SPNs). To enable Kerberos authentica-

tion, it is required that SPNs are registered in AD with

at least one service logon account (an account specif-

ically dedicated to run the service). (The Mitre Cor-

poration, 2020)

The course of Kerberoasting attack can be sum-

marized in the following steps:

1. Services are identiﬁed by their SPNs registered

in AD. An attacker that controls a valid user ac-

count may obtain these identiﬁers by an enumera-

tion technique called SPN Scanning.

2. The attacker requests a service ticket by specify-

ing the SPN value to a DC. In response, the DC

presents the service ticket, part of which is en-

crypted with the target service account’s password

hash. In case a weak cipher suite is negotiated,

such as RC4-HMAC-MD5, the service account’s

NT password hash is used to encrypt the service

ticket. There is no communication with the sys-

tem hosting the target service.

3. NT password hash is prone to brute-force pass-

word cracking. The cracking can be done ofﬂine,

without any communication to the target service

or the DC, meaning no events indicating failed lo-

gins are recorded. If the ticket is exported from

memory, the cracking can be done on a different

machine, outside of the target domain.

4. If the service account’s password was not com-

plex enough, the attacker would recover it in

plaintext.

Active Directory Kerberoasting Attack: Detection using Machine Learning Techniques

377

In order for Kerberoasting attack to be success-

ful, it is enough that service accounts are set-up

with a weak password (typically not long enough or

dictionary-based). The attack also requires older en-

cryption types to be enabled. That is, however, the

default behavior, and often a necessity for backward

compatibility. Therefore, the Kerberoasting attack re-

quirements may be, unfortunately, easily fulﬁlled by

underrated administration of service accounts in or-

ganizations. Furthermore, if these accounts are over-

permissioned, an attacker controlling such an account

may be able to access sensitive data or simply pivot

through the domain.

There are plenty of tools available to perform Ker-

beroasting beyond proof of concept. Furthermore,

SPN Scanning and subsequent requesting of service

tickets can be also performed using PowerShell com-

mands only. (Metcalf, 2017b)

Kerberoasting technique uses Kerberos authenti-

cation as designed and intended, without exploiting

any part of the process. This makes the attack chal-

lenging to detect because service tickets are requested

regularly as users access resources in the domain. To

detect Kerberoasting, it is necessary to distinguish be-

tween legitimate and malicious requests.

Kerberos service ticket requests are logged in

Windows Event ID 4769 A Kerberos service ticket

was requested (both for success and failure), which

is generated every time DC receives a service ticket

request. Valuable information contained in this event

includes account name, target service, source IP ad-

dress, encryption type used, and failure code. (Mi-

crosoft Corporation, 2017)

There is no default way of detecting Kerberoast-

ing, and thus building custom detection rules is neces-

sary. As proposed by (Metcalf, 2017b), the detection

logic can be built on top of the event 4769, using the

following indicators:

• excessive requests for different services with a

small-time difference from the same user,

• Kerberos service ticket requests with a weak en-

cryption (such as RC4).

We utilized these indicators in our previous re-

search (Kotlaba et al., 2020), where we developed

several static rules applicable for detection of Ker-

beroasting. The rules were set to trigger an alert if

there was higher number of service ticket requests

with weak encryption types from a particular source

than a speciﬁc predeﬁned threshold.

However, these rules require that the threshold

would be set to a reasonable value, which may be

challenging to determine. The rules do not contain

any built-in capability of reﬂecting normal vs. abnor-

mal number of requests in the domain.

To overcome these limitations, we propose utiliz-

ing machine learning in addition to the used detection

logic with the aim to gain the self-adopting capability

and improve detection accuracy.

2.2 Machine Learning Algorithms

Detection of Kerberoasting technique represents a

problem of differentiating between normal and un-

usual behavior in terms of service ticket requests. In

our research, we have focused on the possibility of

solving this problem by applying machine learning.

Given the nature of our problem, we utilized ML al-

gorithms suitable for anomaly detection.

Anomaly detection is intended for detecting ab-

normal or unusual observations in the data. As de-

scribed in (Pedregosa et al., 2020), we may further

distinguish between two slightly different types of

anomaly detection:

• outlier detection - unsupervised anomaly detec-

tion, where the training data contains outliers

which are deﬁned as observations that are far from

the others,

• novelty detection - semi-supervised anomaly de-

tection, where the training data is not polluted by

outliers, and we want to decide whether a new ob-

servation is an outlier.

In our approach, we have chosen one representa-

tive algorithm of each kind, in particular One-Class

Support Vector Machine (SVM) for novelty detection,

and Local Outlier Factor (LOF) for outlier detection.

2.2.1 One-Class SVM

One-Class SVM is an algorithm introduced by

(Sch

olkopf et al., 1999), developed as an extension

of SVM for the purpose of novelty detection. SVM is

a method that performs classiﬁcation by constructing

hyperplanes in a multidimensional space that separate

points of different classes. One-class classiﬁcation

differs from standard classiﬁcation in the sense that

it learns data from one class only - the normal class,

while there are no or few samples from the other class

- the abnormal class.

To identify suspicious observations, the algorithm

estimates a distribution that encompasses the major-

ity of the data and then labels the data points that lie

far from it as suspicious, with respect to a suitable

metric. The solution is built by estimating a probabil-

ity distribution function which makes most of the ob-

served data more likely than the rest, and a decision

rule that separates these observations by the largest

possible margin. If the new data points lay outside

ICISSP 2021 - 7th International Conference on Information Systems Security and Privacy

378

of the determined margin, we can say that they are

abnormal with given conﬁdence in our assessment.

One-Class SVM can be used with multiple param-

eters, out of which the most important is the choice of

a kernel and a scalar parameter ν. Usually, the ra-

dial basis function (RBF) kernel is chosen due to its

non-linear properties. The ν parameter, also known

as the margin of the One-Class SVM, corresponds to

the probability of ﬁnding a new observation outside

the frontier, which can also be interpreted as the pro-

portion of outliers we expect to ﬁnd in our data. (Pe-

dregosa et al., 2020)

2.2.2 Local Outlier Factor

Local Outlier Factor (LOF) is an unsupervised algo-

rithm proposed by (Breunig et al., 2000), which can

be used to ﬁnd outliers in a given dataset. The sep-

aration is made based on a score that represents the

likelihood of a particular data point being an outlier.

LOF is based on the concept of local density. The

density is obtained for every data point by estimat-

ing its distance from k-nearest neighbors, where the

distance is computed according to the chosen metric.

A normal instance is expected to have a local density

similar to that of its neighbors, while an abnormal is

expected to have much lower density. Thus, the main

idea is to detect the samples that have a signiﬁcantly

lower density than their neighbors. (Pedregosa et al.,

2020)

The advantage of LOF is that due to the local ap-

proach, it is able to identify outliers that would not

be identiﬁed in a global perspective. The focus on

locality can be directly inﬂuenced by setting the num-

ber of neighbors for the LOF calculation. By setting a

small value, LOF considers only nearby points, which

may be prone to errors when having much noise in the

data. On the other hand, setting a large number may

cause local outliers to be missed.

3 RELATED WORK

There are multiple publications and existing research

that deals with detecting attacks against AD or Win-

dows environment. The majority focuses on elements

such as signatures and characteristics of attacks that

can be used to build detection rules.

(Metcalf, 2017a) mentions various artifacts and

signatures that can be found in Windows Security

events and are usable for detection of multiple AD at-

tack scenarios, especially attacks related to Kerberos

protocol, including Kerberoasting.

CERT-EU published whitepaper Kerberos Golden

Ticket Protection (Soria-Machado et al., 2016), where

detection of Golden Ticket is proposed based on mon-

itoring speciﬁc string signatures in the events and the

lifetime of Kerberos tickets.

However, a common disadvantage of the men-

tioned approaches is that the attacks would remain

undetected if attackers were able to evade the used

signatures, e.g., by changing the tools’ ﬁlenames or

timing of the attack. Furthermore, false positives can

occur if legitimate administrators use commands that

match the deﬁned signatures during their daily opera-

tions.

Apart from detection methods based on static

rules and signatures, there is research available at-

tempting to utilize machine learning to discover

anomalous behavior in Windows environments and

detect attack techniques related to lateral movement

in AD infrastructure.

(Hsieh et al., 2015) propose a threat detection

method built on accounts’ behavior sequences. For

each account, the probability model based on Markov

chains is built and used to detect abnormal behavior.

By testing this approach, it was able to give the best

performance of about 66.6% recall and 99.0% preci-

sion rates when combined with prior knowledge.

(Goldstein et al., 2013) suggest monitoring abnor-

mal user behavior by utilizing unsupervised learning,

in particular with the k-NN algorithm. The algorithm

was used to ﬁnd anomalies in Windows authentica-

tion logs. Besides misconﬁgurations, it was able to

ﬁnd some interesting insights about the infrastructure.

(Matsuda et al., 2018) propose a method for out-

lier detection using unsupervised learning with the

One-Class SVM algorithm. The detection is based on

Windows Events 4674 An operation was attempted

on a privileged object and 4688 A new process has

been created, and focused on detecting attacks that

require Domain Administrator privilege. The chosen

approach showed high recall and precision ratios.

(Uppstr

omer and R

aberg, 2019) utilized a super-

vised machine learning approach for detection of lat-

eral movement. Attack activities related to lateral

movement included AD enumeration, Pass the Hash

and Pass the Ticket attacks. Several classiﬁers were

compared on a semi-synthetic dataset. In the men-

tioned experiments, high accuracy was observed with

all tested classiﬁers, but they performed differently in

terms of recall and precision.

Anomaly-based methods based on clustering and

principal component analysis were applied and com-

pared by (Meijerink, 2019) for detection of lateral

movement in Windows environments. The results

showed that clustering generally performed better, but

Active Directory Kerberoasting Attack: Detection using Machine Learning Techniques

379

with a relatively high false-positive ratio, which fur-

ther reduction is desired.

To our best knowledge, there is currently no re-

lated research published that would study the possi-

bilities of adopting machine learning techniques for

detection of Kerberoasting attack.

4 PROPOSED APPROACH

Our research aims to utilize machine learning for de-

tecting Kerberoasting attack. Our idea is to apply al-

gorithms suitable for anomaly detection to identify

unusual patterns, especially a higher number of ticket

requests for non-typical services from sources that

usually do not perform this kind of activity. If identi-

ﬁed, such activity is worth investigating for the possi-

bility of Kerberoasting attack execution. To validate

our approach, we compared Kerberoasting detection

based on static rules to several models based on ML

techniques on real log data.

4.1 Implementation

4.1.1 Technology

For practical implementation, we have chosen Splunk

platform, as it offers capabilities for both developing

static detection rules and using ML techniques. Also,

Splunk is commonly used by organizations for secu-

rity monitoring purposes, which might help to apply

our results in real environments easily.

Splunk is a platform for analyzing machine data.

From a technical perspective, Splunk ensures collect-

ing, parsing, and indexing Windows Events and other

log sources. Ingested data can be queried by using its

proprietary Search Processing Language (SPL) syn-

tax. (Splunk Inc., 2020)

ML support is provided by Splunk Machine

Learning Toolkit add-on, which provides an inter-

face for using ML algorithms from the Python library

Scikit-learn (Pedregosa et al., 2011), directly within

the Splunk platform. Together with the underlying

Scikit-learn library, the add-on supports both One-

Class SVM and LOF algorithms that we have chosen

to implement our approach.

4.1.2 Dataset Preparation

The dataset used for the experiments is non-synthetic,

originating from an AD environment of a real orga-

nization with hundreds of daily users. It consists of

Windows Events 4769 collected over the time span of

six weeks.

As the event 4769 belongs to one of the most

numerous events logged in a Windows domain, the

dataset was further ﬁltered. Only ticket requests with

weak encryption types were preserved. Further, re-

quests made from a domain controller itself, or re-

quests for service names ending with a dollar sign

(that identify computer accounts) were ﬁltered out.

Filtered data were then split into bins, where a

single bin represents a time frame of one day. Then

statistics were calculated across the events by day, IP

address, and source username. As a result, one data

point in the dataset represents cumulative statistics of

ticket requests per unique combination of username

and IP address, per one day. Having aggregated statis-

tics per day is more suitable, as the amount of activity

differs signiﬁcantly over the day (business vs. out-of-

business hours).

For the purpose of applying ML techniques, the

dataset was split into training, validation, and testing

dataset based on time as follows:

• training – 4 weeks, 6265 events, 0 malicious,

• validation – 1 week, 1601 events, 1 malicious,

• testing – 1 week, 1489 events, 3 malicious.

Training data contains only regular trafﬁc from the

environment. A few crafted malicious events, which

are to represent Kerberoasting activity, were injected

into validation and testing datasets. Those events

were designed to look very similar to regular events,

each with a different number of services requested -

both higher and lower values were used, compared to

the numbers observed in the regular events.

4.1.3 Feature Engineering

Since detection of Kerberoasting is based solely on

Event 4679, the number of possible features usable

for ML models is limited by the ﬁelds contained in

this event. The most important ﬁelds for detection of

Kerberoasting identify the source (Account Name and

Client Address) and the service for which a ticket was

requested (Service Name/ID).

Extracting features from Windows events is prob-

lematic due to the small number of available ﬁelds,

and the categorical nature of their values. That means

the majority of ﬁelds cannot be used directly as fea-

tures in most ML models, as those usually require nu-

meric features.

Account Name values cannot be directly con-

verted to a numeric equivalent. Furthermore, this ﬁeld

may theoretically contain an unlimited number of dif-

ferent values, thus encoding is not the option either.

In environments where users choose their own user-

names, we may attempt to obtain numeric or statisti-

cal values, such as username length. On the contrary,

ICISSP 2021 - 7th International Conference on Information Systems Security and Privacy

380

in environments where standardized naming conven-

tion is in place, the account’s properties may be ex-

tracted, such as the type of the account, i.e., personal,

non-personal, or system account.

Service Name ﬁeld is quite similar to Account

Name. Depending on the naming convention used for

service identiﬁers, features identifying service prop-

erties or its purpose may be extracted.

Client Address ﬁeld typically contains an IPv4 ad-

dress. Although IP addresses seem to be numeric,

they cannot be treated as such. An IP address rep-

resented as a 32-bit binary or decimal number does

not represent an ordinal value.

Common approach of representing IP addresses is

obtaining location information and representing it as

GPS coordinates. However, this is not applicable in

our scenario, as we are dealing with private IP ad-

dresses. In environments with strict network segmen-

tation, where different IP subnets are assigned to dif-

ferent network segments, IP addresses can be labeled

based on their segment, which may reveal valuable

information about the type of source computer.

Apart from using existing ﬁelds, several numeric

ﬁelds can be calculated. An example would be the

number of services requested for a speciﬁc source,

which is the most important indicator for Kerberoast-

ing detection. Moreover, the number may be split into

several numeric features, representing a count of ser-

vice tickets per service type, if we were able to distin-

guish service types based on their identiﬁers.

Based on the properties of our dataset and the dis-

cussion above, we selected the following features for

use with ML models in our experiments:

• number of requests for distinct services from a

particular source,

• number of requests for distinct services from a

particular source split by service type (applica-

tion, database, etc.),

• type of source account requesting the ticket (per-

sonal, non-personal, system account, etc.).

4.2 Experiments and Results

Experiments were conducted with the goal to deter-

mine the best combination of features and values of

hyperparameters for the ML models that can be used

to solve the given problem. To achieve that, the fol-

lowing methodology was used:

1. ML algorithms were ﬁtted on training data with

different values of hyperparameters. Their inﬂu-

ence on the output was measured on the valida-

tion dataset. Appropriate values of hyperparame-

ters were selected for every algorithm.

2. Models with the selected conﬁguration of hyper-

parameters were applied to the dataset using dif-

ferent combinations of features. The best combi-

nation of features was chosen for every algorithm.

3. The algorithms were evaluated and compared on

the testing dataset, using the best-determined con-

ﬁguration for every algorithm.

4.2.1 Static Threshold Rule

Kerberoasting can be detected by a static rule contain-

ing a numeric threshold for the number of requested

service tickets, as proposed in (Kotlaba et al., 2020).

The key is to deﬁne the right threshold, as this value

greatly inﬂuences the number of false alerts, as well

as the detection sensitivity of the rule.

Based on the distribution of the number of re-

quested services in the training dataset, the threshold

value for service count was set to 15. This means that

an alert would be generated for every data point with

a higher number of service requests. Table 1 summa-

rizes the results of the rule applied to the datasets.

Table 1: Detection results for the static rule.

Dataset Training Test

TP 0.0% 0.1%

TN 98.2% 97.7%

FP 1.8% 2.1%

FN 0.0% 0.1%

4.2.2 One-Class SVM

As described in Section 2, One-Class SVM requires

choice of a kernel and the ν parameter. Our mea-

surements with different kernels unambiguously con-

ﬁrmed the advantages of the RBF kernel. The γ pa-

rameter, which is a coefﬁcient for the RBF kernel,

seems to be greatly inﬂuencing the number of FP re-

sults.

For the experiment with different feature combi-

nations, the ν parameter was set to the value of 0.001,

as we expect approximately 0.1% of malicious events

in the validation dataset. The γ parameter of the RBF

kernel was adjusted experimentally for every model

due to its sensitivity. Other parameters were kept at

their default values.

The use of different feature combinations greatly

inﬂuences the results of One-Class SVM. The pri-

mary indicator - the number of requested services was

present in all the combinations. Adding one more fea-

ture, such as service count by type, or the type of user,

helped to reduce the number of false positives com-

pared to the model ﬁtted only on aggregated service

count.

Active Directory Kerberoasting Attack: Detection using Machine Learning Techniques

381

Table 2: Comparison of the approaches for Kerberoasting detection.

Static rule One-Class SVM Local Outlier Factor

TP 2 0.1% 3 0.2% 1 0.1%

TN 1455 97.7% 1452 97.5% 1457 97.8%

FP 31 2.1% 34 2.3% 29 2.0%

FN 1 0.1% 0 0.0% 2 0.1%

On the other hand, when the service count was

used as scaled, its importance among other features

was not captured, and the model performed worse.

The best results achieved the model using features

representing both source user and target service types.

4.2.3 Local Outlier Factor

LOF is an unsupervised algorithm, and as imple-

mented in Splunk, it cannot be ﬁtted on training data

and saved as a model. It is meant to be applied di-

rectly with a particular conﬁguration of hyperparam-

eters to a dataset that is to be scanned for outliers.

The output of LOF varies signiﬁcantly for differ-

ent counts of neighbors considered. Too big focus on

locality was notable for values lower than 20, but on

the other hand, there was almost no difference in FP

detections on a scale from 20 neighbors to all points

in the dataset. For the experiments with different fea-

ture combinations, the number of considered neigh-

bors was set to all data points.

Another parameter inﬂuencing the output is the

expected contamination of the dataset. This parame-

ter seems to have linear impact on the number of FPs.

However, when set to a value too low, the malicious

event was not evaluated as an outlier.

In further experiments, we were changing the way

local density is computed. However, the parameters

as metric, algorithm, or even leaf size had little impact

on the output of LOF models.

Surprisingly, results of LOF showed almost no

sensitivity to different combinations of features used.

We measured consistent numbers of false positives

among the output of different conﬁgurations. Due

to this property, the same features as with One-Class

SVM were chosen for the comparison.

4.2.4 Comparison

The models with the best-determined conﬁgurations

and the static threshold rule were all applied to the

testing dataset. The comparison of the results is pre-

sented in Table 2.

In the static rule, the numeric threshold for the

number of requested services was set to a relatively

high value (15) to reduce FP detections. However, this

resulted in one malicious event not being detected,

as the number of requested services in the event was

lower than the threshold set in the rule.

One-Class SVM model was able to detect all the

three malicious events, but with a slightly higher num-

ber of false-positive detections. However, if we were

to use the static rule with a lower threshold set - to

be able to detect all three malicious events - the num-

ber of false-positive detections would be signiﬁcantly

higher than with the One-Class SVM model. This

is illustrated in the Table 3, which demonstrates the

comparison of results between the static rule with the

threshold value set to 5 and One-Class SVM.

Table 3: Use of the static rule with a lower threshold.

Static rule One-Class SVM

TP 3 0.2% 3 0.2%

TN 1342 90.1% 1452 97.5%

FP 144 9.7% 34 2.3%

FN 0 0% 0 0%

Local Outlier Factor in the used conﬁguration

yielded the least number of false positives, but on the

other hand, it also missed two malicious events. This

could be avoided by increasing the value of the con-

tamination parameter, but in that case an increase in

the number of FP events is also expected.

5 CONCLUSIONS

In this paper, we proposed adopting machine learn-

ing techniques to improve detection of Kerberoasting

attack. Adversaries use this attack to target Active

Directory environments, with the goal of extracting

credentials of service accounts. Detection of Ker-

beroasting relies traditionally on custom signature-

based rules that may not be effective in diverse en-

vironments.

We utilized anomaly detection algorithms to iden-

tify unusual patterns in authentication to remote ser-

vices that may indicate the possibility of Kerberoast-

ing attack execution. The algorithms were compared

and evaluated on log data originating from a real orga-

nization. The solutions were implemented in widely-

used Splunk technology, making them applicable in

practice easily.

ICISSP 2021 - 7th International Conference on Information Systems Security and Privacy

382

We conducted extensive experiments with One-

Class SVM and Local Outlier Factor algorithms, us-

ing different conﬁgurations of their hyperparameters

and features extracted from Windows Events. As the

results have shown, ML approach signiﬁcantly re-

duced the number of false-positive detections com-

pared to the signature-based approach. At the same

time, we did not observe an increase in the number of

false negatives.

Furthermore, at the same percentage level of false

positives, the One-Class SVM algorithm helped im-

prove detection capabilities, as it detected one attack

that was missed by the static rule. LOF algorithm has

not proven to be equally effective but offers the ad-

vantage of no need for training data.

ACKNOWLEDGEMENTS

This work was supported by the Student Summer

Research Program 2020 of FIT CTU in Prague and

the grant no. SGS20/212/OHK3/3T/18. The au-

thors acknowledge the support of the OP VVV MEYS

funded project CZ.02.1.01/0.0/0.0/16 019/0000765

”Research Center for Informatics”.

REFERENCES

Breunig, M., Kriegel, H.-P., Ng, R., and Sander, J. (2000).

LOF: Identifying density-based local outliers. In ACM

Sigmod Record, volume 29, pages 93–104.

Desmond, B., Richards, J., Allen, R., and Lowe-Norris,

A. G. (2013). Active Directory: Designing, Deploy-

ing, and Running Active Directory, chapter 1-2, 9-10.

O’Reilly Media, 5 edition.

Goldstein, M., Asanger, S., Reif, M., and Hutchison,

A. (2013). Enhancing Security Event Management

Systems with Unsupervised Anomaly Detection. In

ICPRAM, pages 530–538.

Hsieh, C., Lai, C., Mao, C., Kao, T., and Lee, K. (2015).

AD2: Anomaly detection on active directory log

data for insider threat monitoring. In 2015 Interna-

tional Carnahan Conference on Security Technology

(ICCST), pages 287–292.

Kotlaba, L., Buchoveck

a, S., and L

orencz, R. (2020). Ac-

tive Directory Kerberoasting Attack: Monitoring and

Detection Techniques. In Proceedings of the 6th Inter-

national Conference on Information Systems Security

and Privacy, ICISSP 2020, Valletta, Malta, February

25-27, 2020, pages 432–439. SCITEPRESS.

Matsuda, W., Fujimoto, M., and Mitsunaga, T. (2018). De-

tecting APT attacks against Active Directory using

Machine Leaning. In 2018 IEEE Conference on Ap-

plication, Information and Network Security (AINS),

page 60–65.

Medin, T. (2014). Attacking Microsoft Kerberos Kicking

the Guard Dog of Hades. In DerbyCon 4.0, Louisville,

USA.

Meijerink, M. (2019). Anomaly-based Detection of Lat-

eral Movement in a Microsoft Windows Environ-

ment. Master’s thesis, Faculty of Electrical Engineer-

ing, Mathematics & Computer Science, University of

Twente, 7500 AE Enschede, The Netherlands.

Metcalf, S. (2017a). Active Directory Security. https:

//adsecurity.org/. [Online; accessed on 14-September-

2020].

Metcalf, S. (2017b). Active Directory Security: Detect-

ing Kerberoasting Activity. https://adsecurity.org/?p=

3458. [Online; accessed on 14-September-2020].

Microsoft Corporation (2017). Microsoft Docs: Se-

curity auditing: 4769(S, F): A Kerberos service

ticket was requested. https://docs.microsoft.com/

en-us/windows/security/threat-protection/auditing/

event-4769. [Online; accessed on 14-September-

2020].

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,

Thirion, B., Grisel, O., Blondel, M., Prettenhofer,

P., Weiss, R., Dubourg, V., Vanderplas, J., Passos,

A., Cournapeau, D., Brucher, M., Perrot, M., and

Duchesnay, E. (2011). Scikit-learn: Machine learning

in Python. Journal of Machine Learning Research,

12:2825–2830.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,

Thirion, B., Grisel, O., Blondel, M., Prettenhofer,

P., Weiss, R., Dubourg, V., Vanderplas, J., Pas-

sos, A., Cournapeau, D., Brucher, M., Perrot, M.,

and Duchesnay, E. (2020). Scikit-learn: Novelty

and outlier detection. https://scikit-learn.org/stable/

modules/outlier detection.html. [Online; accessed on

14-September-2020].

Sch

olkopf, B., Williamson, R., Smola, A., Shawe-Taylor, J.,

and Platt, J. (1999). Support vector method for novelty

detection. In NIPS, volume 12, pages 582–588.

Soria-Machado, M., Abolins, D., Boldea, C., and

Socha, K. (2016). Kerberos Golden Ticket

Protection. Technical report, CERT-EU Secu-

rity Whitepaper 2014–007. Also available at

http://cert.europa.eu/static/WhitePapers/UPDATED%

20-%20CERT-EU

Security Whitepaper 2014-007

Kerberos Golden Ticket Protection v1 4.pdf [On-

line; accessed on 14-September-2020].

Splunk Inc. (2020). Splunk Documentation. https://docs.

splunk.com/Documentation. [Online; accessed on 14-

September-2020].

The Mitre Corporation (2020). Steal or Forge Ker-

beros Tickets: Kerberoasting. https://attack.mitre.

org/techniques/T1558/003/. [Online; accessed on 14-

September-2020].

Uppstr

omer, V. and R

aberg, H. (2019). Detecting Lateral

Movement in Microsoft Active Directory Log Files.

Master’s thesis, Faculty of Computing, Blekinge In-

stitute of Technology, 371 79 Karlskrona, Sweden.

Active Directory Kerberoasting Attack: Detection using Machine Learning Techniques

383