Investigating the Suitability of Concept Drift Detection for Detecting Leakages in Water Distribution Networks

Valerie Vaquet*, Fabian Hinder* and Barbara Hammer

Machine Learning Group, Bielefeld University, Germany

Keywords:

Concept Drift Detection, Water Distribution Networks, Anomaly Detection, Leakage Detection.

Abstract:

Leakages are a major risk in water distribution networks as they cause water loss and increase contamination

risks. Leakage detection is a difﬁcult task due to the complex dynamics of water distribution networks. In par-

ticular, small leakages are hard to detect. From a machine-learning perspective, leakages can be modeled as

concept drift. Thus, a wide variety of drift detection schemes seems to be a suitable choice for detecting leak-

ages. In this work, we explore the potential of model-loss-based and distribution-based drift detection methods

to tackle leakage detection. We additionally discuss the issue of temporal dependencies in the data and propose

a way to cope with it when applying distribution-based detection. We evaluate different methods systemati-

cally for leakages of different sizes and detection times. Additionally, we propose a ﬁrst drift-detection-based

technique for localizing leakages.

1 INTRODUCTION

Clean and safe drinking water is a scarce resource in many areas. Almost 80% of the world’s population is classified as having high levels of threat in water security (Vörösmarty et al., 2010). This situation will worsen in the future, as climate change further restricts the already limited water resources (Rodell et al., 2018). Currently, across Europe, considerable amounts of drinking water are lost due to leakages in the system.

To ensure a reliable drinking water supply, there

is a need for robust, safe, and efﬁcient water distribu-

tion networks (WDNs). In addition to avoiding water

losses, a crucial requirement is to ensure the quality

of the drinking water. As leakages enable unwanted

substances to enter the water system, monitoring the

system for leakages is an efﬁcient tool to avoid wa-

ter loss and contamination (Eliades and Polycarpou,

2010; Lambert, 1994).

Due to complex network dynamics and changing demand patterns, detecting leakages is a challenging task. This is aggravated by the fact that the available data is very limited. Usually, the precise network topology remains unknown or the documentation contains errors. As smart meter technologies are not widely deployed, there is no real-time demand information (Cardell-Oliver and Carter-Turner, 2021). In realistic settings, this leaves a sparse set of pressure and possibly flow measurements.

∗ Authors contributed equally.
ORCID: https://orcid.org/0000-0001-7659-857X, https://orcid.org/0000-0002-1199-4085, https://orcid.org/0000-0002-0935-5591
EurEau data report: https://www.eureau.org/resources/publications/1460-eureau-data-report-2017-1/file

Commonly, existing leakage detection method-

ologies rely on replicating the system of interest by

hydraulic models and monitoring the discrepancies

between observations and modeled values. While

these approaches can provide reasonable detection

when considering larger leakages, the approaches

struggle when facing smaller leakages (Vrachimis

et al., 2022). Besides, limited (real-time) information

on the system hinders the use of these approaches in real-world applications, and the methodologies lack generalizability. Next to the hydraulic approaches, there are also a few machine learning (ML)-based approaches that implement a similar strategy.

In this work, we focus on the problem of leak-

age detection from the perspective of handling data

streams containing temporal dependencies. More pre-

cisely, we formalize leakages as concept drift and the

problem of leakage detection as drift detection. We

aim to investigate the suitability of drift detection for

reliable leakage detection, whereby we focus on leakages of all practically relevant sizes. Our approach is independent of the specific WDN and requires only real-time pressure measurements. Thus, it is more flexible and efficient than hydraulic simulation-based approaches.

Vaquet, V., Hinder, F. and Hammer, B. Investigating the Suitability of Concept Drift Detection for Detecting Leakages in Water Distribution Networks. DOI: 10.5220/0012361200003654. Paper published under CC license (CC BY-NC-ND 4.0). In Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2024), pages 296-303. ISBN: 978-989-758-684-2; ISSN: 2184-4313.

This paper is structured as follows. First, we in-

troduce WDNs and summarize the main speciﬁcs of

this domain (section 2). Afterward, we brieﬂy sum-

marize the body of related work on leakage detection

(section 3). In section 4, we deﬁne concept drift and

cover model-loss-based and distribution-based drift

detection. Before evaluating the suitability of these

methodologies for leakage detection in section 6, we

discuss the issue of temporal dependencies in the data

collected from WDNs and propose ways to account

for those (section 5). Finally, we conclude our paper

in section 7.

2 WATER DISTRIBUTION NETWORKS

WDNs can be modeled as graphs consisting of nodes

representing junctions and undirected edges repre-

senting pipes as the ﬂow direction of the water is not

pre-deﬁned and can change over time due to changing

demands in the network. As the systems are observed

over time, hydraulic quantities like pressure and ﬂow

describe the network graph at each time step. They

can be described by hydraulic formulas given that the

network is known in great detail. Next to the exact

pipe layout, different parameters like elevations, pipe

diameters, and pipe roughness are required. Assuming sensors are installed in $n$ nodes across the graph, for each time step $t$, measurements $x_t = [x_{t,1}, \dots, x_{t,n}]$ are collected, where $x_{t,i}$ is the value at node $i$.

Real-world measurements of the complete hy-

draulic state of WDNs are not available as it is not

possible to measure the entire system. Even a precise

topology alongside measurements in many positions

is usually not available. When developing and evaluating methodologies for monitoring WDNs, one therefore usually relies on simulated data. Given the key parameters of a system (layout, elevations, pipe information) alongside demand patterns, realistic network states containing anomalies like leakages can be generated with simulation tools like EPANET (Rossman, 2000).

As WDNs are part of the critical infrastructure and are required to work robustly and safely to ensure the health and well-being of the population, additional requirements are put on monitoring tools, especially those using AI technologies. Besides requirements concerning robustness, safety, and fairness as formulated in the European AI Act (European Commission, 2021), some technical attributes of WDNs pose additional challenges to ML approaches. When working with WDNs, only limited knowledge about the pipe system is available. Usually, the exact properties of the pipes, e.g. their diameters and roughness, and the different elevation levels are unknown. Note that these are available for a few benchmark networks, and thus benchmark scenarios can be generated. However, designing monitoring systems that rely on this kind of information strongly limits their applicability in practice.

Besides limited information about the system

setup, the system is also relatively opaque concern-

ing the real-time dynamics. Due to installation costs

and challenges regarding the power supply, the avail-

ability of pressure and ﬂow sensors in WDNs is very

limited, yielding readings at only a fraction of the nodes in

the system. Data availability is even more limited for

real-time demand measurements, as households are

very rarely equipped with smart meters for drinking

water due to costs and data privacy (Cardell-Oliver

and Carter-Turner, 2021).

Another property of WDNs is the presence of

cyclic patterns in demands, ﬂows, and pressures.

When working on ML approaches, one needs to ac-

count for the presence of temporal dependencies.

Daily, weekly, and seasonal patterns as well as long-

term developments, e.g. climate change or the

COVID pandemic, increase the difﬁculty of leakage

detection as especially smaller leakages might be lost

in the signals.

3 RELATED WORK

The body of related work on leakage detection can

be divided into methods relying on a hydraulic model

and very few ML-based approaches. Hydraulic

model-based methods generally aim to replicate the

real-world system with a hydraulic model (Hu et al.,

2018). Usually, the simulation results of the hydraulic

model are then compared to the observations. An

anomaly is reported if the residual of these meth-

ods is too large, which is determined either by a

threshold (Romero-Ben et al., 2022), a CUSUM ap-

proach (Steffelbauer et al., 2022), or visual inspec-

tion (Marzola et al., 2022). All these methods share

the downside that they require real-time demands and

more information on the network topology than is

usually available (Vrachimis et al., 2022). Besides,

they lack generalizability across WDNs as the hy-

draulic model is speciﬁcally designed for one net-

work and even needs adaptation if something changes

within this particular network. While these hydraulic-

based approaches yield good results considering large

leakages they usually miss smaller ones (Vrachimis

et al., 2022).


There are few ML-based approaches for leakage

detection (Daniel et al., 2022; Laucelli et al., 2016;

Romano et al., 2014). However, many are only eval-

uated on very small networks and lack realistic de-

mands as input for the simulation data. Most of these

approaches replace the hydraulic model with some

ML model following the general idea of residual-

based anomaly detection, for example by using a

threshold (Daniel et al., 2022; Laucelli et al., 2016).

4 DETECTING CONCEPT DRIFT

When deploying ML-based systems in real-world scenarios, one needs to account for all kinds of changes and

ensure that the models reliably work even if the ob-

served environment changes. Thus, considerable re-

search focuses on ML in the presence of changes in

the data-generating process, which are called concept

drift or drift for shorthand. To obtain a formal deﬁ-

nition of drift, we ﬁrst need to deﬁne a so-called drift

process (Hinder et al., 2020; Hinder et al., 2023c):

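Definition 1. Let $\mathcal{T} = [0,1]$ and $\mathcal{X} = \mathbb{R}^d$. A drift process $(P_T, \mathcal{D}_t)$ from the time domain $\mathcal{T}$ to the data space $\mathcal{X}$ is a probability measure $P_T$ on $\mathcal{T}$ together with a Markov kernel $\mathcal{D}_t$ from $\mathcal{T}$ to $\mathcal{X}$, i.e. for all $t \in \mathcal{T}$, $\mathcal{D}_t$ is a probability measure on $\mathcal{X}$ and for all measurable $A \subset \mathcal{X}$ the map $t \mapsto \mathcal{D}_t(A)$ is measurable. We will just write $\mathcal{D}_t$ instead of $(P_T, \mathcal{D}_t)$ if this does not lead to confusion.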

Based on this a deﬁnition of drift can be obtained:

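Definition 2. Let $(P_T, \mathcal{D}_t)$ be a drift process. We say that $\mathcal{D}_t$ has drift iff

$$\mathbb{P}_{T,S \sim P_T}[\mathcal{D}_T \neq \mathcal{D}_S] = P_T^2(\{(t,s) \in \mathcal{T}^2 \mid \mathcal{D}_t \neq \mathcal{D}_s\}) > 0.$$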
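As a sanity check of the drift definition, consider a hypothetical abrupt change, chosen purely for illustration and not taken from the experiments: the time distribution is uniform on $[0,1]$ and the data distribution switches once at an assumed change point $t_0 \in (0,1)$.

```latex
\mathcal{D}_t =
\begin{cases}
  \mathcal{N}(0, 1), & t < t_0, \\
  \mathcal{N}(1, 1), & t \geq t_0,
\end{cases}
\qquad\Longrightarrow\qquad
P_T^2\bigl(\{(t,s) \in \mathcal{T}^2 \mid \mathcal{D}_t \neq \mathcal{D}_s\}\bigr)
  = 2\, t_0 (1 - t_0) > 0 .
```

Two independently drawn time points see different distributions exactly when they fall on opposite sides of $t_0$, which happens with probability $2 t_0 (1 - t_0)$. If the data distribution were the same at all times, the set would be empty and its probability zero, i.e. there would be no drift. In the leakage setting, the leakage onset would play the role of $t_0$.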

In many monitoring settings, the goal is to detect the drift by using model-loss-based or distribution-based approaches. While the latter directly investigate the observed data, model-loss-based approaches first train a model and then analyze its loss as a proxy for change in the data distribution. The rationale is that a drift event changes the data such that the model can no longer approximate it well, causing an increase in the model loss. As argued by Hinder et al. (2023a; 2023b), the relation between model loss and drift is rather loose: if the model does not provide sufficient complexity to approximate the data distribution well, (i) the drift might stay undetected as it is smoothed out by the model or, conversely, (ii) the model might change because of irrelevant changes, e.g. a change in the ratio of classes. Thus, from a theoretical point of view, one should rely on distribution-based drift detection. However, model-loss-based approaches, like the residual-based strategy described in section 3, are also widely used in monitoring tasks. Therefore, we will investigate the suitability of both types of drift detection methods in this work.

4.1 Model-Loss-Based Drift Detection

When applying model-loss-based drift detection, there are two reasonable inference tasks a model can perform as a proxy for the drift detection. Either one performs a forecasting task, where the goal is to predict the measurement of the next time step $x_{t+1}$ based on the sensor measurements collected up to time $t$, or one performs an interpolation task, where the goal is to predict one sensor from the measurements of all other sensors, i.e. for each node position $i$, a model $f_i : \mathbb{R}^{n-1} \to \mathbb{R}$, $f_i(x_{t,\setminus i}) = \hat{x}_{t,i}$ is trained, where $x_{t,\setminus i}$ means we take all measurements but that of node $i$ at time $t$.

The latter strategy has been employed as a virtual sen-

sor imputation strategy in case of sensor faults. Even

very simple ML models could successfully perform

the interpolation task (Vaquet et al., 2022). As we

observed worse results for forecasting in preliminary

experiments, we only cover interpolation in this work.
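The interpolation task can be illustrated with a minimal sketch. It assumes a linear ridge model fitted in closed form with NumPy; the function names, the regularization strength `alpha`, and the averaging of squared errors over sensors are illustrative choices, not the configuration used in the experiments.

```python
import numpy as np

def fit_virtual_sensors(X, alpha=1.0):
    """Fit one linear ridge model per sensor.

    X has shape (T, n): T time steps, n pressure sensors.
    Model i predicts sensor i from the other n - 1 sensors
    at the same time step. Returns a list of (weights, bias).
    """
    n = X.shape[1]
    models = []
    for i in range(n):
        others = np.delete(X, i, axis=1)            # x_t without sensor i
        mu_x, mu_y = others.mean(axis=0), X[:, i].mean()
        A, b = others - mu_x, X[:, i] - mu_y
        # Closed-form ridge solution: (A^T A + alpha I) w = A^T b
        w = np.linalg.solve(A.T @ A + alpha * np.eye(n - 1), A.T @ b)
        models.append((w, mu_y - mu_x @ w))
    return models

def interpolation_errors(models, X):
    """Squared interpolation error per time step, averaged over sensors."""
    preds = np.column_stack([
        np.delete(X, i, axis=1) @ w + bias
        for i, (w, bias) in enumerate(models)
    ])
    return ((preds - X) ** 2).mean(axis=1)
```

Under these assumptions, the models would be fitted on leakage-free data and `interpolation_errors` would then be monitored on new data, with unusually large values hinting at a change.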

4.2 Distribution-Based Drift Detection

Most distribution-based approaches follow the strat-

egy of comparing two samples (Hinder et al., 2023c).

This can be done by statistical testing, e.g. by using the Kolmogorov-Smirnov (KS) test (Kolmogorov, 1933) feature-wise or the kernel two-sample test, which relies on the maximum mean discrepancy (MMD) and uses a kernel matrix as a descriptor (Gretton et al., 2006). Another option is using a virtual classifier discriminating between the two windows. If it performs better than guessing, the distributions of the windows differ, i.e. a drift occurred. We will consider the D3 detection scheme (Gözüaçık et al., 2019) in our experiments.
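The virtual-classifier idea can be sketched as follows. This is a simplified illustration in the spirit of D3, not the reference implementation: the logistic regression trained by plain gradient descent, the even/odd train/test split, and all parameter names are assumptions made for this example.

```python
import numpy as np

def virtual_classifier_auc(window_old, window_new, steps=500, lr=0.1):
    """Train a linear classifier to tell the two windows apart and
    return its ROC-AUC on held-out samples (0.5 = indistinguishable)."""
    X = np.vstack([window_old, window_new]).astype(float)
    y = np.r_[np.zeros(len(window_old)), np.ones(len(window_new))]
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)   # standardize features
    train = np.arange(len(X)) % 2 == 0                   # even rows train, odd rows test
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):                               # logistic regression, gradient descent
        p = 1.0 / (1.0 + np.exp(-(X[train] @ w + b)))
        grad = p - y[train]
        w -= lr * X[train].T @ grad / train.sum()
        b -= lr * grad.mean()
    scores = X[~train] @ w + b
    pos, neg = scores[y[~train] == 1], scores[y[~train] == 0]
    # AUC = probability that a "new" sample scores higher than an "old" one
    wins = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
    return wins.mean()
```

A drift would be reported when the returned value clearly exceeds 0.5, for example above a user-chosen threshold. Note that with strongly autocorrelated sensor data an interleaved split is optimistic, so a blocked split may be preferable in practice.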

We additionally consider a block-based detection scheme, DAWIDD, which searches directly for a dependency between data and time; this dependency was identified as an equivalent description of drift by Hinder et al. (2020). The task can be performed by a standard independence test; in this work, we make use of the HSIC test (Gretton et al., 2007), which is another kernel-based method.

As discussed in section 2, different kinds of daily, weekly, and seasonal patterns have to be expected. These patterns introduce certain temporal dependencies to the data. As already discussed, these patterns might increase the difficulty of detecting leakages. Considering this from a theoretical viewpoint, this problem can be summarized by the need to account for the temporal dependencies when performing drift detection. Thus, in the next section, we will analyze the temporal patterns in the data.

Figure 1: Sensor data for one year (no leakage).

5 TEMPORAL DEPENDENCIES IN THE DATA

We already raised the issue of temporal dependencies

in data collected from WDNs. In this section, we will

analyze the dataset which we will use in our experiments later on. For this purpose, we will first briefly introduce the dataset and then provide our analysis.

5.1 L-Town Benchmark Data

In this work, we will consider the L-Town network

since it is relatively complex in comparison to other

benchmarks and one year of realistic real-time de-

mands are available for this system allowing us to

simulate realistic data for our experimental evalua-

tion. The L-Town network resembles parts of the

old town of Limassol, Cyprus. In our experiments,

we consider area A consisting of 661 nodes and

764 edges with 29 optimally placed pressure sen-

sors (Vrachimis et al., 2022). We run simulations

with four different leakage diameters ranging from 7 mm
to 19 mm at all pipes using the ATMN package which

builds on EPANET. Each scenario contains data for

364 days with a measuring frequency of 15 minutes.

We always consider one leakage per scenario which

starts at some point of the scenario and stays present

until the scenario ends.

5.2 Analysis

Analyzing the data, as expected we observe daily,

weekly, and seasonal patterns. As visualized in ﬁg. 1,

the pressure follows a clear weekly pattern as can be

seen in the zoom-in subplot. To control for those dependencies, we consider two strategies: 1) subtracting the “standard week”, 2) subtracting the values of the previous week.
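Both strategies can be sketched in a few lines of NumPy. The sketch assumes that the standard week is the mean over all full weeks in the data, which is one plausible reading since the exact construction is not spelled out here, and it uses the 15-minute sampling of the benchmark described in section 5.1. Function names are illustrative.

```python
import numpy as np

SAMPLES_PER_WEEK = 7 * 24 * 4  # one week at 15-minute sampling

def subtract_standard_week(X, period=SAMPLES_PER_WEEK):
    """Remove the average weekly profile from X (shape (T, n)).

    Trailing samples that do not fill a complete week are dropped.
    """
    n_weeks = X.shape[0] // period
    weeks = X[: n_weeks * period].reshape(n_weeks, period, -1)
    standard_week = weeks.mean(axis=0)              # shape (period, n)
    return (weeks - standard_week).reshape(n_weeks * period, -1)

def subtract_previous_week(X, period=SAMPLES_PER_WEEK):
    """Subtract from each reading the reading taken exactly one week earlier."""
    return X[period:] - X[:-period]
```

The first variant needs enough history to estimate the weekly profile, whereas the second only needs the preceding week and turns a slow linear trend into a constant offset.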

Figure 2: Sensor residuals after subtracting the standard

week (no leakage). The orange line marks the mean trend

across all sensors.

Figure 3: Sensor residuals after subtracting the value of last

week (no leakage). The orange line marks the mean trend

across all sensors.

By subtracting the standard week from the origi-

nal signals we obtain the signals shown in ﬁg. 2. The

plot shows one example sensor reading (blue line) as

well as the minimal and maximal sensor reading at

each given point in time as the shaded area. As can be

seen, the feature runs across the entire range imply-

ing very strong ﬂuctuations. Furthermore, as can be

seen in the zoomed-in plot there is a change in ﬂuc-

tuation that follows a daily pattern. We also added

a trend line (orange) which follows a cosine shape.

This is a plausible ﬁnding as we expect a cyclic pat-

tern across several years that correlates with the sea-

sons. However, this pattern may render change detec-

tion schemes useless as it induces changes that are not

caused by leaks.

As an alternative, we considered subtracting the

value of the last week rather than a standard week.

The results are illustrated in ﬁg. 3. Due to the small

variance in the computation of the standard week, this

is already a good proxy for the standard week. How-

ever, it is better suited to cope with long-term changes

as can be seen from the trend line. Furthermore,

we again observe strong oscillations whose intensities

follow a daily pattern. We will ﬁnd that this strategy

is quite efﬁcient in section 6.1.

From both analyses, we expect that we can easily

cope with the periodic patterns if we only compare

the data on a by-week basis. This is because there is a


noticeable difference between the values of weekends

and weekdays so that day-wise is too short to resolve

this dependency. Furthermore, longer periods will be

strongly affected by the seasonal trends. We will fur-

ther discuss those ideas in the next section.

5.3 Coping with Temporal Dependencies

We observed substantial temporal patterns in the data

which we need to account for when utilizing drift de-

tection schemes for the task of detecting leakages. For

model-loss-based drift detection approaches we as-

sume that the models can generalize well. Thus, in

this setting, no additional actions need to be taken. In

contrast, when using distribution-based schemes, we

need to carefully incorporate our knowledge of the

different temporal cycles in the data to successfully

detect leakages.

In preliminary experiments, we used a pre-

processing technique, which subtracted a standard

week to eliminate cycles in the data. However, this

strategy assumes that we can model this standard

week successfully which requires some leakage-free

historical data. Since we aim to develop a method-

ology that requires as little information as possible

to generalize to new networks, we additionally ex-

perimented with choosing the window sizes such that

the detection schemes do not suffer from seasonali-

ties. Here, as discussed before, our idea is to elimi-

nate the cyclic patterns by choosing exactly one week

per window. Thereby daily and weekly patterns are

eliminated while the windows are still small enough

to not be affected by long-term dependencies. Since

this strategy resulted in better results while requiring

no additional information, we will use this option in

our experimental evaluation instead of performing a

preprocessing step subtracting the standard or the pre-

viously observed week.

6 EXPERIMENTS

For all our experiments, we use the benchmark data described in section 5.1.

6.1 Model-Loss-Based Drift Detection

To evaluate the model-loss-based detection schemes we rely on different regression models: kNN, polynomial ridge regression, random forests, and linear ridge regression. RBF-ridge and RBF-/Poly-/Linear-SVR were considered but discarded after initial experiments due to weak performance in the regression task. The experimental code and hyperparameters are available at https://github.com/FabianHinder/Drift-and-Water.

In our experiments, we ﬁrst analyze the models’ per-

formance on the interpolation task, and their gener-

alization capabilities to out-of-sample examples, e.g.

to scenarios containing leakages. In the second step,

we analyze how well the schemes are suited for de-

tecting leakages. To do so we check the underlying

assumption that the model would perform better on

the original training data (without leakage) compared

to the leaky data. For this to be a useful strategy, we need to be able to define a threshold $\theta$ such that $\mathrm{MSE}(x_t) > \theta$ indicates a leakage at time $t$, while $\mathrm{MSE}(x_t) \le \theta$ indicates none. Considering this as a classification problem with the classes “no leakage” and “leakage”, we

can apply the ROC-AUC score to evaluate the perfor-

mance of our models. To be more resilient to slowly

growing leakages we do not consider model updates.

Thus, we end up with the following procedure:

1. Select one fold. Extract two consecutive weeks from the baseline dataset.

2. Train the interpolation model on the data.

3. Compute the errors $E_0$ of the model on the remaining year, one per data point.

4. Compute the errors $E_1$ of the model on the entire year for each leakage location and size.

5. Compute the detection performance for this fold: ROC-AUC([0, ..., 0, 1, ..., 1], $E_0$ + $E_1$), where + denotes concatenation.

Recall that the ROC-AUC measures how well the ob-

tained scores separate the leaky and non-leaky setups.

The score is 1 if the largest error without leakage is

smaller than the smallest error with leakage, and it is 0.5

if the assignment is random. Thus, the ROC-AUC

provides a scale-invariant upper bound on the perfor-

mance of every concrete threshold. It is not affected

by class imbalance.
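The scoring step of the procedure can be sketched in pure Python. The sketch computes the ROC-AUC as the fraction of (leak-free, leaky) error pairs that are ordered correctly, which is a standard equivalent formulation; the function names and the quadratic-time pairwise loop are illustrative simplifications.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random leaky sample has a larger
    error than a random leak-free one (ties count one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def fold_score(errors_no_leak, errors_leak):
    """Label leak-free errors 0 and leaky errors 1, then score the fold."""
    labels = [0] * len(errors_no_leak) + [1] * len(errors_leak)
    return roc_auc(labels, list(errors_no_leak) + list(errors_leak))
```

For example, `fold_score([0.1, 0.2, 0.3], [0.4, 0.5])` gives 1.0 because every leaky error exceeds every leak-free error, so some threshold separates the two groups perfectly. Duplicating one of the two groups leaves the score unchanged, which reflects the insensitivity to class imbalance.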

The results of our experiment evaluating the gen-

eralization ability are summarized in ﬁg. 4. The ob-

served errors increase with increasing leakage diam-

eters and the models generalize to small leakages. In

this setting, we ﬁnd that the simple linear models per-

form much better than the kNN and the random forest.

This aligns with ﬁndings published in (Vaquet et al.,

2022).

The evaluation of the detection experiments is

summarized in ﬁg. 5. We observe reasonable ROC-

AUC scores for leakage sizes of about 15mm-19mm.

However, small leakages pose a difﬁcult problem for

model-loss-based drift detection schemes: the scores

are only marginally above random guessing.

In summary, we find that model-loss-based detection schemes are only suitable for detecting large leakages since the models generalize too well to out-of-distribution samples to reliably detect small leakages. This aligns with both results of other works and the theory we briefly discussed in section 4.

Figure 4: Mean squared error of the interpolation for different leakage sizes. Sorted (x-axis) ascending for each leakage size and mean error for clarity. The setting without leakage is reported as baseline. Line marks mean value, shaded area is minimal value to mean+standard deviation.

Figure 5: ROC-AUC for the model-loss-based drift detection for increasing leakage sizes. Interpolation task. Solid line marks mean value, shaded area mean±standard deviation, dashed line median.

6.2 Distribution-Based Drift Detection

As argued before, choosing a suitable window size is crucial for the successful usage of distribution-based drift detection schemes (see section 4.2). Based on our analysis of the system, we rely on two windows of one week each to eliminate the temporal patterns from the data. In addition to the choice of window size, the position of the split point, i.e. the border between the two considered windows, affects the performance of the detection scheme. The larger the displacement from the actual leakage onset time, the harder the detection will be. However, as leakages should be detected as soon as possible, detection should ideally succeed even if the displacement is still large, e.g. if the split point still lies before the leakage onset, so that only part of the second window contains leaky data.
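The window construction and the effect of the displacement can be made concrete with a small sketch. It assumes the 15-minute sampling of the benchmark and index-based time stamps; the helper names are hypothetical.

```python
SAMPLES_PER_DAY = 24 * 4                 # 15-minute sampling
SAMPLES_PER_WEEK = 7 * SAMPLES_PER_DAY

def weekly_windows(stream, split, width=SAMPLES_PER_WEEK):
    """Return the `width` samples before and the `width` samples after `split`."""
    if split - width < 0 or split + width > len(stream):
        raise ValueError("not enough data for two full windows")
    return stream[split - width:split], stream[split:split + width]

def leaky_fraction(split, onset, width=SAMPLES_PER_WEEK):
    """Fraction of the second window recorded at or after the leakage onset."""
    leaky = (split + width) - max(onset, split)
    return max(0, min(width, leaky)) / width
```

With a split point four days before the onset, `leaky_fraction` yields 3/7: only the last three days of the second window are affected by the leakage, which is why detection becomes harder as the displacement grows.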

Fig. 6 summarizes the results of the distribution-

based detection schemes with the different models.

Assuming the split point lies exactly on the leakage onset, we observe much better detection results than for

the model-loss-based detection strategies. We report

smaller scores for DAWIDD which is to be expected

in this case as the model is not beneﬁting from our

window size choice due to its block-based nature and

is thus affected by the temporal dependencies in the

oscillations. While we obtain smaller scores for the

statistical test-based methods for small leakages, the

virtual classiﬁer-based methods reliably detect leak-

ages of a diameter of 7mm.

As assumed, concerning the position of the split

point, we observe a decline in the score with increas-

ing displacement. However, even for a displacement

of 4 days, we obtain better scores than for the best

model-loss-based detection schemes. For a displacement

of 6 days, we still get reasonable scores for large leak-

ages when using the D3 detection scheme with a lin-

ear model. Thus, detecting large leakages can be re-

alized very fast, while for smaller leakages it takes a

little more time to obtain a reliable warning.

These ﬁndings can be conﬁrmed when analyzing

ﬁg. 7. One can see that apart from the smallest leak-

age size, with a displacement of 4 days, i.e. 3 days

after the leakage occurred, most detection schemes

yield a reasonable score. Using the linear version of

D3 even the smallest considered leakages can be de-

tected with a delay of 5 days (displacement of 2 days).

In conclusion, we found that distribution-based detection schemes outperform model-loss-based detection across all leakage sizes if the window size is chosen suitably. Large leakages can be detected with a reasonably small delay. In practical applications, an early-warning mechanism could be built on top of these schemes so that operators can react quickly.

Figure 6: Performance of unsupervised drift detectors for different displacements (discrepancy between the split point, i.e. the assumed time point of drift, and the actual drift). Solid line marks mean value, shaded area mean±standard deviation, dashed line median.

Figure 7: Performance of unsupervised drift detectors for different leakage sizes. Solid line marks mean value, shaded area mean±standard deviation, dashed line median.

6.3 Leakage Localization

Besides the task of leakage detection, there is also the more specific task of leakage localization, i.e. determining the pipe where the leakage occurs (Vrachimis et al., 2022). This task is usually considered much harder and is commonly approached by formulating an inverse problem, i.e. evaluating the plausibility of different locations (Li et al., 2022; Daniel et al., 2022; Wang et al., 2022; Marzola et al., 2022). This again requires a lot of data that is usually not available to us. As our drift detection approach performed quite well on the detection task, we consider the possibility of extending the methodology to leakage localization. The idea is simple: the closer a sensor is to the leakage’s actual position, the stronger the influence and thus the drift, leading to a particularly small p-value for that feature. As only the Kolmogorov-Smirnov test operates feature-wise, we consider this scheme using the same setup as before. We return the sensor node that has the smallest p-value, i.e. the one the test considers to be drifting most strongly.
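The selection rule can be sketched with a self-contained two-sample KS test. The p-value below uses the asymptotic Kolmogorov distribution, which is only a rough guide for small or autocorrelated samples; the function names and the row-wise list layout of the windows are assumptions made for this example.

```python
import math
from bisect import bisect_right

def ks_two_sample(a, b):
    """Two-sample KS test for one feature: (statistic, approximate p-value)."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # Largest gap between the two empirical CDFs over all observed values
    d = max(abs(bisect_right(a, v) / n - bisect_right(b, v) / m) for v in a + b)
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:                       # the series below is numerically 1 here
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))

def select_sensor(window_before, window_after):
    """Return the index of the sensor with the smallest p-value and all p-values.

    Each window is a list of rows, one value per sensor in every row.
    """
    n_sensors = len(window_before[0])
    p_values = [
        ks_two_sample([row[j] for row in window_before],
                      [row[j] for row in window_after])[1]
        for j in range(n_sensors)
    ]
    return min(range(n_sensors), key=p_values.__getitem__), p_values
```

When several p-values underflow to the same tiny value, ranking sensors by the KS statistic instead would be a reasonable tie-breaker, since for equal window sizes a larger statistic corresponds to a smaller p-value.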

In the following, let $S$ be the set of all sensor nodes, $s^*$ the selected sensor node, and $v$ the node where the leakage actually occurred, even if it is not a sensor node. Furthermore, $d$ denotes the graph distance in the WDN, i.e. $d(a,b)$ is the length of the shortest path connecting $a$ and $b$. We make use of three metrics: the distance between the selected and the actual node (Dist.; $d(s^*, v)$), the number of sensor nodes closer to the actual node (#Cls.; $|\{s \in S \mid d(s,v) < d(s^*,v)\}|$), and the relative distance between the actual node and the selected and optimal nodes (rel.D.; $d(s^*,v) / \min_{s \in S} d(s,v)$), which is normalized in contrast to the simple distance and smooth in contrast to the closer-node metric.
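The three metrics can be computed with a short sketch. It assumes the network is given as an adjacency dictionary with edge weights; whether the graph distance counts hops or pipe lengths is not specified here, so unit weights (hop counts) are used in the example, and all names are illustrative.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra distances from `source`; graph maps node -> {neighbor: weight}."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue                      # stale queue entry
        for w, length in graph[u].items():
            if d + length < dist.get(w, float("inf")):
                dist[w] = d + length
                heapq.heappush(queue, (dist[w], w))
    return dist

def localization_metrics(graph, p_values, leak_node):
    """p_values maps each sensor node to its feature-wise p-value."""
    selected = min(p_values, key=p_values.get)       # s*: smallest p-value
    d = shortest_paths(graph, leak_node)             # d(., v)
    best = min(d[s] for s in p_values)               # distance of the optimal sensor
    return {
        "selected": selected,
        "Dist.": d[selected],
        "#Cls.": sum(d[s] < d[selected] for s in p_values),
        # undefined if a sensor sits exactly on the leaky node
        "rel.D.": d[selected] / best if best > 0 else float("nan"),
    }
```

On a path graph a-b-c-d-e with sensors at a, c, and e and a leakage at b, selecting c gives Dist. 1, #Cls. 0, and rel.D. 1, whereas selecting e gives Dist. 3, #Cls. 2, and rel.D. 3.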

The results are shown in table 1. They are quite promising. We observe that the precision is decreasing for smaller leakage sizes, which is to be expected considering the results from the last experiment.

Table 1: Results of leakage localization (mean µ ± standard deviation σ).

size (mm)   Dist.        rel.D.     #Cls.
7           10.1±13.1    2.6±4.9    3.3±7.0
11          5.5±4.7      1.3±1.4    0.6±2.0
15          5.1±3.9      1.2±1.2    0.5±1.5
19          5.0±3.6      1.2±1.1    0.4±1.4

7 CONCLUSION

In this work, we investigated the suitability of model-loss-based and distribution-based drift detection methods for leakage detection in WDNs. Combining distribution-based detection with knowledge of WDNs, we provide detection schemes that successfully detect leakages of all considered sizes with reasonable detection delays. Analyzing model-loss-based techniques that are widely implemented in the water domain, we confirmed theoretical results that raise the issue of the loose connection between model loss and drift.

We assume that our work is not limited to WDNs

but can also be realized for anomaly detection in other

critical infrastructure systems like gas or electrical

grids. In practical applications a further analysis of

the leakages is necessary – solely detecting leakages

is not sufﬁcient to take appropriate actions. We pro-

posed a ﬁrst localization strategy that is based di-

rectly on detection efforts. Considering these follow-

up tasks in more detail through the lens of concept


drift both practically and theoretically is an interest-

ing path for further research.

ACKNOWLEDGEMENTS

We gratefully acknowledge funding from the Eu-

ropean Research Council (ERC) under the ERC

Synergy Grant Water-Futures (Grant agreement No.

951424).

REFERENCES

European Commission (2021). Artiﬁcial Intelligence Act.

Cardell-Oliver, R. and Carter-Turner, H. (2021). Activity-

aware privacy protection for smart water meters. In

8th ACM BuildSys, BuildSys ’21, page 31–40. Asso-

ciation for Computing Machinery.

Daniel, I., Pesantez, J., Letzgus, S., Khaksar Fasaee, M. A.,

Alghamdi, F., Berglund, E., Mahinthakumar, G., and

Cominola, A. (2022). A Sequential Pressure-Based

Algorithm for Data-Driven Leakage Identiﬁcation and

Model-Based Localization in Water Distribution Net-

works. J. Water Resour. Plan. Manag., 148.

Eliades, D. G. and Polycarpou, M. M. (2010). A Fault Di-

agnosis and Security Framework for Water Systems.

IEEE Transactions on Control Systems Technology,

18(6):1254–1265.

Gözüaçık, Ö., Büyükçakır, A., Bonab, H., and Can, F. (2019). Unsupervised concept drift detection with a discriminative classifier. In Proceedings of the 28th ACM international conference on information and knowledge management, pages 2365–2368.

Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., and Smola, A. J. (2006). A kernel method for the two-sample-problem. In NIPS 2006, pages 513–520.

Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf, B., and Smola, A. J. (2007). A kernel statistical test of independence. In NIPS 2007, pages 585–592.

Hinder, F., Artelt, A., and Hammer, B. (2020). Towards non-parametric drift detection via dynamic adapting window independence drift detection (DAWIDD). In ICML 2020, volume 119, pages 4249–4259. PMLR.

Hinder, F., Vaquet, V., Brinkrolf, J., and Hammer, B. (2023a). On the Change of Decision Boundary and Loss in Learning with Concept Drift. In IDA XXI, volume 13876, pages 182–194. Springer Nature Switzerland, Cham.

Hinder, F., Vaquet, V., Brinkrolf, J., and Hammer, B. (2023b). On the Hardness and Necessity of Supervised Concept Drift Detection. In 12th ICPRAM, pages 164–175, Lisbon, Portugal. SCITEPRESS.

Hinder, F., Vaquet, V., and Hammer, B. (2023c). One or two things we know about concept drift – a survey on monitoring evolving environments.

Hu, C., Li, M., Zeng, D., and Guo, S. (2018). A survey on sensor placement for contamination detection in water distribution systems. Wireless Networks, 24(2):647–661.

Kolmogorov, A. (1933). Sulla determinazione empirica di una legge di distribuzione. Giorn Dell’inst Ital Degli Att, 4:89–91.

Lambert, A. (1994). Accounting for Losses: The Bursts and Background Concept. Water and Environment Journal, 8(2):205–214.

Laucelli, D., Romano, M., Savić, D., and Giustolisi, O. (2016). Detecting anomalies in water distribution networks using EPR modelling paradigm. Journal of Hydroinformatics, 18(3):409–427.

Li, Z., Wang, J., Yan, H., Li, S., Tao, T., and Xin, K. (2022). Fast Detection and Localization of Multiple Leaks in Water Distribution Network Jointly Driven by Simulation and Machine Learning. J. Water Resour. Plan. Manag., 148(9).

Marzola, I., Mazzoni, F., Alvisi, S., and Franchini, M. (2022). Leakage Detection and Localization in a Water Distribution Network through Comparison of Observed and Simulated Pressure Data. J. Water Resour. Plan. Manag., 148(1):04021096.

Rodell, M., Famiglietti, J. S., Wiese, D. N., Reager, J., Beaudoing, H. K., Landerer, F. W., and Lo, M.-H. (2018). Emerging trends in global freshwater availability. Nature, 557(7707):651–659.

Romano, M., Kapelan, Z., and Savić, D. A. (2014). Automated Detection of Pipe Bursts and Other Events in Water Distribution Systems. J. Water Resour. Plan. Manag., 140(4):457–467.

Romero-Ben, L., Alves, D., Blesa, J., Cembrano, G., Puig, V., and Duviella, E. (2022). Leak Localization in Water Distribution Networks Using Data-Driven and Model-Based Approaches. J. Water Resour. Plan. Manag., 148(5).

Rossman, L. A. (2000). EPANET 2: users manual. US Environmental Protection Agency. Office of Research and Development.

Steffelbauer, D. B., Deuerlein, J., Gilbert, D., Abraham, E., and Piller, O. (2022). Pressure-Leak Duality for Leak Detection and Localization in Water Distribution Systems. J. Water Resour. Plan. Manag., 148(3).

Vaquet, V., Artelt, A., Brinkrolf, J., and Hammer, B. (2022). Taking Care of Our Drinking Water: Dealing with Sensor Faults in Water Distribution Networks. In ICANN 2022, volume 13530, pages 682–693. Springer Nature Switzerland, Cham.

Vörösmarty, C. J., McIntyre, P. B., Gessner, M. O., Dudgeon, D., Prusevich, A., Green, P., Glidden, S., Bunn, S. E., Sullivan, C. A., Liermann, C. R., et al. (2010). Global threats to human water security and river biodiversity. Nature, 467(7315):555–561.

Vrachimis, S. G., Eliades, D. G., Taormina, R., Kapelan, Z., Ostfeld, A., Liu, S., Kyriakou, M., Pavlou, P., Qiu, M., and Polycarpou, M. M. (2022). Battle of the Leakage Detection and Isolation Methods. Journal of Water Resources Planning and Management, 148(12).

Wang, X., Li, J., Liu, S., Yu, X., and Ma, Z. (2022). Multiple Leakage Detection and Isolation in District Metering Areas Using a Multistage Approach. J. Water Resour. Plan. Manag., 148(6).
