Differential Privacy for Distributed Traffic Monitoring in Smart Cities
Marcus Gelderie^a, Maximilian Luff and Lukas Brodschelm
Aalen University of Applied Sciences, Beethovenstr. 1, 73430 Aalen, Germany
{firstname.lastname}@hs-aalen.de
Keywords:
Differential Privacy, Smart City, Traffic Monitoring.
Abstract:
We study differential privacy in the context of gathering real-time congestion of entire routes in smart cities.
Gathering this data is a distributed task that poses unique algorithmic and privacy challenges. We introduce
a model of distributed traffic monitoring and define a notion of adjacency for this setting that allows us to
employ differential privacy under continual observation. We then introduce and analyze three algorithms that
ensure ε-differential privacy in this context. First, we introduce two algorithms that are built on top of existing
algorithmic foundations, and show how they are suboptimal in terms of noise or complexity. We focus, in
particular, on whether algorithms can be deployed in our distributed setting. Next, we introduce a novel
hybrid scheme that aims to bridge between the first two approaches, retaining an improved computational
complexity and a decent noise level. We simulate this algorithm and demonstrate its performance in terms of
noise.
1 INTRODUCTION
Smart cities are an ongoing trend in large urban areas,
where communities seek to leverage data analytics in
order to optimize aspects of their infrastructure. One
prominent example of this is smart traffic manage-
ment (Gade, 2019; Bhardwaj et al., 2022). The goal
is to minimize congestion and reduce overall point-
to-point travel time. Solutions in this context rely
on live data to predict movements and react accord-
ingly. This usually requires tracking vehicles moving
about the city to discern movement patterns that span
large parts of (or even the entire) city (e.g., see (Dja-
hel et al., 2015; Khanna et al., 2019; Rizwan et al.,
2016)).
From a privacy perspective, however, tracking in-
dividual citizens day and night, possibly storing this
data at a central location, is, of course, a nightmare.
Local legislation (e.g. the GDPR in the EU) may
even prohibit some of those solutions, threatening the
adoption of modern traffic management. Legal risks
aside, massive data collection at a centralized location
poses significant risks from an information security
perspective (Gracias et al., 2023). Ultimately,
cities need solutions that minimize the sensitivity of
the data that is stored and reduce the privacy risks to
affected individuals.
^a https://orcid.org/0009-0003-0291-3911
Differential Privacy (DP) (Dwork et al., 2006) is a
well-known tool to design algorithms that give quantifiable
privacy guarantees. Much research has gone
into developing DP algorithms for various statistical
tasks, such as counting, summing, top-k queries and
the like (Dwork et al., 2010; Dwork et al., 2015; Chan
et al., 2011; Henzinger et al., 2023) (see also Related
Work below). DP has also been applied to traffic and
vehicle data analysis in the past (Hassan et al., 2019;
Ma et al., 2019; Zhou et al., 2018; Li et al., 2018; Sun
et al., 2021). However, the monitoring of city-scale
point-to-point traffic movements centrally has, to our
knowledge, not been considered before, even though
it has been identified as a relevant research topic (Has-
san et al., 2019).
We consider the task of monitoring movements of
individual vehicles at locally distinct points through-
out a city and aggregating that data into a central
statistic that captures the number of vehicles traveling
along a set of routes within the city limits. We propose
three different algorithms that provide ε-DP in
this setting and compare their relative merits. Specifically,
we show that there appears to be a trade-off
between the noise incurred and the complexity of the
algorithms that run centrally or in a distributed way.
Our contributions are as follows: I) We propose
a generic architecture for distributed traffic monitoring
that is applicable in multiple scenarios. II) We
develop three DP algorithms for this architecture and
analyze their relative noise levels. III) We provide an
analysis of the algorithmic properties of our three algorithms,
where the input is the size of the city and
the length of the monitoring period. We pay particular
attention to whether algorithms support a distributed
deployment at the various locations across the city.
Gelderie, M., Luff, M. and Brodschelm, L.
Differential Privacy for Distributed Traffic Monitoring in Smart Cities.
DOI: 10.5220/0012372700003648
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 10th International Conference on Information Systems Security and Privacy (ICISSP 2024), pages 758-765
ISBN: 978-989-758-683-5; ISSN: 2184-4356
Proceedings Copyright © 2024 by SCITEPRESS – Science and Technology Publications, Lda.
Our three algorithms are designed to showcase the
engineering trade-offs that impact the noise level and
algorithmic properties. Much depends on the input
format and the notion of adjacency that is considered.
For example, counting, as a primitive, has been
studied extensively (Dwork et al., 2010) from a DP
perspective. The “binary tree technique” introduced
in (Dwork et al., 2010) and later studied in (Chan
et al., 2011; Henzinger et al., 2023) yields log(T)
noise, where T is the duration of the counting task.
But this technique cannot easily be ported to our set-
ting. Instead, the naive option of sampling noise per
time-step (cf. alg. 1) outperforms any attempt to port
the binary tree technique to our setting in terms of
noise. However, it cannot be meaningfully deployed
in a distributed way.
Next we introduce alg. 2, which has a noise bound
linear in the number of tracked routes. Note the number
of routes itself is exponential in the duration T of
tracking, i.e. $R \le V^{T+1}$, where $V$ is the number of
vertices. However, the resulting algorithm can be deployed
in a distributed way. Its runtime is also linear
in the number of routes, both if deployed centrally or
in a distributed way.
Finally, we propose a third probabilistic hybrid
scheme (cf. alg. 3) that bridges between these two
approaches. We analyze and simulate this third ap-
proach and find that it strikes a balance between both
the “naive” (alg. 1) and the “noise per route” approach
(alg. 2) in both runtime and noise. It is also easily de-
ployable in a distributed way.
Related Work. Differential Privacy (DP) was first
introduced in (Dwork et al., 2006). Subsequently,
several algorithms giving (ε,0)- and (ε,δ)-DP
queries were introduced (see also (Dwork et al.,
2014)). However, since many applications require the
continual release of statistics, the notion of continual
observation was introduced and has since been stud-
ied extensively (Dwork et al., 2010; Chan et al., 2011;
Henzinger et al., 2023). As a part of this, the notion of
adjacency of input sequences, specifically user-level
and event-level privacy, was introduced. User-level
privacy requires the continual mechanism to be differentially
private independent of how often an individual participates
in the accumulated statistical data. This continual
counting mechanism has since been improved
(Dwork et al., 2015).
Besides counting events, histogram queries are of
interest, and have been studied. For instance, (Hen-
zinger et al., 2023) considered an intermediate DP
histogram in order to continuously release differen-
tially private max-sum, top-k-, and histogram queries.
For queries like max-sum or sum-select, upper and
lower bounds on the accuracy have been established
(Jain et al., 2023). Furthermore, there has been re-
search into settings where the sensitivity between his-
togram queries is limited but the domain from which
the items for the histogram are sampled is unknown
(Cardoso and Rogers, 2022). There is also some prior
work to calculate dynamic sliding windows to mini-
mize the error bounds of such algorithms (Chen et al.,
2023).
However, none of these works fit our use case of
continuously releasing traffic statistics that track ve-
hicles through a city. In this setting, counters are
not monotonic and many of those algorithms cannot
be applied. A dynamically generated sliding win-
dow size is not feasible, since we have to reliably
get updated counts for different routes each phase.
Lastly, our adjacency notion differs significantly from
the previous definitions (cf. sec. 3), presenting a chal-
lenge to porting previous algorithms.
Although differential privacy is a well-researched
field, smart cities as a possible application pose many
distinct challenges (Yao et al., 2023; Husnoo et al.,
2021). Privacy is one of the main factors that should
be kept in mind when designing smart city systems
(Kumar et al., 2022). And (Qu et al., 2019) already
stated that smart mobility in particular is one of the
main factors for security concerns, since the attacker
is able to learn locations of any single individual.
There are related works studying DP in scenarios like
vehicle-2-X communications in electric-charging or
vehicle trajectory estimation, such as (Li et al., 2018;
Ma et al., 2019). Those works do not focus on vehi-
cle tracking for the purpose of optimizing city traffic.
(Sun et al., 2021) study traffic volume measurement
and present an estimator that is DP under certain con-
ditions. The focus is on estimating traffic volume for
a given set of locations. In this paper, we study traf-
fic statistics for specific routes (ordered sequences of
locations) within a city.
2 PRELIMINARIES
Notation. For a set $X$, write $X^*$ for all finite (possibly
empty) sequences of elements of $X$. For $s \in X^*$,
write $s = s_1 \cdots s_l$ where $s_i \in X$, $1 \le i \le l$. The length
of $s$ is $|s| = l$. A prefix is a sequence $s|_i = s_1 \cdots s_i$,
where $0 \le i \le |s|$. The concatenation of $s = s_1 \cdots s_l$ and
$s' = s'_1 \cdots s'_{l'}$ is written as $s \cdot s' = s_1 \cdots s_l s'_1 \cdots s'_{l'}$. We
write $[a,b]_{\mathbb{N}} \overset{\mathrm{def}}{=} [a,b] \cap \mathbb{N}$ and drop the subscript when
the need for natural numbers is clear.
Differential Privacy. Differential privacy (DP) was
introduced by (Dwork et al., 2006) for single
databases (see (Dwork et al., 2014) for a comprehen-
sive introduction) and later adapted to the continual
setting (Dwork et al., 2010). Here one considers “ad-
jacent” sequences of input. Several notions of adja-
cency exist (in particular event level and user level
adjacency). We postpone the precise definition of ad-
jacency to sec. 3, where we will discuss why existing
definitions of adjacency do not fit our use-case well
and propose a more tailored definition. The follow-
ing definition of DP under continual observation is
adapted^1 from (Dwork et al., 2010):
Definition 1. Let $\varepsilon > 0$. Let $A$ be a randomized algorithm,
called the curator, that on input sequence
$s = s_1, \ldots, s_l$ produces an output sequence $A(s) \in \Sigma^l$ of
the same length.
$A$ provides ε-differential privacy (ε-DP), if for all
adjacent input streams $s, s'$ and all $S \subseteq \Sigma^*$:
$$\Pr[A(s) \in S] \le \exp(\varepsilon) \cdot \Pr[A(s') \in S]$$
The following useful “post-processing” theorem
allows rounding outputs to the nearest integer without
losing DP. We thus consider only algorithms producing
real numbers in this paper.
Theorem 1 (see (Dwork et al., 2014)). If $A$ provides
ε-DP and $B$ is any randomized mapping defined on
range($A$), then $B \circ A$ provides ε-DP.
As usual, $A$ has $(\alpha, \beta)$ error, if for any $s$ of length
$L$, the maximal difference (in the $\infty$-norm) between
the true statistic $f(s)$ and the computed statistic $A(s)$
is below $\alpha$ with probability at least $1 - \beta$.
The Laplace distribution $\mathrm{Lap}(b)$ of scale $b > 0$
has PDF $p(x) = (2b)^{-1} \exp(-|x| \cdot b^{-1})$ and satisfies
$p(x) \le \exp\left(\frac{1}{b}\right) \cdot p(x \pm 1)$.
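As an illustration of the density and the ratio property stated above, the following Python sketch evaluates the Laplace PDF and draws samples using only the standard library. The function names and the sampling method (difference of two exponential variates) are choices made for this sketch and are not taken from the paper.

```python
import math
import random


def laplace_pdf(x: float, b: float) -> float:
    """Density of Lap(b): (2b)^-1 * exp(-|x| / b)."""
    return math.exp(-abs(x) / b) / (2.0 * b)


def sample_laplace(b: float, rng: random.Random) -> float:
    """Draw one Lap(b) sample as the difference of two Exp(1/b) variates."""
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)


def ratio_bound_holds(x: float, b: float) -> bool:
    """Check p(x) <= exp(1/b) * p(x +/- 1) at a single point x."""
    bound = math.exp(1.0 / b)
    tol = 1e-12  # slack for floating-point rounding in the equality case
    return all(
        laplace_pdf(x, b) <= bound * laplace_pdf(x + shift, b) * (1.0 + tol)
        for shift in (-1.0, 1.0)
    )
```

The ratio property is what makes shifting a count by one change the output density by at most a factor of $\exp(1/b)$, which is why the scale $b$ is chosen inversely proportional to ε in the algorithms of sec. 4.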
3 PROBLEM STATEMENT
We consider traffic monitoring in a smart city context.
Our goal is to compute the number of vehicles
traveling along a certain route in a given time period.
To this end, vehicles are recorded at specific track-
ing points in the city, such as at traffic lights. This
information is aggregated centrally into one statistic
about the number of vehicles traveling along a spe-
cific route. The overall situation is depicted in fig. 1.
^1 Different from the original definition, we do not consider
internal states and pan-privacy. We are only interested
in differentially private outputs.
Figure 1: System architecture.
Tracking vehicles requires two pieces of auxiliary
data. Once a vehicle is detected, it is assigned
a unique ID $u$ from a countable set $U$ of possible IDs
and a time-to-live (TTL) that is initialized to a constant
$T \in \mathbb{N}$. Vehicles are recognized via some form of
identifying information (e.g. license plate). For privacy
reasons, we hide this information behind a randomized
unique ID. The TTL is decremented at each
tracking point. Once it reaches zero, the vehicle is
no longer tracked. If the same vehicle continues to
move through the city, it will instead be assigned a
new unique ID $u' \neq u$ and a new TTL. In fact, the new
unique ID will satisfy the stronger property that it is
distinct from any previously used ID. We will elaborate
on why this is necessary further below.
We model the city as a directed graph $G = (V, E)$.
The vertices correspond to the tracking points.
A route is a path $r = r_1 \cdots r_m$ through the graph:
$(r_i, r_{i+1}) \in E$ for $1 \le i < m$ (repetitions are possible).
Note $m \le T$, because of the TTL. We write $\mathcal{R}$ for the
(finite) set of all routes and $\mathcal{R}_{\max} = \{r \in \mathcal{R} \mid |r| = T\}$.
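To make the route model concrete, the following Python sketch enumerates all routes of length at most $T$ in a small directed graph and compares their number with the bound $\sum_{i=1}^{T} |V|^i$. The adjacency-list representation, the function names and the example graph are assumptions of this sketch and do not appear in the paper.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]  # vertex -> successors, i.e. Adj(v)
Route = Tuple[str, ...]


def enumerate_routes(graph: Graph, ttl: int) -> List[Route]:
    """All paths r_1...r_m with 1 <= m <= ttl; vertices may repeat."""
    routes: List[Route] = []
    frontier: List[Route] = [(v,) for v in graph]  # routes of length 1
    for _ in range(ttl):
        routes.extend(frontier)
        # Extend every route by one hop along an outgoing edge.
        frontier = [r + (w,) for r in frontier for w in graph[r[-1]]]
    return routes


def route_bound(n_vertices: int, ttl: int) -> int:
    """Upper bound sum_{i=1}^{T} |V|^i on the number of routes."""
    return sum(n_vertices ** i for i in range(1, ttl + 1))


if __name__ == "__main__":
    city = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # hypothetical 3-vertex city
    all_routes = enumerate_routes(city, ttl=3)
    maximal = [r for r in all_routes if len(r) == 3]
    print(len(all_routes), len(maximal), route_bound(3, 3))
```

Even in this toy example the number of routes grows quickly with $T$, which is the reason the per-route algorithms below are expensive.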
For each observed vehicle, the tracking points
transmit its TTL to neighboring locations $\mathrm{Adj}(v) =
\{v' \in V \mid (v, v') \in E\}$, so that the same vehicle is always
assigned the same unique ID (within its TTL). In
this way, each tracking point reports a sequence of observed
IDs to the central statistics server (see fig. 1).
Using the resulting pairs of ID and location, the server
builds a statistic of vehicles per route.
Discussion. The masking of license plates by
unique IDs already provides some privacy, but this is
difficult to quantify. A vehicle on a very low-traffic
route may still be identifiable. DP provides more ro-
bust and quantifiable guarantees in this situation.
In practice, it is possible that vehicles take a long
time to travel from $v$ to $v'$, even for adjacent $(v, v') \in E$
(e.g. if the driver stops for coffee). The resulting sequence
for that ID will contain “gaps”: intermittent
time-steps where the vehicle is not recorded. In the
remainder of this paper we assume that every vehicle
is recorded at every time index until it stops moving
or its TTL elapses; i.e. there are no “gaps”. This
is no limitation, because such gaps will not change
how our algorithms work. But accounting for these
corner-cases in the mathematical notation below is te-
dious while adding little value. In a similar vein, we
assume that all tracking points report their data simul-
taneously. Again, this is not a realistic assumption.
But similar to our “no gaps” assumption, it simplifies
proofs while not affecting our algorithms.
4 DP IN TRAFFIC MONITORING
In this section we propose three different algorithms
that ensure DP traffic monitoring. The algorithms
work on input sequences of different types. However,
all such input sequences are (partial) functions
from unique IDs to some domain $D$ (either $\mathcal{R}$ or $V$).
Specifically, all algorithms studied in this paper process
sequences of the form $s = (s_1, \ldots, s_L)$, where for
each $1 \le i \le L$, $s_i = (s_{i,u})_{u \in U} \in D^U$ for some domain
$D$. Note we sometimes use vector notation $(s_u)_{u \in U}$
with $s_u \in D$ instead of functional notation $s(u) \in D$.
We stress that these vectors or functions can be partial.
In such cases, we write $s_u = \bot$ if $u \notin \mathrm{dom}(s)$ and
require that $\bot \notin D$.
Differential privacy in continual settings (i.e.
where sequences of events are processed) has been
studied before (e.g. (Dwork et al., 2010; Jain et al.,
2023; Henzinger et al., 2023; Cardoso and Rogers,
2022; Chan et al., 2011)). However, our case is subtly
different. To illustrate, we recall the following definition
of adjacency from (Dwork et al., 2010):
Definition 2 (Adjacency). Let $X$ be some set of
events, $L \in \mathbb{N}$ and $s = (s_1, \ldots, s_L)$, $s' = (s'_1, \ldots, s'_L) \in X^L$.
Then $s, s'$ are adjacent if there exists some subset
$I \subseteq \{1, \ldots, L\}$ and $x, x' \in X$, such that $s_i = s'_i$ for
all $i \notin I$ and $s'_i = x'$ for all $i \in I$ if $s_i = x$.
This definition does not fit our case well, because
it is restricted to two fixed symbols $x, x'$ that are being
exchanged. Simply removing a single tracking point
is clearly not enough to hide the presence of a given
individual. Instead, we would like to remove a specific
unique ID from the entire sequence (or alter its
route), which requires changing the value of several
functions $s \in D^U$ at one point $u \in U$. We therefore introduce
a slightly adapted notion of adjacency. Given
any function $f : A \to B$, $a \in A$ and $b \in B$, write $f[a/b]$
for the function $f[a/b](x) = f(x)$ for all $x \neq a$ and
$f[a/b](a) = b$. We now define:
Definition 3 (ID Adjacency). Let $X = D^U$, $L \in \mathbb{N}$
and $s = (s_1, \ldots, s_L)$, $s' = (s'_1, \ldots, s'_L) \in X^L$. Then
$s, s'$ are ID adjacent if there exists some subset $I \subseteq
\{1, \ldots, L\}$, $u \in U$ and $d \in D \cup \{\bot\}$, such that $s_i = s'_i$
for all $i \notin I$ and $s'_i = s_i[u/d]$ for all $i \in I$.
Note that $d = \bot$ is possible, effectively dropping
(some) occurrences of $u$ in the sequence.
In this paper we study only ID adjacency. There-
fore, whenever we use the word “adjacent” in what
follows, we mean ID adjacency.
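The substitution $f[a/b]$ and the ID adjacency relation can be sketched in Python as follows. Partial functions are represented as dictionaries, and `None` stands in for $\bot$; this encoding and the function names are assumptions of the sketch. The check follows Definition 3 literally, so a single ID $u$ and a single value $d$ must explain every differing time step.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, str]  # partial function U -> D; a missing key stands for "bottom"


def substitute(snapshot: Snapshot, uid: str, value: Optional[str]) -> Snapshot:
    """f[u/d]: copy of snapshot with uid mapped to value; None drops uid."""
    out = dict(snapshot)
    if value is None:
        out.pop(uid, None)
    else:
        out[uid] = value
    return out


def id_adjacent(s: List[Snapshot], s_prime: List[Snapshot]) -> bool:
    """True if one ID u and one value d explain all differences between s and s'."""
    if len(s) != len(s_prime):
        return False
    differing = [i for i, (a, b) in enumerate(zip(s, s_prime)) if a != b]
    if not differing:
        return True  # the index set I may be empty
    first = differing[0]
    keys = set(s[first]) | set(s_prime[first])
    candidates = [u for u in keys if s[first].get(u) != s_prime[first].get(u)]
    for uid in candidates:
        d = s_prime[first].get(uid)  # None encodes d = bottom
        if all(substitute(s[i], uid, d) == s_prime[i] for i in differing):
            return True
    return False
```

For example, dropping one ID from every time step yields an ID adjacent sequence, whereas changing the values of two different IDs does not.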
We can now understand why we assume that IDs
from $U$ are never reassigned. This assumption allows
us to define adjacency without considering corner
cases, such as whether the occurrences of an
ID in $s$ and $s'$ overlap, and without defining what it
means for repeated occurrences of the same ID to be
“causally connected”. In practice, any set of IDs can
be made infinite with no risk of reassigning by taking
the Cartesian product with the set of all timestamps.
Remark 1. In this paper we consider algorithms that
process input sequences of unbounded length. Because
of the bounded TTL, differences in two ID adjacent
sequences will always affect at most T distinct
time steps of the execution of the algorithm.
4.1 Route Counting
In this section we study an approach that processes
(sequences of) mappings of IDs to routes and sim-
ply counts the number of IDs per route. The actual
tracking of IDs along routes is done as a preprocess-
ing step, before the curator is fed the actual data: The
curator works as a post-processing step.
In the event that the input to the curator is
a (partial) mapping of unique IDs to routes, the
statistics function is particularly simple: Given
$v \in \mathcal{R}^U$ and $r \in \mathcal{R}$ write $m_r(v) = |\{u \in U \mid v_u = r\}|$
for the number of IDs that $v$ maps to
$r$. If we enumerate $\mathcal{R} = \{r_1, \ldots, r_d\}$, we can
write the statistics function $f : (\mathcal{R}^U)^* \to (\mathbb{N}^d)^*$
as $f(s_1, \ldots, s_L) = (\bar{m}(s_1), \ldots, \bar{m}(s_L)) \in (\mathbb{N}^d)^L$ where
$\bar{m}(s_i) = (m_{r_1}(s_i), \ldots, m_{r_d}(s_i)) \in \mathbb{N}^d$, $1 \le i \le L$. To
simplify notation, we write $f^s_{t,r} \overset{\mathrm{def}}{=} ((f(s))_t)_r$ for
$1 \le t \le L$ and $r \in \mathcal{R}$ in the remainder of this paper.
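The statistics function for pre-aggregated routes amounts to a per-route histogram at every time step. A minimal Python sketch, assuming that each time step is given as a dictionary from IDs to routes (tuples of vertex names) and that the enumeration $r_1, \ldots, r_d$ is passed as a list:

```python
from collections import Counter
from typing import Dict, List, Tuple

Route = Tuple[str, ...]


def route_counts(snapshot: Dict[str, Route], routes: List[Route]) -> List[int]:
    """The vector (m_{r_1}(v), ..., m_{r_d}(v)) for one time step v."""
    counts = Counter(snapshot.values())
    return [counts[r] for r in routes]


def statistic(sequence: List[Dict[str, Route]], routes: List[Route]) -> List[List[int]]:
    """f(s_1, ..., s_L): one count vector per time step."""
    return [route_counts(snapshot, routes) for snapshot in sequence]
```

IDs that are absent from a snapshot (the partial case) simply do not contribute to any count.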
The simplest solution to create DP in this setting
is to add noise per reported route. Our setting might
seem similar to event counting (Dwork et al., 2010),
and so one might expect that the $\log_2(T)$ noise bound
from the binary tree mechanism introduced in (Dwork
et al., 2010) carries over to our setting. This seems
not to be the case! Due to space constraints, we refer
the reader to the full version of this paper (Gelderie
et al., ) for a justification of this claim. As a result, our
first algorithm relies on the straightforward method
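of adding independent noise per time step. This is
shown in alg. 1.

Algorithm 1: DP on pre-aggregated routes.
input : vector $v = (r_u)_{u \in U}$
$c \leftarrow 0^{\mathcal{R}}$
for $r \in \mathcal{R}$
    $c_r \leftarrow m_r(v) + \mathrm{Lap}(2 \cdot \varepsilon^{-1} \cdot T)$
output: $c$

Algorithm 2: Tracking ID+location pairs with per-route noise.
global : $\mathsf{R}$ database holding routes per ID
global : $\mathsf{L}$ database IDs and associated noise
input : $f \in V^U$
$c \leftarrow 0^{\mathcal{R}}$
forall $(u_F, \mu, r) \in \mathsf{L}$
    decompose $r = v \cdot r'$ with $v \in V$
    if $|r'| = 0$ : DROP($\mathsf{L}$, $u_F$)
    else UPDATE($\mathsf{L}$, $u_F$, $\mu$, $r'$)
    $c_r \leftarrow c_r + \mu$
forall $v \in V$
    forall routes $r = v r' \in \mathcal{R}$ starting at $v$
        $\mu \leftarrow \mathrm{Lap}\left(\frac{2}{\varepsilon}\right)$
        INSERT($\mathsf{L}$, NEXTFREEID(), $\mu$, $r'$)
        $c_v \leftarrow c_v + \mu$
    forall $u \in f^{-1}(v)$
        $r \leftarrow$ LOOKUP($\mathsf{R}$, $u$) $\cdot\, v$
        $c_r \leftarrow c_r + 1$
        $r \leftarrow$ UPDATEORINSERT($\mathsf{R}$, $u$, $r$)
output: $c$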
Theorem 2. Alg. 1 gives ε-DP with $\left(\varepsilon^{-1} \cdot 2T \ln\left(\frac{LR}{\beta}\right), \beta\right)$
error on input sequences of length at
most $L$ tracking $R = |\mathcal{R}|$ routes.
We omit this and all further proofs due to space
constraints, and refer the reader to the full version of
this paper (Gelderie et al., ).
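For illustration, alg. 1 and the error bound of Theorem 2 can be sketched in Python as below. The data layout (a dictionary from IDs to routes) and all function names are assumptions of this sketch. The bound itself is the one stated in Theorem 2; it is consistent with the Laplace tail $\Pr[|X| > t] = \exp(-t/b)$ for $b = 2T/\varepsilon$ combined with a union bound over $L \cdot R$ noise values, although the paper's own proof is not reproduced here.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Tuple

Route = Tuple[str, ...]


def laplace(b: float, rng: random.Random) -> float:
    """One Lap(b) sample (difference of two exponential variates)."""
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)


def alg1_step(snapshot: Dict[str, Route], routes: List[Route],
              eps: float, ttl: int, rng: random.Random) -> Dict[Route, float]:
    """One time step of alg. 1: true count per route plus Lap(2 * T / eps)."""
    true_counts = Counter(snapshot.values())
    scale = 2.0 * ttl / eps
    return {r: true_counts[r] + laplace(scale, rng) for r in routes}


def theorem2_alpha(eps: float, ttl: int, length: int,
                   n_routes: int, beta: float) -> float:
    """alpha = eps^-1 * 2T * ln(L * R / beta) from Theorem 2."""
    return (2.0 * ttl / eps) * math.log(length * n_routes / beta)
```

With, say, ε = 1, T = 10, L = 100, R = 12 and β = 0.05, `theorem2_alpha` evaluates to `20 * ln(24000)`, which shows how the bound scales linearly in T and only logarithmically in L and R.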
4.2 Location Tracking
Now we study two algorithms that process sequences
of vectors of locations (elements of $V$). The inputs
are now sequences of partial functions $s = (s_u) \in V^U$.
To compute the same statistic, $f : (V^U)^* \to (\mathbb{N}^{\mathcal{R}})^*$
now tracks IDs across time-steps. Formally,
let $s$ be a sequence of length $L$, let $1 \le k \le L$ and
$r \in \mathcal{R}$. Let
$$c_r(s|_k) = |\{u \in U \mid \exists t \in [0, T-1]_{\mathbb{N}_0} : s_{k-t}(u) \cdots s_k(u) = r \wedge \forall t' < k - t : s_{t'}(u) = \bot\}|.$$
We require that $u$ does not occur in $s|_k$ prior to time $k - t$
(so, given $k$, the value for $t$ is maximal). The statistics
function is $f(s) = (c_r(s|_1))_{r \in \mathcal{R}}, \cdots, (c_r(s|_L))_{r \in \mathcal{R}}$.
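The tracking statistic $c_r(s|_k)$ can be sketched in Python as follows. Each time step is a dictionary from IDs to locations, and the route of an ID is recovered by walking backwards until its first occurrence. The sketch relies on the paper's simplifying assumptions (no gaps, IDs never reused, at most $T$ consecutive occurrences per ID); the function name and data layout are assumptions of the sketch.

```python
from collections import Counter
from typing import Dict, List, Tuple

Route = Tuple[str, ...]


def tracked_counts(sequence: List[Dict[str, str]], k: int, ttl: int) -> Counter:
    """c_r(s|_k) for every route r with a non-zero count at time k (1-indexed)."""
    counts: Counter = Counter()
    for uid, location in sequence[k - 1].items():
        route = [location]
        t = k - 2  # 0-indexed position of the previous time step
        # Walk back to the first occurrence of uid (no gaps assumed).
        while t >= 0 and uid in sequence[t] and len(route) < ttl:
            route.append(sequence[t][uid])
            t -= 1
        counts[tuple(reversed(route))] += 1
    return counts
```

Routes that no ID currently follows are simply absent from the returned counter, which corresponds to a count of zero.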
Alg. 2 computes $f$ and adds noise. To this end,
at each time $t$ and for every location $v \in V$, noise $\mu$ is
sampled from $\mathrm{Lap}(2 \cdot \varepsilon^{-1})$ for each route $r$ that begins
in $v$. This noise is assigned an ID and sent along that
route. The route length replaces the TTL.
Theorem 3. Alg. 2 provides ε-DP with (2 ·ε
1
·
8 ·
R ·ln(2) ln
β
2LR

,β) error, where L is the length
of the sequence and R = |R |.
Remark 2. Alg. 2 can be adapted to run at the tracking
points in a distributed fashion. Each tracking point
$v \in V$ then requires access to the set $\mathcal{R}_{\max}(v)$ of maximal
routes starting in $v$ (or needs to re-compute this
set at every time-step on input $G$). This spreads out
the runtime across all points $v \in V$, but it remains exponential
in $G$ at each tracking point.
4.3 A Hybrid Approach
In this subsection, we study a hybrid and probabilistic
approach that bridges between algs. 1 and 2. Where
alg. 2 samples noise per route at each time-step $t$ and
then sends that noise along its corresponding route,
we now use a random number $n \in \mathbb{N}_0$ of noise values
that we sample at each location $v \in V$ at each time-step
$t$. For each of those $n$ noise values, a neighbor
$v' \in \mathrm{Adj}(v)$ of $v$ is chosen uniformly at random and
the noise is passed along to $v'$. During step $t + 1$,
that noise value is again sent on to another neighbor
$v'' \in \mathrm{Adj}(v')$ and so on, until $T$ steps have elapsed. In
essence, the $n$ noise values behave like $n$ “ghost cars”
that perform a random walk on $G$. Since the noise
values travel along a random path, it is possible that a
given route is without noise at some time-step $t$. This
destroys DP, if left untreated. However, this event is
detectable and can be solved by falling back to alg. 1.
Alg. 3 implements this idea. Note that the magnitudes
of the parameter $b$ of the Laplace distribution
for the two kinds of noise differ substantially.
This algorithm, as defined, is based on the assumption
that no car takes a route shorter than $T$. It can be
adapted to work for the general case, by adding dedicated
ghost cars per possible route length $1, \ldots, T$.
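One time step of the hybrid idea can be sketched in Python as follows. This is a simplified, centralized illustration rather than the authors' implementation: the tuple layout for ghost cars, the function names, the handling of vertices without successors, and the choice to give new ghosts $T - 1$ remaining hops (so that ghost routes never exceed length $T$) are all assumptions of this sketch.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Route = Tuple[str, ...]
Ghost = Tuple[float, Route, int]  # (noise value, route so far, remaining hops)


def laplace(b: float, rng: random.Random) -> float:
    """One Lap(b) sample (difference of two exponential variates)."""
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)


def hybrid_step(graph: Dict[str, List[str]], ghosts: List[Ghost],
                real_routes: Dict[str, Route], observed: Dict[str, str],
                routes: List[Route], eps: float, ttl: int, p: float,
                rng: random.Random) -> Tuple[Dict[Route, float], List[Ghost]]:
    counts: Dict[Route, float] = defaultdict(float)
    covered = set()
    alive: List[Ghost] = []
    # 1. Existing ghost cars take one uniformly random hop.
    for mu, route, left in ghosts:
        successors = graph[route[-1]]
        if left == 0 or not successors:  # expired, or dead end (sketch assumption)
            continue
        route = route + (rng.choice(successors),)
        alive.append((mu, route, left - 1))
        counts[route] += mu
        covered.add(route)
    # 2. Spawn new ghost cars: keep spawning with probability p at every vertex.
    for v in graph:
        while rng.random() < p:
            mu = laplace(2.0 / eps, rng)
            alive.append((mu, (v,), ttl - 1))  # ttl - 1 hops: sketch assumption
            counts[(v,)] += mu
            covered.add((v,))
    # 3. Real vehicles extend their routes and are counted.
    for uid, v in observed.items():
        route = real_routes.get(uid, ()) + (v,)
        real_routes[uid] = route
        counts[route] += 1.0
    # 4. Routes without ghost noise fall back to the noise of alg. 1.
    for r in routes:
        if r not in covered:
            counts[r] += laplace(2.0 * ttl / eps, rng)
    return dict(counts), alive
```

The returned list of ghost cars is passed back in at the next time step. In a distributed deployment, steps 1 and 2 would run at the tracking points, with ghost cars forwarded to the chosen neighbor.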
Theorem 4. Alg. 3 provides ε-DP, provided the adja-
cent sequences differ in a route of length T .
Alg. 3 can be implemented in a distributed way, by
transmitting the ghost cars to neighboring tracking-
points and reporting them alongside the regular loca-
tion reports. The tracking-points would not provide
DP by themselves, because the transmitted IDs differ
between adjacent sequences.
The noise bound is difficult to state and prove, be-
cause it is an amalgamation of the noise bounds for a
sum of Laplacians and the noise bound for alg. 1. We
instead simulate the algorithm below.
Algorithm 2: Tracking ID+location pairs with per-route noise.
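Algorithm 3: DP for location tracking, hybrid approach.
global : $\mathsf{R}$ database holding routes per ID
global : $\mathsf{L}$ database IDs and associated noise
input : $f \in V^U$
$c \leftarrow 0^{\mathcal{R}}$
covered $\leftarrow (\bot, \ldots, \bot)$
forall $(u_F, \mu, r, l) \in \mathsf{L}$
    if $l = 0$ : DROP($\mathsf{L}$, $u_F$) and continue
    $v \leftarrow$ Uniform[Adj(last($r$))]
    $r' \leftarrow r \cdot v$
    UPDATE($\mathsf{L}$, $u_F$, $\mu$, $r'$, $l - 1$)
    $c_{r'} \leftarrow c_{r'} + \mu$ and covered$_{r'} \leftarrow \top$
forall $v \in V$
    while continue with probability $p$
        $\mu \leftarrow \mathrm{Lap}\left(\frac{2}{\varepsilon}\right)$
        INSERT($\mathsf{L}$, NEXTFREEID(), $\mu$, $v$, $T$)
        $c_v \leftarrow c_v + \mu$
        covered$_v \leftarrow \top$
    forall $u \in f^{-1}(v)$
        $r \leftarrow$ LOOKUP($\mathsf{R}$, $u$) $\cdot\, v$
        $r \leftarrow$ UPDATEORINSERT($\mathsf{R}$, $u$, $r$)
        $c_r \leftarrow c_r + 1$
forall $r \in \mathcal{R}$ with covered$_r \neq \top$
    $c_r \leftarrow c_r + \mathrm{Lap}\left(\frac{2T}{\varepsilon}\right)$
output: $c$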
5 COMPARISON
5.1 Simulation
It is clear that alg. 1 has better noise properties than
alg. 2. But when we consider alg. 3, the picture is
not so clear. This algorithm shares elements with
alg. 2, but spawns less noise on the first hops of a
route (depending on p). On the other hand, it period-
ically falls back to alg. 1. It is also not clear whether
the ghost cars spawned in alg. 3, which produce less
noise per car, will survive long enough to reduce the
overall noise. Note that if too many such cars are
spawned, their combined noise might dominate the
overall noise value and we converge to alg. 2. To in-
vestigate this, we implemented the hybrid approach
and alg. 1. We then compared these two approaches
by simulating each one for a total of m = 10000 times
using evenly spaced values for ε between 0.1 and 1.0.
We seeded our RNG with a pseudo-random seed cho-
sen as the SHA-256 hash of the string “ICISSP’24
Simulation” to produce reproducible yet unbiased re-
sults. The code for our evaluation can be found here.
We chose multiple values for p between 0.6 and 0.99.
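A reproducible setup of this kind can be sketched in Python with the standard library. The helper names, the plain ASCII apostrophe in the seed label, and the number of ε values (ten) are assumptions of this sketch; the paper only states that the seed is a SHA-256 hash of the quoted string and that the ε values are evenly spaced between 0.1 and 1.0.

```python
import hashlib
import random
from typing import List


def seeded_rng(label: str) -> random.Random:
    """RNG seeded with the SHA-256 digest of a fixed, publicly stated label."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest, "big"))


def epsilon_grid(lo: float = 0.1, hi: float = 1.0, steps: int = 10) -> List[float]:
    """Evenly spaced privacy parameters between lo and hi (inclusive)."""
    return [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]


if __name__ == "__main__":
    rng = seeded_rng("ICISSP'24 Simulation")  # exact byte encoding is an assumption
    print(epsilon_grid(), rng.random())
```

Deriving the seed from a fixed string makes the runs repeatable while avoiding a hand-picked numeric seed.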
Figure 2: Noise values (d = 3).
Choosing an appropriate T and studying the prob-
ability of a noise value staying alive along a route pose
a challenge. One can compute Pr[τ = i], if one as-
sumes a constant out-degree for every vertex in G.
Doing so, one finds that noise cars are rarely alive af-
ter time t = 8. Moreover, there is only a small differ-
ence between out-degree 2 and 3. We omit the details
and refer the reader to the full version of this paper
(Gelderie et al., ).
We then performed measurements using T = 10,
d = 3 and plotted the maximum and average absolute
noise for various values of p and ε (note that param-
eter p does not affect alg. 1). The results are shown
in fig. 2. A complete table of results can be found
in (Gelderie et al., ). We can see that the hybrid ap-
proach outperforms alg. 1 in terms of both maximum
and average noise. The benefit is present over the en-
tire range of ε values. Alg. 1 requires 1.33× the
average noise of the hybrid approach and 1.17×
the maximum noise. With increasing p, the hybrid
approach performs better. This is surprising: a large
p implies a large number of ghost cars carrying noise.
It seems that with the given parameters, p = 0.99 is
small enough for the benefits to outweigh the costs.
5.2 Computational Complexity
We close this section by comparing the algorithmic
properties of the three algorithms. We consider two
deployment scenarios: In the centralized scenario, the
algorithm runs completely in the context of the curator.
The tracking points behave as described in sec. 3. In
the distributed scenario, a part of the algorithm is
executed at the tracking points. These parts differ
between algorithms. We begin by defining how we
measure algorithmic complexity, which is traditionally
measured in terms of input size. In our case, this
might mean the input per time step or the overall design
parameters (e.g. the graph G, the duration T, or
both). We will focus on the design parameters, disregarding
the input per time-step.
Algorithm 3: DP for location tracking
Hybrid Approach.
The input per time-step is misleading: Consider
alg. 1 which has inputs of the form $v \in \mathcal{R}^U$. It runs
linearly in this vector. However, note the vector first
needs to be built from the reports by individual tracking
points (cf. sec. 3). On the other hand, algs. 2 and 3
compute the same vector internally from an input proportional
to the number $P$ of IDs currently moving
through the city. Ultimately, both algorithms
need to process all routes, which is a function of the
number $T$ and the graph $G$. Hence, we measure runtime
not in the input per time-step, but in the overall
design parameters $G = (V, E)$, $T$, and (in the case
of alg. 3) $p$. Usually we will use the proxy variable
$|\mathcal{R}| \le \sum_{i=1}^{T} |V|^i \le |V|^{T+1}$. We will treat sampling a
Laplacian as constant cost. Likewise, we will treat all
database-lookups as constant costs.
Finally, it is easy to see that both algs. 1 and 2
are asymptotically linear in the number $|\mathcal{R}|$ of routes.
From an asymptotic perspective, the algorithms perform
almost equally well. In our opinion, this unfairly
hides the fact that alg. 2 needs to iterate over $\mathcal{R}$
twice (or perform twice the work per loop iteration)
compared with alg. 1. We thus count loop iterations
in terms of the parameters laid out above, rather than
use asymptotic complexity.
Centralized. Alg. 1 runs linearly in $|\mathcal{R}|$. If we add
computing the statistic from per-vehicle reports to its
runtime, then it runs in time $|\mathcal{R}| + P$ (where again $P$ is
the number of participants in the system at the given
time-step). Alg. 1 requires no storage; with computing
the statistic from per-vehicle reports, some mapping
of ID to route-prefixes is needed and we require
storage on the order of $P$.
Alg. 2 runs in time $|\mathcal{R}_{\max}| + |\mathcal{R}| + P$, where the
first term is to spawn one ghost-car per route, the second
is to add the noise to the route-counts and to propagate
the ghost-cars, and the third is to compute the
actual noise-counts from per-vehicle reports. Storage
is required to store both the ghost cars in the system
and the ID-to-route mapping per participant. This
means storage on the order of $P + |\mathcal{R}|$ is needed.
Finally, alg. 3 runs in time $P + |\mathcal{R}| + \frac{T \cdot |V| \cdot p}{1-p}$.
Note that $\frac{p}{1-p}$ is the expected number of ghost-cars
spawned at each $v \in V$. They remain in the system for
$T$ rounds. These cars need to be propagated in every
step. Additionally, the $P$ reports by “real” cars need
to be processed. Finally, we need to check for each
route, if noise was added to it and fall back to alg. 1
otherwise. Space is required to store the route (and
possibly noise) for all real and ghost cars, meaning
storage on the order of $P + \frac{|V| \cdot T \cdot p}{1-p}$.
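The centralized loop-iteration counts stated above can be compared numerically with a small Python helper. The function name and the dictionary layout are assumptions of this sketch; the formulas are the ones given in the text, with $p/(1-p)$ ghost cars expected per vertex and time step (the mean of the number of successes before the first failure of a coin with success probability $p$).

```python
from typing import Dict


def centralized_costs(n_routes: int, n_max_routes: int, n_vertices: int,
                      participants: int, ttl: int, p: float) -> Dict[str, float]:
    """Per-time-step loop iterations of algs. 1-3 in the centralized setting."""
    expected_ghosts = ttl * n_vertices * p / (1.0 - p)  # T * |V| * p / (1 - p)
    return {
        "alg1": n_routes + participants,                  # |R| + P
        "alg2": n_max_routes + n_routes + participants,   # |R_max| + |R| + P
        "alg3": participants + n_routes + expected_ghosts,
    }


if __name__ == "__main__":
    # Hypothetical toy parameters: 3 vertices, T = 3, 12 routes, 5 maximal routes.
    print(centralized_costs(n_routes=12, n_max_routes=5, n_vertices=3,
                            participants=4, ttl=3, p=0.5))
```

Because $|\mathcal{R}_{\max}|$ grows exponentially in $T$ while the ghost-car term grows linearly in $T$ and $|V|$, the gap between alg. 2 and alg. 3 widens quickly for realistic city sizes, provided $p$ is not chosen too close to 1.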
Distributed. Only algs. 2 and 3 can be implemented
in a distributed fashion. Alg. 2 would offload creating
ghost noise per $r \in \mathcal{R}_{\max}$ to each $v \in V$. The tracking
points forward the ghost cars to each other, much like
real cars. The runtime per tracking point $v \in V$ then is
$P_v + |\mathcal{R}_{\max}(v)|$, where $P_v$ is the number of participants
at $v$ and $\mathcal{R}_{\max}(v)$ is the set of maximal length routes
beginning in $v$. Storage is required only to store the
set $\mathcal{R}_{\max}(v)$. The curator runs in time $P + |\mathcal{R}|$.
Alg. 3 similarly offloads the generation of ghost
cars to the tracking points. These now run in time
at most $\frac{T \cdot |V| \cdot p}{1-p} + P_v$ each. They might run significantly
faster on average, depending on the structure
of $G$: The $\frac{T \cdot |V| \cdot p}{1-p}$ ghost cars will distribute over $G$
in general, but not necessarily uniformly (vertices receive
ghost cars proportional to their in-degree in every
time-step). The tracking points need only store
a representation of $\mathrm{Adj}(v)$. The curator is as above.
While overall storage seems lower, the actual noise-to-ID
bindings now reside on network link buffers.
Discussion. We see that while alg. 2 is optimal in
terms of noise, it has the worst complexity in the centralized
setting and the distributed setting. Alg. 1 on
the other hand cannot be implemented in a distributed
fashion and incurs a large amount of noise. We can
see that alg. 3 strikes a balance between both algo-
rithms in terms of runtime and space both in the cen-
tralized and distributed settings provided p is chosen
appropriately. The previous subsection showed that it
can outperform alg. 1 in terms of noise to some extent.
6 CONCLUSION
We have introduced a model for decentralized traffic
monitoring in the smart city based on vehicle tracking.
We then introduced a notion of adjacency that fits
this model and permits us to study ε-DP in this context.
Building on that, we presented three algorithms
that each achieve ε-DP in our setting. Each algorithm
has unique advantages and disadvantages in terms of
noise, runtime, and its ability to be deployed in a
distributed fashion. Together, they showcase the various
engineering tradeoffs that a practitioner might encounter
when applying this model to a specific city.
Our first algorithm is simple to implement, yet incurs
high noise as it requires Laplace noise scaled to
T. Moreover, it cannot be run in a distributed fashion.
Remarkably, despite the superficially similar setting,
the well-known binary tree technique cannot be
ported to this setting. We next showed that a depen-
dence on T in terms of noise is altogether unneces-
sary. The resulting algorithm can run in a distributed
fashion, but incurs high runtime overhead. Our third
algorithm is a hybrid approach striking a balance be-
tween the first two algorithms. It is reasonably effi-
cient in a distributed setting and outperforms the first
algorithm in terms of noise when simulated.
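The effect of scaling Laplace noise to T can be made concrete with a small experiment. The sketch below assumes the standard Laplace mechanism with scale T/ε; the exact calibration used by the first algorithm is not restated here, so that scale is an assumption, and the helper names are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def mean_abs_noise(T: int, epsilon: float, samples: int = 20000, seed: int = 0) -> float:
    """Empirical mean absolute noise for scale T / epsilon (assumed calibration)."""
    rng = random.Random(seed)
    scale = T / epsilon
    return sum(abs(laplace_noise(scale, rng)) for _ in range(samples)) / samples


if __name__ == "__main__":
    for T in (1, 10, 100):
        print(T, mean_abs_noise(T, epsilon=1.0))
```

Since the expected absolute value of Laplace noise equals its scale, the typical error per count grows linearly in T under this calibration, which is the dependence the second algorithm avoids.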
We believe that our results show the merit of hybridizing
DP algorithms in a probabilistic way, specifically
when working in the presented context of traffic
monitoring. While the resulting algorithms are more
difficult to analyze, they perform reasonably well. We
believe that future work can improve this situation.
For example, hybrid schemes could adapt to the topology
of the graph, covering routes that have a higher
probability of “losing” a ghost with a higher number
of ghosts. Finally, adopting (ε, δ)-DP,
where δ > 0, may yield significant improvements
in terms of noise at the cost of a modest relaxation of
the privacy guarantees.
REFERENCES
Bhardwaj, V., Rasamsetti, Y., and Valsan, V. (2022). Traffic
control system for smart city using image processing.
AI and IoT for Smart City applications, pages 83–99.
Cardoso, A. R. and Rogers, R. (2022). Differentially pri-
vate histograms under continual observation: Stream-
ing selection into the unknown. In International Con-
ference on Artificial Intelligence and Statistics, pages
2397–2419. PMLR.
Chan, T.-H. H., Shi, E., and Song, D. (2011). Private and
continual release of statistics. ACM TISSEC, 14(3):1–
24.
Chen, Q., Ni, Z., Zhu, X., and Xia, P. (2023). Differen-
tial privacy histogram publishing method based on dy-
namic sliding window. Frontiers of Computer Science,
17(4):174809.
Djahel, S., Doolan, R., Muntean, G.-M., and Murphy, J.
(2015). A communications-oriented perspective on
traffic management systems for smart cities: Chal-
lenges and innovative approaches. IEEE Communi-
cations Surveys & Tutorials, 17(1):125–151.
Dwork, C., McSherry, F., Nissim, K., and Smith, A. (2006).
Calibrating noise to sensitivity in private data analysis.
In Theory of Cryptography, pages 265–284. Springer.
Dwork, C., Naor, M., Pitassi, T., and Rothblum, G. N.
(2010). Differential privacy under continual observation.
In Proceedings of the Forty-Second ACM Symposium
on Theory of Computing, STOC ’10, pages
715–724, New York, NY, USA. Association for Computing
Machinery.
Dwork, C., Naor, M., Reingold, O., and Rothblum, G. N.
(2015). Pure differential privacy for rectangle queries
via private partitions. In International Conference on
the Theory and Application of Cryptology and Infor-
mation Security, pages 735–751. Springer.
Dwork, C., Roth, A., et al. (2014). The algorithmic founda-
tions of differential privacy. Foundations and Trends®
in Theoretical Computer Science, 9(3–4):211–407.
Gade, D. (2019). ICT based smart traffic management system
“ismart” for smart cities. International Journal of
Recent Technology and Engineering, 8(3):1000–1006.
Gelderie, M., Luff, M., and Brodschelm, L. Differential privacy
for distributed traffic monitoring in smart cities
(full version).
Gracias, J. S., Parnell, G. S., Specking, E., Pohl, E. A., and
Buchanan, R. (2023). Smart cities—a structured liter-
ature review. Smart Cities, 6(4):1719–1743.
Hassan, M. U., Rehmani, M. H., and Chen, J. (2019). Dif-
ferential privacy techniques for cyber physical sys-
tems: a survey. IEEE Communications Surveys & Tu-
torials, 22(1):746–789.
Henzinger, M., Sricharan, A., and Steiner, T. A. (2023).
Differentially private data structures under continual
observation for histograms and related queries. arXiv
preprint arXiv:2302.11341.
Husnoo, M. A., Anwar, A., Chakrabortty, R. K., Doss, R.,
and Ryan, M. J. (2021). Differential privacy for IoT-enabled
critical infrastructure: A comprehensive survey.
IEEE Access, 9:153276–153304.
Jain, P., Raskhodnikova, S., Sivakumar, S., and Smith, A.
(2023). The price of differential privacy under contin-
ual observation. In International Conference on Ma-
chine Learning, pages 14654–14678. PMLR.
Khanna, A., Goyal, R., Verma, M., and Joshi, D. (2019).
Intelligent traffic management system for smart cities.
In Futuristic Trends in Network and Communication
Technologies, pages 152–164. Springer Singapore.
Kumar, A., Upadhyay, A., Mishra, N., Nath, S., Yadav,
K. R., and Sharma, G. (2022). Privacy and security
concerns in edge computing-based smart cities. In
Robotics and AI for Cybersecurity and Critical Infras-
tructure in Smart Cities, pages 89–110. Springer.
Li, Y., Zhang, P., and Wang, Y. (2018). The location privacy
protection of electric vehicles with differential privacy
in V2G networks. Energies, 11(10):2625.
Ma, Z., Zhang, T., Liu, X., Li, X., and Ren, K. (2019). Real-
time privacy-preserving data release over vehicle tra-
jectory. IEEE transactions on vehicular technology,
68(8):8091–8102.
Qu, Y., Nosouhi, M. R., Cui, L., and Yu, S. (2019). Privacy
preservation in smart cities. In Smart cities cyberse-
curity and privacy, pages 75–88. Elsevier.
Rizwan, P., Suresh, K., and Babu, M. R. (2016). Real-time
smart traffic management system for smart cities by
using internet of things and big data. In 2016 Interna-
tional Conference on Emerging Technological Trends.
Sun, Y.-E., Huang, H., Yang, W., Chen, S., and Du, Y.
(2021). Toward differential privacy for traffic mea-
surement in vehicular cyber-physical systems. IEEE
Transactions on Industrial Informatics, 18(6):4078–
4087.
Yao, A., Li, G., Li, X., Jiang, F., Xu, J., and Liu, X. (2023).
Differential privacy in edge computing-based smart
city applications: Security issues, solutions and future
directions. Array, page 100293.
Zhou, Z., Qiao, Y., Zhu, L., Guan, J., Liu, Y., and Xu,
C. (2018). Differential privacy-guaranteed trajec-
tory community identification over vehicle ad-hoc net-
works. Internet Technology Letters, 1(3):e9.