Differential Privacy for Distributed Trafﬁc Monitoring in Smart Cities

Marcus Gelderie, Maximilian Luff and Lukas Brodschelm

Aalen University of Applied Sciences, Beethovenstr. 1, 73430 Aalen, Germany

{ﬁrstname.lastname}@hs-aalen.de

Keywords:

Differential Privacy, Smart City, Traffic Monitoring.

Abstract:

We study differential privacy in the context of gathering real-time congestion of entire routes in smart cities.

Gathering this data is a distributed task that poses unique algorithmic and privacy challenges. We introduce

a model of distributed trafﬁc monitoring and deﬁne a notion of adjacency for this setting that allows us to

employ differential privacy under continual observation. We then introduce and analyze three algorithms that

ensure ε differential privacy in this context. First we introduce two algorithms that are built on top of existing

algorithmic foundations, and show how they are suboptimal in terms of noise or complexity. We focus, in

particular, on whether algorithms can be deployed in our distributed setting. Next, we introduce a novel

hybrid scheme that aims to bridge between the ﬁrst two approaches, retaining an improved computational

complexity and a decent noise level. We simulate this algorithm and demonstrate its performance in terms of

noise.

1 INTRODUCTION

Smart cities are an ongoing trend in large urban areas,

where communities seek to leverage data analytics in

order to optimize aspects of their infrastructure. One

prominent example of this is smart trafﬁc manage-

ment (Gade, 2019; Bhardwaj et al., 2022). The goal

is to minimize congestion and reduce overall point-

to-point travel time. Solutions in this context rely

on live data to predict movements and react accord-

ingly. This usually requires tracking vehicles moving

about the city to discern movement patterns that span

large parts of (or even the entire) city (e.g., see (Dja-

hel et al., 2015; Khanna et al., 2019; Rizwan et al.,

2016)).

From a privacy perspective, however, tracking in-

dividual citizens day and night, possibly storing this

data at a central location, is, of course, a nightmare.

Local legislation (e.g. the GDPR in the EU) may

even prohibit some of those solutions, threatening the

adoption of modern trafﬁc management. Legal risks

aside, massive data collection at a centralized location poses significant risks from an information security perspective (Gracias et al., 2023). Ultimately,

cities need solutions that minimize the sensitivity of

the data that is stored and reduce the privacy risks to

affected individuals.

https://orcid.org/0009-0003-0291-3911

Differential Privacy (DP) (Dwork et al., 2006) is a well-known tool to design algorithms that give quantifiable privacy guarantees. Much research has gone

into developing DP algorithms for various statistical

tasks, such as counting, summing, top-k queries and

the like (Dwork et al., 2010; Dwork et al., 2015; Chan

et al., 2011; Henzinger et al., 2023) (see also Related

Work below). DP has also been applied to trafﬁc and

vehicle data analysis in the past (Hassan et al., 2019;

Ma et al., 2019; Zhou et al., 2018; Li et al., 2018; Sun

et al., 2021). However, the monitoring of city-scale

point-to-point trafﬁc movements centrally has, to our

knowledge, not been considered before, even though

it has been identiﬁed as a relevant research topic (Has-

san et al., 2019).

We consider the task of monitoring movements of

individual vehicles at locally distinct points through-

out a city and aggregating that data into a central

statistic that captures the number of vehicles traveling

along a set of routes within the city limits. We pro-

pose three different algorithms that provide ε DP in

this setting and compare their relative merits. Specif-

ically, we show how there appears to be a trade-off

between the noise incurred and the complexity of the

algorithms that run centrally or in a distributed way.

Our contributions are as follows: I) We propose a generic architecture for distributed traffic monitoring that is applicable in multiple scenarios. II) We develop three DP algorithms for this architecture and analyze their relative noise levels. III) We provide an analysis of the algorithmic properties of our three algorithms, where the input is the size of the city and the length of the monitoring period. We pay particular attention to whether algorithms support a distributed deployment at the various locations across the city.

Gelderie, M., Luff, M. and Brodschelm, L.

Differential Privacy for Distributed Traffic Monitoring in Smart Cities.

DOI: 10.5220/0012372700003648

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 10th International Conference on Information Systems Security and Privacy (ICISSP 2024), pages 758-765

ISBN: 978-989-758-683-5; ISSN: 2184-4356

Our three algorithms are designed to showcase the

engineering trade-offs that impact the noise level and

algorithmic properties. Much depends on the input

format and the notion of adjacency that is considered.

For example, counting, as a primitive, has been

studied extensively (Dwork et al., 2010) from a DP

perspective. The “binary tree technique” introduced

in (Dwork et al., 2010) and later studied in (Chan

et al., 2011; Henzinger et al., 2023) yields log(T )

noise, where T is the duration of the counting task.

But this technique cannot easily be ported to our set-

ting. Instead, the naive option of sampling noise per

time-step (cf. alg. 1) outperforms any attempt to port

the binary tree technique to our setting in terms of

noise. However, it cannot be meaningfully deployed

in a distributed way.

Next we introduce alg. 2, which has a noise bound linear in the number of tracked routes. Note that the number of routes itself is exponential in the duration $T$ of tracking, i.e. $R \le V^{T+1}$, where $V$ is the number of vertices. However, the resulting algorithm can be deployed in a distributed way. Its runtime is also linear in the number of routes, both if deployed centrally or in a distributed way.

Finally, we propose a third probabilistic hybrid

scheme (cf. alg. 3) that bridges between these two

approaches. We analyze and simulate this third ap-

proach and ﬁnd that it strikes a balance between both

the “naive” (alg. 1) and the “noise per route” approach

(alg. 2) in both runtime and noise. It is also easily de-

ployable in a distributed way.

Related Work. Differential Privacy (DP) was first introduced in (Dwork et al., 2006). Subsequently, several algorithms giving (ε,0)- and (ε,δ)-

DP queries were introduced (see also (Dwork et al.,

2014)). However, since many applications require the

continual release of statistics, the notion of continual

observation was introduced and has since been stud-

ied extensively (Dwork et al., 2010; Chan et al., 2011;

Henzinger et al., 2023). As a part of this, the notion of

adjacency of input sequences, speciﬁcally user-level

and event-level privacy, was introduced. User-level

privacy requires the continual mechanism to be differentially private independent of how often an individual participates in the accumulated statistical data. This continual counting mechanism has since been improved

(Dwork et al., 2015).

Besides counting events, histogram queries are of

interest, and have been studied. For instance, (Hen-

zinger et al., 2023) considered an intermediate DP

histogram in order to continuously release differen-

tially private max-sum, top-k-, and histogram queries.

For queries like max-sum or sum-select upper and

lower bounds on the accuracy have been established

(Jain et al., 2023). Furthermore, there has been re-

search into settings where the sensitivity between his-

togram queries is limited but the domain from which

the items for the histogram are sampled is unknown

(Cardoso and Rogers, 2022). There is also some prior

work to calculate dynamic sliding windows to mini-

mize the error bounds of such algorithms (Chen et al.,

2023).

However, none of these works ﬁt our use case of

continuously releasing trafﬁc statistics that track ve-

hicles through a city. In this setting, counters are

not monotonic and many of those algorithms cannot

be applied. A dynamically generated sliding win-

dow size is not feasible, since we have to reliably

get updated counts for different routes each phase.

Lastly, our adjacency notion differs signiﬁcantly from

the previous deﬁnitions (cf. sec. 3), presenting a chal-

lenge to porting previous algorithms.

Although differential privacy is a well researched

ﬁeld, smart cities as a possible application pose many

distinct challenges (Yao et al., 2023; Husnoo et al.,

2021). Privacy is one of the main factors that should

be kept in mind when designing smart city systems

(Kumar et al., 2022). And (Qu et al., 2019) already

stated that smart mobility in particular is one of the

main factors for security concerns, since the attacker

is able to learn locations of any single individual.

There are related works studying DP in scenarios like

vehicle-2-X communications in electric-charging or

vehicle trajectory estimation, such as (Li et al., 2018;

Ma et al., 2019). Those works do not focus on vehi-

cle tracking for the purpose of optimizing city trafﬁc.

(Sun et al., 2021) study trafﬁc volume measurement

and present an estimator that is DP under certain con-

ditions. The focus is on estimating trafﬁc volume for

a given set of locations. In this paper, we study traf-

ﬁc statistics for speciﬁc routes (ordered sequences of

locations) within a city.

2 PRELIMINARIES

Notation. For a set $X$, write $X^*$ for all finite (possibly empty) sequences of elements of $X$. For $s \in X^*$ write $s = s_1 \cdots s_l$ where $s_i \in X$, $1 \le i \le l$. The length of $s$ is $|s| = l$. A prefix is a sequence $s|_i = s_1 \cdots s_i$ where $0 \le i \le |s|$. The concatenation of $s = s_1 \cdots s_l$ and $s' = s'_1 \cdots s'_{l'}$ is written as $s \| s' = s_1 \cdots s_l s'_1 \cdots s'_{l'}$. We write $[a,b]_{\mathbb{N}} \stackrel{\text{def}}{=} [a,b] \cap \mathbb{N}$ and drop the subscript when the need for natural numbers is clear.

Differential Privacy. Differential privacy (DP) was

introduced by (Dwork et al., 2006) for single

databases (see (Dwork et al., 2014) for a comprehen-

sive introduction) and later adapted to the continual

setting (Dwork et al., 2010). Here one considers “ad-

jacent” sequences of input. Several notions of adja-

cency exist (in particular event level and user level

adjacency). We postpone the precise deﬁnition of ad-

jacency to sec. 3, where we will discuss why existing

deﬁnitions of adjacency do not ﬁt our use-case well

and propose a more tailored definition. The following definition of DP under continual observation is adapted from (Dwork et al., 2010):

Definition 1. Let $\varepsilon > 0$. Let $A$ be a randomized algorithm, called the curator, that on input sequence $s = s_1, \dots, s_L$ produces an output sequence $A(s) \in \Sigma^*$ of the same length. $A$ provides ε-differential privacy (ε-DP), if for all adjacent input streams $s, s'$ and all $S \subseteq \Sigma^*$

$$\Pr[A(s) \in S] \le \exp(\varepsilon) \Pr[A(s') \in S]$$

The following useful “post-processing” theorem allows rounding outputs to the nearest integer without losing DP. We thus consider only algorithms producing real numbers in this paper.

Theorem 1 (see (Dwork et al., 2014)). If A provides

ε-DP and B is any randomized mapping deﬁned on

range(A), then B ◦A provides ε-DP.

As usual, A has (α,β) error, if for any s of length

L, the maximal difference (in the ∞-norm) between

the true statistic f (s) and the computed statistic A(s)

is below α with probability at least 1 −β.

The Laplace distribution $\mathrm{Lap}(b)$ of scale $b > 0$ has PDF $p(x) = (2b)^{-1} \exp(-|x| \cdot b^{-1})$ and satisfies $p(x) \le \exp\left(\frac{1}{b}\right) \cdot p(x \pm 1)$.
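The ratio property of the Laplace density can be checked numerically. The following is a minimal sketch, not part of the paper's evaluation code; the function names and the sample grid are assumptions chosen for illustration.

```python
import math

def laplace_pdf(x: float, b: float) -> float:
    """Density of Lap(b): p(x) = exp(-|x| / b) / (2b)."""
    return math.exp(-abs(x) / b) / (2.0 * b)

def ratio_bound_holds(b: float, xs) -> bool:
    """Check p(x) <= exp(1/b) * p(x +/- 1) on the sample points xs."""
    factor = math.exp(1.0 / b)
    tol = 1e-12  # relative slack for floating-point rounding
    return all(
        laplace_pdf(x, b) <= factor * laplace_pdf(x + shift, b) * (1.0 + tol)
        for x in xs
        for shift in (-1.0, 1.0)
    )

# Hypothetical sample grid on [-5, 5] in steps of 0.1.
grid = [i / 10.0 for i in range(-50, 51)]
print(ratio_bound_holds(b=2.0, xs=grid))
```

The bound is tight whenever $x$ and $x \pm 1$ lie on the same side of the origin, which is why a small relative tolerance is used in the comparison.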

3 PROBLEM STATEMENT

We consider traffic monitoring in a smart city context. Our goal is to compute the number of vehicles travelling along a certain route in a given time-period.

To this end, vehicles are recorded at speciﬁc track-

ing points in the city, such as at trafﬁc lights. This

information is aggregated centrally into one statistic

about the number of vehicles traveling along a spe-

ciﬁc route. The overall situation is depicted in ﬁg. 1.

Footnote to Definition 1: Different from the original definition, we do not consider internal states and pan-privacy. We are only interested in differentially private outputs.

Figure 1: System architecture.

Tracking vehicles requires two pieces of auxiliary data. Once a vehicle is detected, it is assigned a unique ID $u$ from a countable set $U$ of possible IDs and a time-to-live (TTL) that is initialized to a constant $T \in \mathbb{N}$. Vehicles are recognized via some form of identifying information (e.g. license plate). For privacy reasons, we hide this information behind a randomized unique ID. The TTL is decremented at each tracking point. Once it reaches zero, the vehicle is no longer tracked. If the same vehicle continues to move through the city, it will instead be assigned a new unique ID $u' \ne u$ and a new TTL. In fact, the new unique ID will satisfy the stronger property that it is distinct from any previously used ID. We will elaborate on why this is necessary further below.

We model the city as a directed graph $G = (V,E)$. The vertices correspond to the tracking points. A route is a path $r = r_1 \cdots r_m$ through the graph: $(r_i, r_{i+1}) \in E$ for $1 \le i < m$ (repetitions are possible). Note $m \le T$, because of the TTL. We write $\mathcal{R}$ for the (finite) set of all routes and $\mathcal{R}_{\max} = \{r \in \mathcal{R} \mid |r| = T\}$.

For each observed vehicle, the tracking points transmit its TTL to neighboring locations $\mathrm{Adj}(v) = \{v' \in V \mid (v,v') \in E\}$, so that the same vehicle is always assigned the same unique ID (within its TTL). In this way, each tracking point reports a sequence of observed IDs to the central statistics server (see fig. 1). Using the resulting pairs of ID and location, the server builds a statistic of vehicles per route.
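To make the graph model concrete, the sketch below enumerates the route set and the maximal routes for a small city. It is only an illustration: the adjacency-list encoding, the function names and the three-vertex `toy_city` are assumptions, and the code expects $T \ge 1$ and every vertex to appear as a key.

```python
from typing import Dict, List, Tuple

Vertex = str
Graph = Dict[Vertex, List[Vertex]]  # hypothetical encoding: v -> Adj(v)
Route = Tuple[Vertex, ...]

def routes(g: Graph, T: int) -> List[Route]:
    """All routes r_1...r_m with 1 <= m <= T and (r_i, r_{i+1}) in E."""
    current: List[Route] = [(v,) for v in g]  # routes of length 1
    found = list(current)
    for _ in range(T - 1):
        # Extend every route of the current length by one outgoing edge.
        current = [r + (w,) for r in current for w in g[r[-1]]]
        found.extend(current)
    return found

def maximal_routes(g: Graph, T: int) -> List[Route]:
    """Routes of length exactly T (the set R_max)."""
    return [r for r in routes(g, T) if len(r) == T]

toy_city: Graph = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
print(len(routes(toy_city, 3)), len(maximal_routes(toy_city, 3)))  # 12 5
```

Even in this tiny example the number of routes grows with every additional hop, which is the effect that later drives the runtime of the per-route algorithm.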

Discussion. The masking of license plates by

unique IDs already provides some privacy, but this is

difﬁcult to quantify. A vehicle on a very low-trafﬁc

route may still be identiﬁable. DP provides more ro-

bust and quantiﬁable guarantees in this situation.

In practice, it is possible that vehicles take a long time to travel from $v$ to $v'$, even for adjacent $(v,v') \in E$ (e.g. if the driver stops for coffee). The resulting sequence for that ID will contain “gaps”: intermittent time-steps where the vehicle is not recorded. In the remainder of this paper we assume that every vehicle is recorded at every time index until it stops moving or its TTL elapses; i.e. there are no “gaps”. This is no limitation, because such gaps will not change how our algorithms work. But accounting for these corner-cases in the mathematical notation below is tedious while adding little value. In a similar vein, we assume that all tracking points report their data simultaneously. Again, this is not a realistic assumption. But similar to our “no gaps” assumption, it simplifies proofs while not affecting our algorithms.

4 DP IN TRAFFIC MONITORING

In this section we propose three different algorithms that ensure DP traffic monitoring. The algorithms work on input sequences of different types. However, all such input sequences are (partial) functions from unique IDs to some domain $D$ (either $\mathcal{R}$ or $V$). Specifically, all algorithms studied in this paper process sequences of the form $s = (s_1, \dots, s_L)$, where for each $1 \le i \le L$, $s_i = (s_{i,u})_{u \in U} \in D^U$ for some domain $D$. Note we sometimes use vector notation $(s_u)_{u \in U}$ with $s_u \in D$ instead of functional notation $s(u) \in D$. We stress that these vectors or functions can be partial. In such cases, we write $s_u = \bot$ if $u \notin \mathrm{dom}(s)$ and require that $\bot \notin D$.

Differential privacy in continual settings (i.e.

where sequences of events are processed) has been

studied before (e.g. (Dwork et al., 2010; Jain et al.,

2023; Henzinger et al., 2023; Cardoso and Rogers,

2022; Chan et al., 2011)). However, our case is subtly

different. To illustrate, we recall the following definition of adjacency from (Dwork et al., 2010):

Definition 2 (Adjacency). Let $X$ be some set of events, $L \in \mathbb{N}$ and $s = (s_1, \dots, s_L)$, $s' = (s'_1, \dots, s'_L) \in X^L$. Then $s, s'$ are adjacent if there exists some subset $I \subseteq \{1, \dots, L\}$ and $x, x' \in X$, such that $s'_i = s_i$ for all $i \notin I$ and $s'_i = x'$ for all $i \in I$ if $s_i = x$.

This definition does not fit our case well, because it is restricted to two fixed symbols $x, x'$ that are being exchanged. Simply removing a single tracking point is clearly not enough to hide the presence of a given individual. Instead, we would like to remove a specific unique ID from the entire sequence (or alter its route), which requires changing the value of several functions $s_i \in D^U$ at one point $u \in U$. We therefore introduce a slightly adapted notion of adjacency. Given any function $f : A \to B$, $a \in A$ and $b \in B$, write $f[a/b]$ for the function $f[a/b](x) = f(x)$ for all $x \ne a$ and $f[a/b](a) = b$. We now define:

Definition 3 (ID Adjacency). Let $X = D^U$, $L \in \mathbb{N}$ and $s = (s_1, \dots, s_L)$, $s' = (s'_1, \dots, s'_L) \in X^L$. Then $s, s'$ are ID adjacent if there exists some subset $I \subseteq \{1, \dots, L\}$, $u \in U$ and $d \in D \cup \{\bot\}$, such that $s'_i = s_i$ for all $i \notin I$ and $s'_i = s_i[u/d]$ for all $i \in I$.

Note that d = ⊥ is possible, effectively dropping

(some) occurrences of u in the sequence.
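ID adjacency can be tested mechanically on small inputs. The sketch below is one possible reading of the definition, under the assumption that each step of a sequence is a Python dictionary (a partial function from IDs to $D$), that a missing key plays the role of $\bot$, and that `None` is never a member of $D$. The names `substitute` and `id_adjacent` are hypothetical.

```python
from typing import Dict, Hashable, List, Optional

Step = Dict[Hashable, Hashable]  # partial function U -> D; missing key means "bottom"

def substitute(f: Step, u: Hashable, d: Optional[Hashable]) -> Step:
    """f[u/d]: map u to d and leave all other IDs alone; d=None removes u."""
    g = {k: val for k, val in f.items() if k != u}
    if d is not None:
        g[u] = d
    return g

def id_adjacent(s: List[Step], s2: List[Step]) -> bool:
    """True if s2 arises from s by rewriting one ID u to one value d on some index set I."""
    if len(s) != len(s2):
        return False
    differing = [i for i in range(len(s)) if s[i] != s2[i]]
    if not differing:
        return True  # choose I as the empty set
    first = differing[0]
    keys = set(s[first]) | set(s2[first])
    changed = [k for k in keys if s[first].get(k) != s2[first].get(k)]
    if len(changed) != 1:
        return False  # more than one ID differs in a single step
    u = changed[0]
    d = s2[first].get(u)  # None stands for "bottom"
    # Every differing index must be explained by the same pair (u, d).
    return all(s2[i] == substitute(s[i], u, d) for i in differing)

s = [{"u1": "a", "u2": "b"}, {"u1": "b", "u2": "c"}]
s_dropped = [{"u2": "b"}, {"u2": "c"}]  # u1 removed everywhere (d = bottom)
print(id_adjacent(s, s_dropped))
```

Indices where the two sequences agree can always be left out of $I$, so only the differing indices need to be examined.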

In this paper we study only ID adjacency. There-

fore, whenever we use the word “adjacent” in what

follows, we mean ID adjacency.

We can now understand why we assume that IDs

from U are never reassigned. This assumption al-

lows us to deﬁne adjacency without considering cor-

ner cases, such as whether the occurrences of an

ID in s and s

′

overlap, and without deﬁning what it

means for repeated occurrences of the same ID to be

“causally connected”. In practice, any set of IDs can

be made inﬁnite with no risk of reassigning by taking

the Cartesian product with the set of all timestamps.

Remark 1. In this paper we consider algorithms that process input sequences of unbounded length. Because of the bounded TTL, differences in two ID adjacent sequences will always affect at most $T$ distinct time steps of the execution of the algorithm.

4.1 Route Counting

In this section we study an approach that processes

(sequences of) mappings of IDs to routes and sim-

ply counts the number of IDs per route. The actual

tracking of IDs along routes is done as a preprocess-

ing step, before the curator is fed the actual data: The

curator works as a post-processing step.

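In the event that the input to the curator is a (partial) mapping of unique IDs to routes, the statistics function is particularly simple: Given $v \in \mathcal{R}^U$ and $r \in \mathcal{R}$ write $m_r(v) = |\{u \in U \mid v_u = r\}|$ for the number of IDs that $v$ maps to $r$. If we enumerate $\mathcal{R} = \{r_1, \dots, r_{|\mathcal{R}|}\}$, we can write the statistics function $f : (\mathcal{R}^U)^* \to (\mathbb{N}^{|\mathcal{R}|})^*$ as $f(s_1, \dots, s_L) = \bar m(s_1), \dots, \bar m(s_L) \in (\mathbb{N}^{|\mathcal{R}|})^L$ where $\bar m(s_i) = (m_{r_1}(s_i), \dots, m_{r_{|\mathcal{R}|}}(s_i)) \in \mathbb{N}^{|\mathcal{R}|}$, $1 \le i \le L$. To simplify notation, we write $f_{t,r} \stackrel{\text{def}}{=} ((f(s))_t)_r$ for $1 \le t \le L$ and $r \in \mathcal{R}$ in the remainder of this paper.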
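As an illustration of this statistics function, the following sketch counts IDs per route for each time step. The dictionary encoding of $v$, the tuple encoding of routes and the function names are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple

Route = Tuple[str, ...]

def route_counts(v: Dict[Hashable, Route], all_routes: Sequence[Route]) -> List[int]:
    """m_bar(v): for each enumerated route r, the number of IDs that v maps to r."""
    tally = Counter(v.values())
    return [tally.get(r, 0) for r in all_routes]

def statistic(s: Sequence[Dict[Hashable, Route]], all_routes: Sequence[Route]) -> List[List[int]]:
    """f(s_1, ..., s_L): one count vector per time step."""
    return [route_counts(step, all_routes) for step in s]

# Hypothetical enumeration of three routes and a sequence of length 2.
enumeration = [("a",), ("a", "b"), ("b",)]
sequence = [
    {"u1": ("a",), "u2": ("b",)},
    {"u1": ("a", "b"), "u3": ("a", "b")},
]
print(statistic(sequence, enumeration))  # [[1, 0, 1], [0, 2, 0]]
```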

The simplest solution to create DP in this setting is to add noise per reported route. Our setting might seem similar to event counting (Dwork et al., 2010), and so one might expect that the log(T) noise bound from the binary tree mechanism introduced in (Dwork et al., 2010) carries over to our setting. This seems not to be the case! Due to space constraints, we refer the reader to the full version of this paper (Gelderie et al., ) for a justification of this claim. As a result, our first algorithm relies on the straightforward method of adding independent noise per time step. This is shown in alg. 1.

input : vector $v = (r_u)_{u \in U}$
$c \leftarrow 0 \in \mathbb{R}^{|\mathcal{R}|}$
for $r \in \mathcal{R}$
    $c_r \leftarrow m_r(v) + \mathrm{Lap}(2 \cdot \varepsilon^{-1} \cdot T)$
output: $c$

Algorithm 1: DP on pre-aggregated routes.

global : $R$ database holding routes per ID
global : $L$ database IDs and associated noise
input : $f \in V^U$
$c \leftarrow 0 \in \mathbb{R}^{|\mathcal{R}|}$
forall $\langle u, \mu, r \rangle \in L$
    decompose $r = v \| r'$ with $v \in V$
    if $|r'| = 0$ : DROP($L$, $u$)
    else UPDATE($L$, $u$, $\mu$, $r'$)
    $c \leftarrow c + \mu$
forall $v \in V$
    forall routes $r = v \| r' \in \mathcal{R}$ starting at $v$
        $\mu \leftarrow \mathrm{Lap}(2 \cdot \varepsilon^{-1})$
        INSERT($L$, NEXTFREEID(), $\mu$, $r'$)
        $c_v \leftarrow c_v + \mu$
    forall $u \in f^{-1}(v)$
        $r \leftarrow$ LOOKUP($R$, $u$) $\| v$
        $c_r \leftarrow c_r + 1$
        $r \leftarrow$ UPDATEORINSERT($R$, $u$, $r$)
output: $c$

Theorem 2. Alg. 1 gives ε-DP with $\left(\varepsilon^{-1} 2T \ln\left(\frac{LR}{\beta}\right), \beta\right)$ error on input sequences of length at most $L$ tracking $R = |\mathcal{R}|$ routes.

We omit this and all further proofs due to space constraints, and refer the reader to the full version of this paper (Gelderie et al., ).
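One time step of alg. 1 can be sketched as follows. This is an illustrative reading of the listing, not the authors' implementation: the function names are hypothetical, routes are assumed to be hashable tuples that all appear in `all_routes`, and Laplace noise is drawn as the difference of two independent exponential variables, which is a standard way to sample $\mathrm{Lap}(b)$ with the standard library.

```python
import random
from typing import Dict, Hashable, Sequence

def laplace(scale: float, rng: random.Random) -> float:
    """Sample Lap(scale) as the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def alg1_step(v: Dict[Hashable, Hashable], all_routes: Sequence[Hashable],
              eps: float, T: int, rng: random.Random) -> Dict[Hashable, float]:
    """True count per route plus independent Lap(2 * T / eps) noise."""
    counts = {r: 0 for r in all_routes}
    for r in v.values():  # v maps IDs to routes; every route must be in all_routes
        counts[r] += 1
    scale = 2.0 * T / eps
    return {r: counts[r] + laplace(scale, rng) for r in all_routes}

rng = random.Random(0)  # fixed seed only to make the example repeatable
noisy = alg1_step({"u1": ("a",), "u2": ("a",)}, [("a",), ("a", "b")], eps=1.0, T=10, rng=rng)
print(sorted(noisy))
```

Noise is added to every route, including routes with a true count of zero, since skipping empty routes would reveal which routes are unused.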

4.2 Location Tracking

Now we study two algorithms that process sequences of vectors of locations (elements of $V$). The inputs are now sequences of partial functions $s = (s_i)_{1 \le i \le L}$ with $s_i \in V^U$. To compute the same statistic, $f : (V^U)^* \to (\mathbb{N}^{|\mathcal{R}|})^*$ now tracks IDs across time-steps. Formally, let $s$ be a sequence of length $L$, let $1 \le k \le L$ and $r \in \mathcal{R}$. Let $c_r(s|_k) = |\{u \in U \mid \exists t \in [0, T-1] : s_{k-t}(u) \cdots s_k(u) = r \wedge \forall t' < k-t : s_{t'}(u) = \bot\}|$. We require that $u$ does not occur in $s|_k$ prior to time $k - t$ (so, given $k$, the value for $t$ is maximal). The statistics function is $f(s) = (c_r(s|_1))_{r \in \mathcal{R}}, \cdots, (c_r(s|_L))_{r \in \mathcal{R}}$.
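The location-based statistic can be computed by remembering, for each ID, the locations it has visited since it first appeared. The sketch below assumes the paper's simplifications (no gaps, IDs never reused) and uses hypothetical names; each step of `s` is a dictionary from IDs to locations.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Route = Tuple[str, ...]

def tracking_statistic(s: Sequence[Dict[Hashable, str]],
                       all_routes: Sequence[Route], T: int) -> List[Dict[Route, int]]:
    """For each time step k, count the IDs whose trace since first appearance equals each route."""
    trace: Dict[Hashable, Route] = {}  # ID -> locations seen so far
    out = []
    for step in s:
        counts = {r: 0 for r in all_routes}
        for u, location in step.items():
            trace[u] = trace.get(u, ()) + (location,)
            r = trace[u]
            # Traces longer than the TTL are no longer counted.
            if len(r) <= T and r in counts:
                counts[r] += 1
        out.append(counts)
    return out

enumeration = [("a",), ("a", "b"), ("b",)]
observations = [{"u1": "a"}, {"u1": "b", "u2": "a"}]
print(tracking_statistic(observations, enumeration, T=2))
```

In the second step of the example, `u1` has travelled the route `("a", "b")` while `u2` has just appeared at `"a"`, so those two routes each receive a count of one.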

Alg. 2 computes $f$ and adds noise. To this end, at each time $t$ and for every location $v \in V$, noise $\mu$ is sampled from $\mathrm{Lap}(2 \cdot \varepsilon^{-1})$ for each route $r$ that begins in $v$. This noise is assigned an ID and sent along that route. The route length replaces the TTL.

Theorem 3. Alg. 2 provides ε-DP with $\left(2 \cdot \varepsilon^{-1} \sqrt{8} \cdot \sqrt{R \cdot \ln(2) - \ln\left(\frac{\beta}{2LR}\right)}, \beta\right)$ error, where $L$ is the length of the sequence and $R = |\mathcal{R}|$.

Remark 2. Alg. 2 can be adapted to run at the tracking points in a distributed fashion. Each tracking point $v \in V$ then requires access to the set $\mathcal{R}_{\max}(v)$ of maximal routes starting in $v$ (or needs to re-compute this set at every time-step on input $G$). This spreads out the runtime across all points $v \in V$, but it remains exponential in $G$ at each tracking point.

4.3 A Hybrid Approach

In this subsection, we study a hybrid and probabilistic approach that bridges between algs. 1 and 2. Where alg. 2 samples noise per route at each time-step $t$ and then sends that noise along its corresponding route, we now use a random number $n \in \mathbb{N}$ of noise values that we sample at each location $v \in V$ at each time-step $t$. For each of those $n$ noise values, a neighbor $v' \in \mathrm{Adj}(v)$ of $v$ is chosen uniformly at random and the noise is passed along to $v'$. During step $t + 1$, that noise value is again sent on to another neighbor $v'' \in \mathrm{Adj}(v')$ and so on, until $T$ steps have elapsed. In essence, the $n$ noise values behave like $n$ “ghost cars” that perform a random walk on $G$. Since the noise values travel along a random path, it is possible that a given route is without noise at some time-step $t$. This destroys DP, if left untreated. However, this event is detectable and can be solved by falling back to alg. 1.
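The movement of ghost cars can be sketched independently of the noise they carry. The code below only tracks which routes are covered in a given step; noise values and their Laplace scales are deliberately left out. All names are hypothetical, and the hop budget is chosen so that a ghost route never exceeds $T$ vertices, which is an assumption of this sketch.

```python
import random
from typing import Dict, List, Set, Tuple

Route = Tuple[str, ...]
Ghost = Tuple[Route, int]  # (route travelled so far, hops left)

def spawn_count(p: float, rng: random.Random) -> int:
    """Number of ghost cars to spawn: keep going while a p-biased coin says continue."""
    n = 0
    while rng.random() < p:
        n += 1
    return n

def ghost_step(ghosts: List[Ghost], g: Dict[str, List[str]], p: float, T: int,
               rng: random.Random) -> Tuple[List[Ghost], Set[Route]]:
    """Move every ghost car one hop to a uniformly chosen neighbor, then spawn new ones."""
    moved: List[Ghost] = []
    for route, hops_left in ghosts:
        if hops_left == 0 or not g[route[-1]]:
            continue  # expired, or stuck at a vertex without outgoing edges
        moved.append((route + (rng.choice(g[route[-1]]),), hops_left - 1))
    for v in g:
        moved.extend(((v,), T - 1) for _ in range(spawn_count(p, rng)))
    covered = {route for route, _ in moved}  # routes that carry noise in this step
    return moved, covered

rng = random.Random(1)  # fixed seed only to make the example repeatable
city = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
ghosts: List[Ghost] = []
for _ in range(5):
    ghosts, covered = ghost_step(ghosts, city, p=0.6, T=3, rng=rng)
print(len(ghosts), len(covered))
```

Any route that is absent from `covered` in a given step is one that would need the fallback noise of alg. 1.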

Alg. 3 implements this idea. Note that the mag-

nitudes of the parameter b of the Laplace distribu-

tion for the two kinds of noise differ substantially.

This algorithm, as deﬁned, is based on the assump-

tion that no car takes a route shorter than T . It can be

adapted to work for the general case, by adding dedicated ghost cars per possible route length $1, \dots, T$.

Theorem 4. Alg. 3 provides ε-DP, provided the adja-

cent sequences differ in a route of length T .

Alg. 3 can be implemented in a distributed way, by

transmitting the ghost cars to neighboring tracking-

points and reporting them alongside the regular loca-

tion reports. The tracking-points would not provide

DP by themselves, because the transmitted IDs differ

between adjacent sequences.

The noise bound is difﬁcult to state and prove, be-

cause it is an amalgamation of the noise bounds for a

sum of Laplacians and the noise bound for alg. 1. We

instead simulate the algorithm below.

Algorithm 2 (listing in Section 4.1): Tracking ID+location pairs with per-route noise.

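global : $R$ database holding routes per ID
global : $L$ database IDs and associated noise
input : $f \in V^U$
$c \leftarrow 0 \in \mathbb{R}^{|\mathcal{R}|}$
covered $\leftarrow (\bot, \dots, \bot)$
forall $\langle u, \mu, r, l \rangle \in L$
    if $l = 0$ : DROP($L$, $u$) and continue
    $v \leftarrow$ Uniform[Adj(last($r$))]
    $r' \leftarrow r \| v$
    UPDATE($L$, $u$, $\mu$, $r'$, $l - 1$)
    $c_{r'} \leftarrow c_{r'} + \mu$ and covered$_{r'} \leftarrow \checkmark$
forall $v \in V$
    while continue with probability $p$
        $\mu \leftarrow \mathrm{Lap}(\cdot)$
        INSERT($L$, NEXTFREEID(), $\mu$, $v$, $T$)
        $c_v \leftarrow c_v + \mu$
        covered$_v \leftarrow \checkmark$
    forall $u \in f^{-1}(v)$
        $r \leftarrow$ LOOKUP($R$, $u$) $\| v$
        $r \leftarrow$ UPDATEORINSERT($R$, $u$, $r$)
        $c_r \leftarrow c_r + 1$
forall $r \in \mathcal{R}$ with covered$_r \ne \checkmark$
    $c_r \leftarrow c_r + \mathrm{Lap}(\cdot)$
output: $c$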

5 COMPARISON

5.1 Simulation

It is clear that alg. 1 has better noise properties than

alg. 2. But when we consider alg. 3, the picture is

not so clear. This algorithm shares elements with

alg. 2, but spawns less noise on the ﬁrst hops of a

route (depending on p). On the other hand, it period-

ically falls back to alg. 1. It is also not clear whether

the ghost cars spawned in alg. 3, which produce less

noise per car, will survive long enough to reduce the

overall noise. Note that if too many such cars are

spawned, their combined noise might dominate the

overall noise value and we converge to alg. 2. To in-

vestigate this, we implemented the hybrid approach

and alg. 1. We then compared these two approaches

by simulating each one for a total of m = 10000 times

using evenly spaced values for ε between 0.1 and 1.0.

We seeded our RNG with a pseudo-random seed cho-

sen as the SHA-256 hash of the string “ICISSP’24

Simulation” to produce reproducible yet unbiased re-

sults. The code for our evaluation can be found here.

We chose multiple values for p between 0.6 and 0.99.
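Deriving a seed from a fixed string, as described for the simulation, could look like the following sketch. It uses Python's standard `hashlib` and `random` modules; whether the original evaluation code uses these modules, this byte order, or a plain ASCII apostrophe in the string is an assumption.

```python
import hashlib
import random

# Assumption: the seed string uses a plain ASCII apostrophe and UTF-8 encoding.
SEED_STRING = "ICISSP'24 Simulation"

def seeded_rng(text: str = SEED_STRING) -> random.Random:
    """Build a reproducible generator whose seed is the SHA-256 hash of `text`."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest, "big"))

first = seeded_rng()
second = seeded_rng()
print(first.random() == second.random())  # True: both generators share the same seed
```

Hashing a descriptive string yields a seed that anyone can recompute, while leaving no room to pick a seed after looking at the results.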

Figure 2: Noise values (d = 3).

Choosing an appropriate T and studying the prob-

ability of a noise value staying alive along a route pose

a challenge. One can compute Pr[τ = i], if one as-

sumes a constant out-degree for every vertex in G.

Doing so, one ﬁnds that noise cars are rarely alive af-

ter time t = 8. Moreover, there is only a small differ-

ence between out-degree 2 and 3. We omit the details

and refer the reader to the full version of this paper

(Gelderie et al., ).

We then performed measurements using T = 10,

d = 3 and plotted the maximum and average absolute

noise for various values of p and ε (note that param-

eter p does not affect alg. 1). The results are shown

in ﬁg. 2. A complete table of results can be found

in (Gelderie et al., ). We can see that the hybrid ap-

proach outperforms alg. 1 in terms of both maximum

and average noise. The beneﬁt is present over the en-

tire range of ε values. Alg. 1 requires ≈ 1.33× the

average noise of the hybrid approach and ≈ 1.17×

the maximum noise. With increasing p, the hybrid approach performs better. This is surprising: A large p implies a large number of ghost cars carrying noise. It seems that with the given parameters, p = 0.99 is small enough for the benefits to outweigh the costs.

5.2 Computational Complexity

We close this section by comparing the algorithmic

properties of the three algorithms. We consider two

deployment scenarios: In the centralized scenario, the

algorithm runs completely in the context of the curator. The tracking points behave as described in sec. 3. In

the distributed scenario, a part of the algorithm is

executed at the tracking points. These parts differ

between algorithms. We begin by deﬁning how we

measure algorithmic complexity, which is tradition-

ally measured in terms of input size. In our case, this

might mean the input per time step, the overall design parameters (e.g. the graph G or the duration T), or both. We will focus on the design parameters, disregarding the input per time-step.

Algorithm 3 (listing at the end of Section 4.3): DP for location tracking – Hybrid Approach.

The input per time-step is misleading: Consider alg. 1 which has inputs of the form $v \in \mathcal{R}^U$. It runs linearly in this vector. However note the vector first needs to be built from the reports by individual tracking points (cf. sec. 3). On the other hand, algs. 2 and 3 compute the same vector internally from an input proportional to the number $P$ of IDs currently moving through the city. Ultimately, both algorithms need to process all routes, which is a function of the number $T$ and the graph $G$. Hence, we measure runtime not in the input per time-step, but in the overall design parameters $G = (V, E)$, $T$, and (in the case of alg. 3) $p$. Usually we will use the proxy variable $|\mathcal{R}| \le \sum_{i=1}^{T} |V|^i \le |V|^{T+1}$. We will treat sampling a Laplacian as constant cost. Likewise, we will treat all database-lookups as constant costs.
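The size of the proxy variable is easy to evaluate for concrete parameters. The helper below is a hypothetical illustration of the upper bound on the number of routes; note that the second inequality, against $|V|^{T+1}$, is only checked here for graphs with at least two vertices.

```python
def route_count_bound(num_vertices: int, T: int) -> int:
    """Upper bound on |R|: the number of vertex sequences of length 1..T."""
    return sum(num_vertices ** i for i in range(1, T + 1))

# Hypothetical parameters: the bound for a few small city sizes with T = 10.
for n in (2, 3, 5):
    bound = route_count_bound(n, 10)
    print(n, bound, bound <= n ** 11)
```

The bound is attained by a complete directed graph with self-loops; sparser road networks have far fewer routes, but the growth in $T$ remains exponential whenever vertices have more than one outgoing edge.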

Finally, it is easy to see that both algs. 1 and 2

are asymptotically linear in the number |R | of routes.

From an asymptotic perspective, the algorithms per-

form almost equally well. In our opinion, this un-

fairly hides the fact that alg. 2 needs to iterate over R

twice (or perform twice the work per loop iteration)

compared with alg. 1. We thus count loop iterations

in terms of the parameters laid out above, rather than

use asymptotic complexity.

Centralized. Alg. 1 runs linearly in $|\mathcal{R}|$. If we add computing the statistic from per-vehicle reports to its runtime, then it runs in time $|\mathcal{R}| + P$ (where again $P$ is the number of participants in the system at the given time-step). Alg. 1 requires no storage; with computing the statistic from per-vehicle reports, some mapping of ID to route-prefixes is needed and we require storage on the order of $P$.

Alg. 2 runs in time $|\mathcal{R}_{\max}| + |\mathcal{R}| + P$, where the first term is to spawn one ghost-car per route, the second is to add the noise to the route-counts and to propagate the ghost-cars, and the third is to compute the actual noise-counts from per-vehicle reports. Storage is required to store both the ghost cars in the system and the ID-to-route mapping per participant. This means storage on the order of $P + |\mathcal{R}|$ is needed.

Finally, alg. 3 runs in time $P + |\mathcal{R}| + \frac{T \cdot |V| \cdot p}{1-p}$. Note that $\frac{p}{1-p}$ is the expected number of ghost-cars spawned at each of the $|V|$ vertices per time-step. They remain in the system for $T$ rounds. These cars need to be propagated in every step. Additionally, the $P$ reports by “real” cars need to be processed. Finally, we need to check for each route, if noise was added to it and fall back to alg. 1 otherwise. Space is required to store the route (and possibly noise) for all real and ghost cars, meaning storage on the order of $P + \frac{|V| \cdot T \cdot p}{1-p}$.
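The expected number of ghost cars per vertex follows from the spawning loop, which continues with probability $p$ and stops with probability $1-p$. A short derivation, assuming the coin flips are independent and writing $n$ for the number of cars spawned at one vertex in one step:

```latex
\[
  \Pr[n = k] = p^{k}(1-p), \qquad
  \mathbb{E}[n] = \sum_{k \ge 0} k\, p^{k}(1-p)
               = (1-p)\cdot\frac{p}{(1-p)^{2}}
               = \frac{p}{1-p}.
\]
```

For example, $p = 0.99$ gives $\frac{p}{1-p} = 99$ expected ghost cars per vertex and step, so with $T = 10$ the term $\frac{T \cdot |V| \cdot p}{1-p}$ evaluates to $990 \cdot |V|$.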

Distributed. Only algs. 2 and 3 can be implemented in a distributed fashion. Alg. 2 would offload creating ghost noise per $r \in \mathcal{R}_{\max}$ to each $v \in V$. The tracking points forward the ghost cars to each other, much like real cars. The runtime per tracking point $v \in V$ then is $P_v + |\mathcal{R}_{\max}(v)|$, where $P_v$ is the number of participants at $v$ and $\mathcal{R}_{\max}(v)$ is the set of maximal length routes beginning in $v$. Storage is required only to store the set $\mathcal{R}_{\max}(v)$. The curator runs in time $P + |\mathcal{R}|$.

Alg. 3 similarly offloads the generation of ghost
cars to the tracking points. These now run in time
at most $\frac{T \cdot |V| \cdot p}{1-p} + P$ each. They might run
significantly faster on average, depending on the structure
of G: the $\frac{T \cdot |V| \cdot p}{1-p}$ ghost cars will distribute over G
in general, but not necessarily uniformly (in every time-step,
vertices receive ghost cars in proportion to their in-degree).
The tracking points need only store
a representation of Adj(v). The curator is as above.
While overall storage seems lower, the actual noise-to-ID
bindings now reside in network link buffers.
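The remark that ghost cars spread over G non-uniformly can be illustrated on a toy graph. The sketch below is an illustration only and rests on an assumption that this section does not state: every ghost car at v is forwarded to a successor chosen uniformly from Adj(v). The function name and the example graph are hypothetical. Under this assumption, and when all vertices have the same out-degree, the expected load arriving at a vertex in one time-step is proportional to its in-degree.

```python
from collections import defaultdict


def propagate_expected_load(adj, load):
    """One time-step of expected ghost-car load under ASSUMED uniform forwarding.

    adj  -- dict mapping each vertex v to the list Adj(v) of its successors
    load -- dict mapping each vertex to its expected number of ghost cars
    """
    nxt = defaultdict(float)
    for v, successors in adj.items():
        if not successors:
            continue  # ghost cars at a vertex without successors leave the system
        share = load.get(v, 0.0) / len(successors)
        for w in successors:
            nxt[w] += share
    return dict(nxt)


# Hypothetical graph: every vertex has out-degree 2;
# in-degrees are a: 1, b: 1, c: 3, d: 3.
adj = {"a": ["c", "d"], "b": ["c", "d"], "c": ["d", "a"], "d": ["c", "b"]}
load = {v: 1.0 for v in adj}
after = propagate_expected_load(adj, load)
```

Starting from one expected ghost car per vertex, vertices c and d receive three times the load of a and b after one step, matching their in-degrees. A tracking point with low in-degree would therefore process far fewer ghost cars than the worst-case bound suggests.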

Discussion. We see that while alg. 2 is optimal in
terms of noise, it has the worst complexity in both the
centralized and the distributed setting. Alg. 1, on
the other hand, cannot be implemented in a distributed
fashion and incurs a large amount of noise. Alg. 3
strikes a balance between the two in terms of runtime
and space, in both the centralized and distributed
settings, provided p is chosen appropriately. The previous
subsection showed that it can outperform alg. 1 in terms
of noise to some extent.

6 CONCLUSION

We have introduced a model for decentralized traffic
monitoring in the smart city based on vehicle tracking.
We then introduced a notion of adjacency that fits
this model and permits us to study ε-DP in this context.
Building on that, we presented three algorithms
that each achieve ε-DP in our setting. Each algorithm
has unique advantages and disadvantages in terms of
noise, runtime, and its ability to be deployed in a
distributed fashion. Together, they showcase the various
engineering tradeoffs that a practitioner might encounter
when applying this model to a specific city.

Our ﬁrst algorithm is simple to implement, yet in-

curs high noise as it requires Laplace noise scaled to

T . Moreover, it cannot be run in a distributed fash-

ion. Remarkably, despite the superﬁcially similar set-

ting, the well-known binary tree technique cannot be

ported to this setting. We next showed that a depen-

dence on T in terms of noise is altogether unneces-

sary. The resulting algorithm can run in a distributed

fashion, but incurs high runtime overhead. Our third

algorithm is a hybrid approach striking a balance be-

tween the ﬁrst two algorithms. It is reasonably efﬁ-

cient in a distributed setting and outperforms the ﬁrst

algorithm in terms of noise when simulated.

We believe that our results show the merit of hy-

bridizing DP algorithms in a probabilistic way, specif-

ically when working in the presented context of trafﬁc

monitoring. While the resulting algorithms are more

difﬁcult to analyze, they perform reasonably well. We

believe that future work can improve this situation.

For example, hybrid schemes could adapt to the topology
of the graph, covering routes that have a higher
probability of “losing” a ghost with a larger number
of ghosts. Finally, adopting (ε, δ)-DP with δ > 0
may yield significant improvements in terms of noise
at the cost of a modest relaxation of the privacy
guarantees.

REFERENCES

Bhardwaj, V., Rasamsetti, Y., and Valsan, V. (2022). Trafﬁc

control system for smart city using image processing.

AI and IoT for Smart City applications, pages 83–99.

Cardoso, A. R. and Rogers, R. (2022). Differentially pri-

vate histograms under continual observation: Stream-

ing selection into the unknown. In International Con-

ference on Artiﬁcial Intelligence and Statistics, pages

2397–2419. PMLR.

Chan, T.-H. H., Shi, E., and Song, D. (2011). Private and

continual release of statistics. ACM TISSEC, 14(3):1–

24.

Chen, Q., Ni, Z., Zhu, X., and Xia, P. (2023). Differen-

tial privacy histogram publishing method based on dy-

namic sliding window. Frontiers of Computer Science,

17(4):174809.

Djahel, S., Doolan, R., Muntean, G.-M., and Murphy, J.

(2015). A communications-oriented perspective on

trafﬁc management systems for smart cities: Chal-

lenges and innovative approaches. IEEE Communi-

cations Surveys & Tutorials, 17(1):125–151.

Dwork, C., McSherry, F., Nissim, K., and Smith, A. (2006).

Calibrating noise to sensitivity in private data analysis.

In Theory of Cryptography, pages 265–284. Springer.

Dwork, C., Naor, M., Pitassi, T., and Rothblum, G. N.

(2010). Differential privacy under continual observa-

tion. In Proceedings of the Forty-Second ACM Sym-

posium on Theory of Computing, STOC ’10, page

715–724, New York, NY, USA. Association for Com-

puting Machinery.

Dwork, C., Naor, M., Reingold, O., and Rothblum, G. N.

(2015). Pure differential privacy for rectangle queries

via private partitions. In International Conference on

the Theory and Application of Cryptology and Infor-

mation Security, pages 735–751. Springer.

Dwork, C., Roth, A., et al. (2014). The algorithmic founda-

tions of differential privacy. Foundations and Trends®

in Theoretical Computer Science, 9(3–4):211–407.

Gade, D. (2019). ICT based smart traffic management system
“ismart” for smart cities. International Journal of
Recent Technology and Engineering, 8(3):1000–1006.

Gelderie, M., Luff, M., and Brodschelm, L. Differential privacy
for distributed traffic monitoring in smart cities
(full version).

Gracias, J. S., Parnell, G. S., Specking, E., Pohl, E. A., and

Buchanan, R. (2023). Smart cities—a structured liter-

ature review. Smart Cities, 6(4):1719–1743.

Hassan, M. U., Rehmani, M. H., and Chen, J. (2019). Dif-

ferential privacy techniques for cyber physical sys-

tems: a survey. IEEE Communications Surveys & Tu-

torials, 22(1):746–789.

Henzinger, M., Sricharan, A., and Steiner, T. A. (2023).

Differentially private data structures under continual

observation for histograms and related queries. arXiv

preprint arXiv:2302.11341.

Husnoo, M. A., Anwar, A., Chakrabortty, R. K., Doss, R.,
and Ryan, M. J. (2021). Differential privacy for IoT-enabled
critical infrastructure: A comprehensive survey.
IEEE Access, 9:153276–153304.

Jain, P., Raskhodnikova, S., Sivakumar, S., and Smith, A.

(2023). The price of differential privacy under contin-

ual observation. In International Conference on Ma-

chine Learning, pages 14654–14678. PMLR.

Khanna, A., Goyal, R., Verma, M., and Joshi, D. (2019).

Intelligent trafﬁc management system for smart cities.

In Futuristic Trends in Network and Communication

Technologies, pages 152–164. Springer Singapore.

Kumar, A., Upadhyay, A., Mishra, N., Nath, S., Yadav,

K. R., and Sharma, G. (2022). Privacy and security

concerns in edge computing-based smart cities. In

Robotics and AI for Cybersecurity and Critical Infras-

tructure in Smart Cities, pages 89–110. Springer.

Li, Y., Zhang, P., and Wang, Y. (2018). The location privacy
protection of electric vehicles with differential privacy
in V2G networks. Energies, 11(10):2625.

Ma, Z., Zhang, T., Liu, X., Li, X., and Ren, K. (2019). Real-

time privacy-preserving data release over vehicle tra-

jectory. IEEE transactions on vehicular technology,

68(8):8091–8102.

Qu, Y., Nosouhi, M. R., Cui, L., and Yu, S. (2019). Privacy

preservation in smart cities. In Smart cities cyberse-

curity and privacy, pages 75–88. Elsevier.

Rizwan, P., Suresh, K., and Babu, M. R. (2016). Real-time

smart trafﬁc management system for smart cities by

using internet of things and big data. In 2016 Interna-

tional Conference on Emerging Technological Trends.

Sun, Y.-E., Huang, H., Yang, W., Chen, S., and Du, Y.

(2021). Toward differential privacy for trafﬁc mea-

surement in vehicular cyber-physical systems. IEEE

Transactions on Industrial Informatics, 18(6):4078–

4087.

Yao, A., Li, G., Li, X., Jiang, F., Xu, J., and Liu, X. (2023).

Differential privacy in edge computing-based smart

city applications: Security issues, solutions and future

directions. Array, page 100293.

Zhou, Z., Qiao, Y., Zhu, L., Guan, J., Liu, Y., and Xu,

C. (2018). Differential privacy-guaranteed trajec-

tory community identiﬁcation over vehicle ad-hoc net-

works. Internet Technology Letters, 1(3):e9.
