A Tool-assisted Methodology for the Data Protection Impact Assessment
Salimeh Dashti¹,² and Silvio Ranise¹
¹Security and Trust, Fondazione Bruno Kessler, Trento, Italy
²DIBRIS, University of Genoa, Genoa, Italy
Keywords: General Data Protection Regulation (GDPR), Data Processing, Risk Analysis, Compliance, Likelihood, Impact.
Abstract: We propose a pragmatic methodology for the Data Protection Impact Assessment (DPIA), based on a tool capable of assisting users during crucial activities such as data processing specification and risk analysis. Previous work on compliance checking and our experience in developing a DPIA methodology for the Public Administration of the province of Trento in Italy are the basis of this work.
1 INTRODUCTION
According to the Working Party 29 (https://ec.europa.eu/newsroom/document.cfm?doc_id=44137), a Data Protection Impact Assessment (DPIA) is a process designed to describe the processing, assess its necessity and proportionality, and help manage the risks to the rights and freedoms of natural persons resulting from the processing of personal data, by assessing them and determining the measures to address them. The DPIA is one of the most important activities
for an organization to demonstrate compliance with
the General Data Protection Regulation (GDPR). Un-
fortunately, it is complex, time-consuming, and re-
quires expertise in several domains. It is rarely the
case that organizations—especially small or middle
size ones—can afford the burden of developing an
interdisciplinary portfolio of competencies, including
cybersecurity and privacy. For larger organizations,
another issue is to maintain uniformity of the DPIA
across different departments.
To alleviate these problems, we propose a pragmatic methodology based on our previous experience in designing a methodology for the DPIA of the public administration of the province of Trento in Italy (see resolution n. 450 of March 23, 2018, available at http://www.delibere.provincia.tn.it/) and on previous academic work on the compliance of security policies (Guarda et al., 2017; Ranise and Siswantoro, 2017).
Our methodology is based on a tool capable of assisting users with the three main activities of a DPIA, namely (1) the analysis of the data processing activities, (2) the assessment of the risks, and (3)
the run-time monitoring. In this paper, we discuss the
first two steps and leave the description of the third
to a future paper as it is still ongoing work. Since
the methodology and the tool have been designed to
tightly cooperate, it is difficult to consider one of them
in isolation; instead, we regard their cooperation as
the main strength of our work. Despite this, below,
we will use the word “methodology” or “tool” when
we want to emphasize one of the two aspects.
For activity (1), the tool helps users in carrying
out crucial activities such as the functional specifica-
tion of the data processing activities, the identification
of the entities involved, their legal roles, and the ac-
cess control policies that they must satisfy. For ac-
tivity (2), it checks whether access control policies
are compliant with the provisions of the GDPR and
computes the risk level of a data processing activity
in terms of the likelihood and impact of a data breach.
The ultimate goal of our tool-based methodology is to assist organizations in mastering the complexity and the interdisciplinary competencies needed for the correct
implementation of the DPIA. Indeed, the effective use
of the tool must be complemented by adequate train-
ing on the key notions of the GDPR and the DPIA.
In the rest of the paper, we omit technical details—
e.g., the fact that the tool also contains a database of
processing activities—and focus on the capabilities of
the tool to assist users.
Plan of the paper. Section 2 describes the first two
steps of our methodology. Section 3 discusses related work and highlights the differences and similarities with our approach. Finally, Section 4 summarizes our findings and highlights future work.

Figure 1: An overview of our methodology.
2 OUR METHODOLOGY
In each of the three steps of our methodology
(shown in Figure 1), the user is assisted by a tool in
gathering the necessary data to produce three docu-
ments containing the description of the DPIA activi-
ties. Such documents can be used to satisfy the ac-
countability requirements of the GDPR (art. 5.2) and
to support the auditing process by, e.g., a (national)
privacy authority.
1. The step Processing Analysis outputs a docu-
ment, called Processing Specification, that con-
tains a precise description of the data process-
ing activities, including the collected data, their
classes, the data subject categories involved, the
purpose, etc.
2. The step Risk Analysis outputs a document,
called Risk summary, that reports the compliance
check of access control policies against the GDPR
(this is crucial to ensure that data subjects can con-
trol the sharing of their personal data in compli-
ance with legal provisions) and the risk levels as-
sociated with each defined data processing activity.
3. The step Run-time Analysis outputs a document,
called Asset and Event Mapping, that contains the
associations between the assets (identified in the
previous step of the methodology) and the actual
entities in the system together with the events that
are relevant for data protection so that an Inven-
tory Management system and a Security Information and Event Management system can use the associ-
ations to detect, at run-time, possible deviations
from the protection profiles previously specified
or violations of compliance.
Running Example. To illustrate the main concepts
of our methodology, we consider the following sce-
nario: an Italian research organization named ITOrg
provides complementary health insurance to its Em-
ployees. To this end, ITOrg has chosen the insur-
ance company ACME as a sub-contractor. Employ-
ees wishing to opt into the complementary health in-
surance service shall provide ITOrg with profile data,
including first and last name, taxpayer identification
number, type of contract (e.g., permanent or fixed-
term), work e-mail address, and summary of the med-
ical history of the past 3 to 5 years. ITOrg forwards
the information to ACME that processes it to produce
an appropriate health insurance contract that, in turn,
is returned to the Employee via ITOrg.
Below, we discuss only the first two steps of the
methodology and leave the third as future work.
2.1 Processing Analysis
For each processing activity, the first step of our methodology requires the user to specify the data collected, their classes, how data are grouped into objects (a data object can be seen as a record of fields, i.e., a collection of values with given types; examples of data objects are the rows of a relational database and certain data structures of programming languages), the
data subject categories involved (e.g., patients or mi-
nors), the purpose (e.g., advertising or billing), which
entities are playing which legal roles (e.g., data con-
troller or data subject), when and how the consent is
obtained, and the adequacy (necessity, proportional-
ity, and legal basis) of the collected data.
The data subject categories involved in the complementary insurance scenario are those typically associated with a research organization such as ITOrg, namely Employ-
ees (researchers and administrative staff), (external)
Collaborators (for consulting and cooperation in re-
search projects), and Students (for training periods).
Since it is not always easy to identify the data classes
and data subject categories involved in a data processing activity and to satisfy the requirements of necessity and pro-
portionality stipulated in art. 35.7.b of the GDPR, the
tool supporting our methodology provides hints by
using a schema based on economic sectors. Table 1
shows an excerpt of such a schema where the first col-
umn reports the sectors taken from the standard classi-
fication of economic activities in the European Com-
munity (NACE, https://ec.europa.eu/eurostat/ramon/nomenclatures/index.cfm?TargetUrl=LST_CLS_DLD&StrNom=NACE_REV_2&StrLanguageCode=EN&StrLayoutCode=HIERARCHIC); the second column shows the asso-
ciated data subject categories; and the third column
reports the associated data classes. The legend of the
table explains the abbreviations for data subject cate-
gories whereas the acronyms used for data classes are
the following: PD stands for Personal Data, PD-HR
for Personal Data with High Risk, and SPD for Spe-
cial category of Personal Data. While PD and SPD are
taken from the GDPR, we have introduced PD-HR to make the classification more fine-grained and thus simplify the second step of our methodology (i.e., Risk Analysis). In the first column of the table, the main sectors of NACE (e.g., Information and communication) appear flush left, while their sub-sectors (e.g., Telecommunication) are indented. In the complementary insurance scenario, the user selects the sector ‘Professional, scientific & technical activities’ and the sub-sector ‘Scientific research & development’.

Table 1: An excerpt from economic sectors with data classes and data subject categories.

Sectors                                           Data Subject Categories   Data Classes
Information and communication                     E, C, Ci                  PD, PD-HR: Location
  Telecommunication                               N, L, M, P                -
Professional, scientific & technical activities   E, C                      PD
  Legal and accounting activities                 N, L, Ci                  PD-HR: Financial status
  Scientific research & development               S, Pr                     PD-HR, SPD
Education                                         E, C, Pr, S, D, Ci        PD, PD-HR: Financial status, SPD: Health Data
  Pre-primary, primary, sport & culture           M, P                      -

Legend: E(mployee), M(inor), Ci(tizen), P(arent), Pa(tient), C(ollaborator), N(atural Person), L(egal Person), S(tudent), No(minee), De(tainee), P(olice)/M(ilitary) forces, D(isabled), Pr(ofessor/Teacher/Trainer)
The table shows only those sub-sectors that have additional data classes or data subject categories with respect to their main sector, since sub-sectors inherit from their main sector. For instance, the complementary insurance scenario, which falls under the sub-sector ‘Scientific research & development’, can access SPD and PD-HR (according to art. 9.2.j) as well as PD inherited from the main sector, and involves S(tudent) and Pr(ofessor/Teacher/Trainer) in addition to E(mployee) and C(ollaborator) as data subject categories.
The three data classes are ordered as follows: SPD is a strict subset of PD-HR, which in turn is a strict subset of PD. This means that PD, PD-HR, and SPD have an increasing level of sensitivity. We
also consider sub-classes (such as Location or Polit-
ical Opinion) of the main data classes that are men-
tioned in GDPR or recitals (e.g., recital 75) for a more
precise specification of the data processing activities
and—similarly to what has been done above for data
subject categories—to ease the Risk Analysis step.
We exploit the information contained in Table 1 as
follows. Users indicate only the economic sector to
which their organization belongs and the tool returns
a series of suggestions for the data classes and data
subject categories that are more likely to be relevant
to the data processing activities of the organization.
In this way, the user is presented with a restricted subset of choices: this not only alleviates the burden of considering multiple options but also helps in satisfying the necessity and proportionality requirements of art. 35.7.b of the GDPR. This process is fully trans-
parent to users, who are free to add more data classes
and data subject categories as needed, attaching short
justifications. The identification of the economic sector and the related data subject categories and data classes is done once, and holds for all the data processing activities of an organization.
We use automated reasoning techniques to mech-
anize the suggestion generation activity. We encode
the schema in Table 1 into formulae of propositional
logic and use the capabilities of a constraint solver to
generate the hints.
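To make this concrete, the following Python fragment sketches one possible encoding, assuming the Z3 solver's Python API (package z3-solver) and only the fragment of Table 1 used by the running example; the variable names and the entailment-based hint generation are our illustration, not necessarily the tool's actual implementation.

from z3 import Bool, Implies, Not, Solver, unsat

# One propositional variable per (sub-)sector, data class, and
# data subject category (fragment of Table 1 only).
sci_tech = Bool("Professional_scientific_technical")  # main sector
sci_rnd = Bool("Scientific_research_development")     # its sub-sector
PD, PD_HR, SPD = Bool("PD"), Bool("PD-HR"), Bool("SPD")
E, C, S, Pr = Bool("E"), Bool("C"), Bool("S"), Bool("Pr")

schema = Solver()
schema.add(Implies(sci_rnd, sci_tech))                  # sub-sectors inherit
schema.add(Implies(sci_tech, PD),                       # main-sector row
           Implies(sci_tech, E), Implies(sci_tech, C))
schema.add(Implies(sci_rnd, PD_HR), Implies(sci_rnd, SPD),
           Implies(sci_rnd, S), Implies(sci_rnd, Pr))   # sub-sector row
schema.add(Implies(SPD, PD_HR), Implies(PD_HR, PD))     # SPD < PD-HR < PD

def entailed(v):
    # v is suggested iff it holds in every model of schema + user input.
    schema.push()
    schema.add(Not(v))
    res = schema.check() == unsat
    schema.pop()
    return res

schema.add(sci_rnd)  # the user only declares the economic (sub-)sector
hints = [v for v in (PD, PD_HR, SPD, E, C, S, Pr) if entailed(v)]
print("Suggested data classes and data subject categories:", hints)

Running this fragment suggests PD, PD-HR, SPD, E, C, S, and Pr, matching the discussion of the complementary insurance scenario above.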
While we have validated part of the schematiza-
tion in Table 1 in our experience of applying the pro-
posed methodology in the public administration of the
province of Trento in Italy, we believe that there is
room for improvements and refinements. The feed-
back of users of the tool will be key to improve the
precision of the relationships in the table.
We now consider the other information needed to describe a processing activity. Consider again the complementary insurance scenario: although it is tempting to define the purpose of the processing simply as the ‘production of the contract for complementary insurance,’ this is not enough for verifying that a certain processing step is performed for the declared purpose. What is missing is the definition of the plan (i.e., the context) to which the processing step belongs in order to achieve the declared purpose; see, e.g., (De Masellis et al., 2015) for a discussion of this issue. For this
reason, we define the purpose of a processing activ-
ity as a plan (i.e. the sequence of steps) that must be
executed to achieve a certain goal. The tool supports
well-known standards for the specification of plans,
such as Message Sequence Charts and Business Pro-
cess Modeling Notation. To avoid technicalities, here
we propose a natural language specification of the
plan for the complementary insurance scenario:
1. Employees provide some information, such as full name, taxpayer identification number, type of contract, and medical history, via a data object insurance complementary form, and pass it to ITOrg;
2. ITOrg informs Employees that the insurance complementary form will be shared with ACME to produce a complementary health insurance contract, and asks them for their consent;
3. ITOrg passes the form to ACME;
4. ACME reads the data object insurance complementary form, produces a contract, and sends it to ITOrg;
5. ITOrg forwards the contract to the Employees.
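As a minimal illustration of how such a plan can be represented in machine-readable form (the tool itself accepts standards such as Message Sequence Charts and BPMN; the Python encoding and field names below are assumptions of ours), the five steps above can be written as an ordered sequence of typed steps:

from dataclasses import dataclass
from typing import List

@dataclass
class Step:
    actor: str               # who performs the step
    action: str              # what is done
    data_objects: List[str]  # data objects involved
    recipient: str           # to whom data flows

complementary_insurance_plan: List[Step] = [
    Step("Employee", "provide profile data",
         ["insurance complementary form"], "ITOrg"),
    Step("ITOrg", "inform and obtain consent",
         ["insurance complementary form"], "Employee"),
    Step("ITOrg", "pass", ["insurance complementary form"], "ACME"),
    Step("ACME", "read and produce contract",
         ["insurance complementary form", "contract"], "ITOrg"),
    Step("ITOrg", "forward", ["contract"], "Employee"),
]

Verifying that a processing step is performed for the declared purpose then amounts to checking that the step occurs in (an execution of) such a plan.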
Besides defining the purpose of the processing, the plan specifies: the data objects used (here, insurance complementary form); their content (for the sake of brevity, we omit the details); the source of the data (step 1; here, the data subjects themselves); and when and how the consent is obtained from the data subject (step 2). The plan also helps in clarifying the
roles played by the involved entities and why these are
entitled to perform some processing step on the data
objects. In the complementary insurance scenario,
ITOrg—playing the role of controller—mandates (via
an appropriate contract) the third party organization
ACME—playing the role of data processor—to per-
form the necessary computations on the data object
insurance complementary form provided by Employ-
ees—playing the role of data subjects. Some care is
needed when considering the capabilities of the data
subject. While the GDPR states that data subjects
have unrestricted access to their data, an IT system
is designed to empower them with only a subset of
their rights, i.e. those that are needed to guarantee that
the processing achieves its purpose. In practice, data
subjects retain all rights but exercise them in ways
which are not immediately available in the data pro-
cessing under consideration. In the complementary
insurance scenario, ITOrg empowers Employees with
read, write, and update rights on their profile data; but
not the delete right. While the latter is indeed a data subject’s right, it is reasonable that such a “delicate” operation is supported in other ways, and the fact that the data subject is not granted all permissions at any time and in any context should not be considered a lack of compliance. The permissions that are granted
to the various entities involved in the data processing
are specified by an access control policy. The tool
supports the specification of such policies written in
the Attribute Based Access Control (ABAC) frame-
work (Hu et al., 2013) for its well-known capability
to express and combine a wide range of different pol-
icy idioms (Jin et al., 2012). The access control policy
is specified at the level of the organization and holds for
each processing activity specified in the DPIA; this is
aligned with the desideratum of organizations to have
a single point of administration of policies that must
be enforced uniformly across (possibly distributed)
operational units. To avoid technicalities, we report a
rule of a policy for the complementary insurance sce-
nario in natural language: ACME can “modify” the
content of insurance complementary form. We will
see below that this rule causes a compliance violation.
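As an illustration of the ABAC style (the attribute names and the encoding below are our own sketch, not the tool's policy language), such a rule can be expressed as a condition over the attributes of an authorization request:

from dataclasses import dataclass

@dataclass
class Request:
    subject_org: str  # e.g., "ACME", "ITOrg", or an Employee's identifier
    legal_role: str   # e.g., "processor", "controller", "data subject"
    action: str       # e.g., "read", "modify", "delete"
    object_id: str    # e.g., "insurance complementary form"

def acme_modify_rule(r: Request) -> bool:
    # ACME can "modify" the content of insurance complementary form.
    return (r.subject_org == "ACME"
            and r.action == "modify"
            and r.object_id == "insurance complementary form")

A full policy is then a set of such rules together with a combining strategy (e.g., permit if some rule applies, deny otherwise).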
2.2 Risk Analysis
The second step of our methodology is organized in
two sub-tasks: (i) verifying compliance of access con-
trol policies and (ii) evaluating the likelihood and im-
pact. For the former, we adapt to the GDPR the ap-
proach in (Guarda et al., 2017, Ranise and Siswan-
toro, 2017), developed to check compliance for its
precursor, the European Data Protection Directive.
For the latter, there are several available standards to adopt during a DPIA. To allow users of our methodology to choose their favorite approach, we make the common assumption that risk is the product of two factors: the likelihood of an adverse event (such as a data breach) and the impact of the event on the fundamental rights and freedoms of the data subject (e.g., reputational damage). Before describing each sub-task in detail, we make the following observation.
While sub-task (ii) can be easily seen to fit in the
context of risk analysis as soon as we observe that it
is standard to decompose risk into likelihood and im-
pact, it may be less clear why we consider the com-
pliance checking of access control policies—i.e. sub-
task (i)—as part of risk analysis. To understand why
this is the case, we observe that modern approaches
decompose access control in three main compo-
nents: policy language, model, and enforcement—
see, e.g., (De Capitani di Vimercati et al., 2007). The
first two define in abstract mathematical terms the
syntax and semantics, respectively, of access control
policies whereas the third specifies the mechanisms
put in place to enforce the policies. One of the main
advantages of this approach is to separate the anal-
ysis of high-level security properties from the cor-
rectness of the enforcement mechanisms. In fact, it
is well-known that writing and maintaining policies
is an error-prone activity because of the possibility
of inserting redundancies, conflicts, and other logical
problems—see, again (De Capitani di Vimercati et al.,
2007) for an introduction to this and related prob-
lems. Similarly, guaranteeing the compliance of ac-
cess control policies expressed in mathematical terms
against legal provisions is difficult, even disregard-
ing the risks deriving from vulnerable implementa-
tions of the enforcement mechanisms, which are eval-
uated in sub-task (ii). Logical errors in access con-
trol policies prevent compliance already at the design
level and thus constitute an important source of risk
to identify and eliminate; sub-task (i) is designed to
achieve this objective.
2.2.1 Verifying Compliance of Access Control
Policies
The literature on the security analysis of access con-
trol policies offers a wide range of techniques to auto-
matically solve several types of policy analysis prob-
lems; see, e.g., (Turkmen et al., 2017). One such problem is refinement: a policy p refines a policy p′ iff every authorization request permitted or denied by p is also, respectively, permitted or denied by p′. Our tool reduces checking the compliance of the access control policy p of the organization with respect to the GDPR to checking that p refines the policy obtained by instantiating a selected subset of the articles of the GDPR that are relevant to controlling access to the data processing activity. In this way, every authorization request permitted or denied by the policy of the organization is also permitted or denied by the policy derived from the GDPR. The instantiation of the GDPR consists of mapping the legal roles to the entities involved in the data processing, as specified in the first step of the methodology, and of taking into account the relationships of “mandate” and “empower” between the data controller and the data processor or the data subject, respectively.
To illustrate, consider again the complementary
insurance scenario and recall the rule of the access
control policy specified at the end of Section 2.1, i.e.
ACME can “modify” the content of insurance com-
plementary form. A rule instantiated from the GDPR
to the use case scenario stipulates that ACME (data
processor) can “read” (process) the data object in-
surance complementary form (which contains health
data) since the ITOrg (data controller) has mandated
ACME to do so. By taking into consideration these
two rules, an automated tool for policy analysis is able
to detect a violation—because the mandate relation
specifies that ACME can only read such an object—
and return one or more authorization requests that ex-
pose such a violation to help users understand (and hopefully eliminate) it. Details are in (Guarda et al.,
2017, Ranise and Siswantoro, 2017); the use of these
policy analysis techniques is fully transparent to the
user.
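To give an intuition of the refinement check (real analyzers work symbolically, e.g., via SMT as in (Turkmen et al., 2017); the exhaustive enumeration below is only a sketch over an illustrative, finite request space), consider:

from itertools import product

SUBJECTS = ["ITOrg", "ACME", "Employee"]
ACTIONS = ["read", "modify", "delete"]
OBJECTS = ["insurance complementary form"]

def org_policy(subject, action, obj):
    # The organization's policy, including the offending rule:
    # ACME can "modify" the insurance complementary form.
    if subject == "ACME" and action in ("read", "modify"):
        return "Permit"
    if subject == "Employee" and action in ("read", "modify"):
        return "Permit"
    return "Deny"

def gdpr_policy(subject, action, obj):
    # Policy instantiated from the GDPR: the mandate relation only
    # entitles the processor ACME to read the data object.
    if subject == "ACME" and action == "read":
        return "Permit"
    if subject == "Employee" and action in ("read", "modify"):
        return "Permit"
    return "Deny"

# org_policy refines gdpr_policy only if no request permitted by the
# former is denied by the latter; each counterexample is a violation.
for req in product(SUBJECTS, ACTIONS, OBJECTS):
    if org_policy(*req) == "Permit" and gdpr_policy(*req) == "Deny":
        print("Violation witnessed by request:", req)

The only request printed is ('ACME', 'modify', 'insurance complementary form'), i.e., exactly the violation discussed above.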
2.2.2 Evaluating Likelihood and Impact
We describe how our methodology and tool support users in evaluating the likelihood and the impact to determine the risk level of each data processing activity.
Likelihood. There are several methods and standards (e.g., ISO 27001:2013, https://www.iso.org/standard/54534.html) for likelihood evaluation that
can be re-used almost off-the-shelf for a DPIA. Since
users may prefer certain approaches over others, our
tool allows importing results from external tools or
by manual entry. For instance, a possible (coarse-
grained) way to evaluate likelihood is to first identify
the functionalities and assets used to implement a cer-
tain data processing activity, define indicators to mea-
sure the adoption of security mechanisms (such as au-
thentication and access control) and privacy enhanc-
ing technologies (e.g., pseudonymization), com-
plement them to estimate the likelihood of a violation,
and finally take the maximum of the values as a first
approximation of the likelihood. More sophisticated
approaches can be used only when needed, e.g., when
the first estimate of the likelihood is above an “accept-
able” threshold (for the time being, we leave the task of setting the appropriate threshold value to the user; such a value may depend on considerations that are difficult to quantify, e.g., the fact that a processing activity supports, or not, one of the main business goals of the organization).
Our tool supports this refine-and-
check approach to likelihood evaluation by permitting
the specification of alert conditions (e.g., is the likeli-
hood value above a certain threshold?) to signal that
a more precise technique should be used. The alert
conditions are predicates, defined on several indica-
tors that measure the likelihood of a security incident.
Examples of such indicators are the likelihood that
standard security (e.g., confidentiality, integrity, and
availability) or privacy (e.g., linkability, transparency,
and intervenability) properties are violated. Once the
various indicators are calculated, the tool supports the
definition of a function to compute a single value for
the likelihood (the default function is to take the max-
imum value of the indicators following a cautious ap-
proach). We observe that tracking several indicators
is important for conducting a good DPIA for two rea-
sons. First, the values of the indicators suggest which
data protection mechanisms to improve to reduce the
likelihood of a data breach. Second, the availability
of several indicators improves accountability and sim-
plifies the task of impact evaluation—as we will see
below.
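A minimal sketch of this aggregation, assuming indicators already normalized on the 1 (unlikely) to 5 (almost certain) scale described below; both the aggregation function and the alert threshold are configurable, and the names here are our own illustration:

from typing import Callable, Dict

def likelihood(indicators: Dict[str, int],
               aggregate: Callable = max,   # cautious default
               alert_threshold: int = 4) -> int:
    # Indicators range from 1 (unlikely) to 5 (almost certain).
    assert all(1 <= v <= 5 for v in indicators.values())
    value = aggregate(indicators.values())
    if value >= alert_threshold:
        # Alert condition: a more precise technique should be used.
        print("Alert: likelihood estimate above threshold.")
    return value

estimate = likelihood({"confidentiality": 3, "integrity": 2,
                       "availability": 2, "linkability": 4,
                       "transparency": 1, "intervenability": 2})

Here the cautious default returns 4 (driven by the linkability indicator) and triggers the alert condition.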
There are two (easily satisfied) requirements for
integrating an external method of likelihood evalua-
tion in our methodology. The first one is to normal-
ize all values on a scale between 1 (unlikely) and
5 (almost certain). The second one (derived from
art. 35.7.a of the GDPR) is to complement the values
of indicators with a specification of the technological
system used to implement the data processing activi-
ties that contains at least a description of its function-
alities and the assets—including hardware, software,
networks, and people—used to store, manipulate, and
transfer personal data. In some cases, such informa-
tion is already available as a pre-requisite for applying
many of the available methodologies and can be read-
ily imported.
Impact. It is difficult to re-use pre-GDPR method-
ologies (e.g., ISO 27001) for impact evaluation in
the context of a DPIA since they consider the im-
pact on the business goals of the organization (i.e.
with respect to the data controller or the data proces-
sor) rather than on the fundamental rights and free-
doms of data subjects. After the introduction of the
GDPR, there has been work to align some methodolo-
gies (e.g., ISO 27005) with the GDPR or to provide
new ones (e.g., that proposed by the French privacy
authority CNiL). While it is possible to integrate the
recent methodologies in our tool, we describe our own
approach in the rest of this section with the hope of
providing further insights in the challenging process
of impact evaluation.
Our approach is divided into two phases: first, we ask the user for a “subjective” evaluation of the impact, and then we weight that value by using a combination of “objective” compliance indicators related to the data processing. In the first phase, the tool asks the
user to provide a first estimate of the impact level of
a security incident for the data processing under con-
sideration, on a scale from 1 to 5 (where 1 indicates
the lowest possible severity level and 5 the highest
one). To assist users in this phase, the tool provides
three sets of information: (a) the defined data classes
and the data subject categories, (b) a list of privacy
concerns, and (c) a list of potential damages on data
subjects that may result from data breaches. The tool
also explains the relevance of the sets of information
to evaluate the impact.
For (a), the idea is that the more sensitive the data processed, the higher the impact. Based on the selected data classes, the tool associates the impact levels normal, high, and very high with personal data, personal data with high risk, and special categories of data, respectively. In case vulnerable data subjects (e.g., minors) are associated with the data processing, the user may increase the value; the tool encourages users not to lower it, though. The list of potential (phys-
ical, material, or non-material) damages provided in
(c) is derived from recital 85 and includes loss of con-
trol over personal data, discrimination, identity theft
or fraud, financial loss, and damage to reputation. The
list can be extended, customized, annotated with com-
ments, and shared within an organization to help users
share their views and reach a more uniform level in
impact evaluation (in large organizations, it is often
the case that several users will participate in the DPIA
as they are responsible for different collections of data
processing activities). In the complementary insur-
ance scenario, it seems reasonable to evaluate the im-
pact as medium-high (value 4 on the scale) since the
data being processed is a special category but the in-
volved data subject categories (employee and collab-
orator) do not seem to require particular attention.
The second phase of impact evaluation requires
the user to enter values for a given set of compliance
indicators related to the data processing. The indicators (derived from https://ec.europa.eu/newsroom/document.cfm?doc_id=44137) are the following: transfer of data (e.g., personal data are kept in the EU or can reach a non-EU country; in future work, we plan to distinguish non-EU countries that provide adequate safeguards from those that do not, as suggested in https://ec.europa.eu/info/law/law-topic/data-protection/data-transfers-outside-eu/adequacy-protection-personal-data-non-eu-countries_en), necessity and proportionality of processing
(e.g., purpose is specific, explicit, and legitimate; the
data retention period is limited or not; there is a con-
tract with data processors), the privacy information
provided to data subjects is appropriate, and organiza-
tional measures for data protection have been adopted
(e.g., personnel managing personal data has been ad-
equately trained). Concretely, the tool shows a list
of sentences about the indicators and the user is in-
vited to evaluate—on a scale from 1 (full) to 5 (very
little)—the level of compliance of the data processing
under consideration for each statement. In the case
of the complementary insurance scenario, a statement
‘employees managing personal data have been trained
on the GDPR’ can be rated 1; this means that the per-
sonnel has undergone a serious program of training
and their level of awareness has been verified.
Once the user has entered the values for all indi-
cators, the tool computes the average and uses it to
weight the value of the impact entered by the user in
the first phase above. To guarantee uniformity in im-
pact evaluation across an organization, the tool checks
whether the values of the compliance indicators are
in line with those of similar data processing activi-
ties (we say that two processing activities are similar
when they deal with the same sets of data classes and
data subjects). In case of lack of uniformity, the tool
signals the anomaly and asks users to consider further
scrutiny to align the value; the final decision is indeed
left to the user.
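A minimal sketch of the two-phase computation and of the resulting risk level (recall the assumption that risk is the product of likelihood and impact); the exact weighting function is not fixed above, so the one below, which scales the subjective impact by the average compliance indicator, is only an illustrative choice of ours:

from statistics import mean
from typing import Dict

def impact(subjective: int, compliance: Dict[str, int]) -> float:
    # subjective: 1 (lowest severity) to 5 (highest severity).
    # compliance indicators: 1 (full compliance) to 5 (very little).
    assert 1 <= subjective <= 5
    weight = mean(compliance.values()) / 5.0  # full compliance lowers impact
    return subjective * weight                # illustrative weighting only

def risk(likelihood: int, impact_value: float) -> float:
    return likelihood * impact_value          # risk = likelihood x impact

# Complementary insurance scenario: subjective impact 4 (medium-high),
# with good (hypothetical) compliance on all four indicator groups.
indicators = {"transfer of data": 2, "necessity and proportionality": 1,
              "privacy information": 2, "organizational measures": 1}
print(risk(likelihood=3, impact_value=impact(4, indicators)))

With these hypothetical values the weighted impact is 1.2 and the risk level 3.6; the tool would additionally compare the indicator values with those of similar processing activities and signal anomalies.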
3 RELATED WORK
There are several lines of work relevant to DPIA
such as Privacy Enhancing Technologies (see e.g.
(Bennett and Raab, 2017)), the Privacy-by-Design
approach (see e.g. (Blix et al., 2017)), Privacy
Impact Assessment (van Puijenbroek and Hoepman,
2017, Vemou and Karyda, 2018), and Privacy Risk
Assessment (Oetzel and Spiekermann, 2014). For
the sake of brevity, we focus on those that are more
closely related to our approach and classify these
works in two broad categories: legal (from national
and European privacy authorities) and academic (in
the scientific literature about Data Protection Impact As-
sessments).
Legal Works. Art. 35 in the GDPR introduces the
notion of DPIA, specifies under which conditions the
data controller is forced to perform such an assess-
ment, lists a set of requirements that the DPIA must
satisfy, and identifies some classes of data processing
activities for which the DPIA is necessary (e.g., using
new technology, automated processing, large scale of
special category of data). The text of art. 35 does not
provide detailed guidelines to perform the DPIA and
this has stimulated contributions from official bodies
and authorities.
For instance, the Article 29 Working Party (WP)
proposes an iterative process for carrying out a DPIA
consisting of the following seven steps organized in a
cycle: (1) description of the envisaged processing, (2)
assessment of the necessity and proportionality, (3)
data protection measures already envisaged, (4) as-
sessment of the risks to the rights and freedoms of data
subjects, (5) data protection measures envisaged to
address the risks, (6) documentation, (7) monitoring
and reviewing. Our methodology covers these steps,
as follows: Processing Analysis corresponds to steps
(1) and (2); Risk Analysis to steps (3), (4), and (5);
Run-time Analysis to step (7); and the three docu-
ments generated by our tool cover step (6). Along
similar lines, national privacy authorities—such as
CNiL and ICO—have put forward guidelines to con-
duct a DPIA (CNil, 2018, CNil, 2015, ICO, 2018).
While many such proposals do not provide tool support, the one by CNiL introduces a tool that requires filling in a detailed form, containing around 350 questions, and guides users through the DPIA. The tool returns a visual representation of the collected data that includes the considerations of the Data Protection Officer (DPO) and the Data Subjects (or their representatives). By using the output of the tool, users can decide which data protection mechanisms are needed to reduce risks to an acceptable level, with justifications to the DPO and the Data Subjects. While our work shares the same goal of providing a tool to support the implementation of a DPIA, the main difference is that the tool proposed by CNiL resembles a check-list and does not provide effective assistance as our tool does. For instance, our tool suggests the data subject categories and data classes relevant to the economic sector to which the organization belongs (see Section 2.1 and Table 1); this assists the user in complying with the necessity and proportionality requirements (art. 35.7.b). The tool by CNiL, instead, asks the user to identify such information without any further assessment.
Similar observations hold concerning impact evaluation (see Section 2.2.2), which, as we have already discussed, is a difficult activity because of the need
to bridge the large gap between data breaches (tech-
nological level) and the rights and freedoms of data
subjects (legal and social level). Our tool provides a
guided approach to perform such an evaluation with
the goal of controlling subjectivity and guaranteeing
uniformity across different data processing activities;
both goals are difficult to achieve by using a form-
based tool such as the one proposed by CNiL.
A distinctive feature of our approach is the em-
phasis on access control policies and the verification
of their compliance against the legal provisions of
the GDPR (see Section 2.2.1). As already observed,
checking compliance at the logical level of the poli-
cies, i.e. disregarding vulnerabilities in their enforce-
ment, is already important to implement a privacy-
by-design approach and avoid compliance violations
that may dramatically increase the risk level of data
processing activities. To further support the impor-
tance of checking policy compliance already at de-
sign time, it is important to observe that articles 5.1.f
and 32.2 of the GDPR—together with recitals 39 and
49—state that personal data should be processed in a
manner that ensures an appropriate level of protection
against destruction and unauthorized access; indeed,
access control is one of the key security mechanisms
to guarantee this. To the best of our knowledge, no
other work (including those from academia, see
below) considers this aspect of risk analysis.
Academic Works. Only recently, scientific papers
have been published on DPIA. For instance, the au-
thors of (Coles et al., 2018) use UML class diagrams
to specify crucial requirements underlying various as-
pects of a DPIA such as consent and necessity. The
focus of this work is to integrate security and privacy
requirements engineering processes into a DPIA and
understand how a previously developed tool for risk
analysis of UML diagrams can be effectively used in
this context. They argue in which steps of a DPIA re-
quirements engineering processes may be helpful and
supported by a tool. However, they do not consider
the risk to rights and freedoms of data subjects which
is the focus of the GDPR.
The authors in (Alnemr et al., 2015) offer a tool-
supported DPIA for cloud deployments. Their ap-
proach is based on two questionnaires: the former
aims to assess whether the DPIA is necessary, while the latter establishes how the interactions between
the subjects that perform the DPIA and the Cloud
Service Providers affect data subjects’ rights to data
protection. The questions have some pre-defined an-
swers; they are weighted according to the impact they
have on privacy; the weights are used to calculate an
impact score. Each question is also associated to mul-
tiple privacy indicators to capture different privacy as-
pects (such as sensitivity, compliance and data con-
trol) that can enhance or be detrimental to privacy;
a global privacy indicator is then calculated based on
the indicators. While we share some similarities in the
risk analysis phase, our tool is agnostic with respect
to the technology used to implement the data process-
ing activities. Consequently, the Risk Analysis step
of our methodology is parametric with respect to the
particular technique used for risk evaluation.
4 CONCLUSIONS AND FUTURE
WORK
We have presented the first two steps of our tool-
assisted methodology that combines automated policy
analysis techniques with a flexible risk-based evalua-
tion to implement a substantial part of a DPIA. Some
ideas have been developed and implemented while contributing to the definition of a DPIA methodology for the public administration of the province of Trento in Italy, which comprises more than 2,000 processing activities (of which 650 handle special categories of personal data) distributed across almost 100 organizational units. To permit effective use of the methodology and the tool, training sessions were crucial.
In future work, we plan to investigate how to com-
bine data analytic techniques with selected monitor-
ing tools—such as those for Inventory Management
(IM) and Security Information and Event Manage-
ment (SIEM)—to map assets (first step) and like-
lihood indicators (second step), and to implement the run-time analysis (third step) of our methodology (cf. Figure 1), thereby making it continuous.
REFERENCES
Alnemr, R., Cayirci, E., Dalla Corte, L., Garaga, A.,
Leenes, R., Mhungu, R., Pearson, S., Reed, C.,
de Oliveira, A. S., Stefanatou, D., et al. (2015). A data
protection impact assessment methodology for cloud.
In Annual Privacy Forum, pages 60–92. Springer.
Bennett, C. J. and Raab, C. D. (2017). The governance
of privacy: Policy instruments in global perspective.
Routledge.
Blix, F., Elshekeil, S. A., and Laoyookhong, S. (2017). Data
protection by design in systems development: From
legal requirements to technical solutions. In 12th In-
ternational Conference for Internet Technology and
Secured Transactions (ICITST), pages 98–103. IEEE.
CNil (2015). How to carry out a PIA. https://www.cnil.fr/sites/default/files/typo/document/CNIL-PIA-1-Methodology.pdf.
CNil (2018). Privacy risk assessment (PIA). https://www.cnil.fr/sites/default/files/atoms/files/cnil-pia-1-en-methodology.pdf.
Coles, J., Faily, S., and Ki-Aries, D. (2018). Tool-supporting data protection impact assessments with CAIRIS. In IEEE 5th International Workshop on Evolving Security & Privacy Requirements Engineering (ESPRE), pages 21–27. IEEE.
De Capitani di Vimercati, S., Foresti, S., Jajodia, S., and
Samarati, P. (2007). Access control policies and lan-
guages. JCSE, 3(2):94–102.
De Masellis, R., Ghidini, C., and Ranise, S. (2015). A
declarative framework for specifying and enforcing
purpose-aware policies. In STM, volume 9331 of
Lecture Notes in Computer Science, pages 55–71.
Springer.
Guarda, P., Ranise, S., and Siswantoro, H. (2017). Security
analysis and legal compliance checking for the design
of privacy-friendly information systems. In SACMAT,
pages 247–254. ACM.
Hu, V. C., Ferraiolo, D., Kuhn, R., Friedman, A. R., Lang, A. J., Cogdell, M. M., Schnitzer, A., Sandlin, K., Miller, R., Scarfone, K., et al. (2013). Guide to attribute based access control (ABAC) definition and considerations (draft). NIST Special Publication, 800(162).
ICO (2018). Data protection impact assessments (DPIAs). https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/data-protection-impact-assessments-dpias/.
Jin, X., Krishnan, R., and Sandhu, R. (2012). A Uni-
fied Attribute-Based Access Control Model Covering
DAC, MAC and RBAC. In DBSec, number 7371 in
LNCS, pages 41–55.
Oetzel, M. C. and Spiekermann, S. (2014). A systematic
methodology for privacy impact assessments: a de-
sign science approach. European Journal of Informa-
tion Systems, 23(2):126–150.
Ranise, S. and Siswantoro, H. (2017). Automated legal
compliance checking by security policy analysis. In
SAFECOMP Workshops, volume 10489 of Lecture
Notes in Computer Science, pages 361–372. Springer.
Turkmen, F., den Hartog, J., Ranise, S., and Zannone, N.
(2017). Formal analysis of XACML policies using
SMT. Computers & Security, 66:185–203.
van Puijenbroek, J. and Hoepman, J.-H. (2017). Privacy im-
pact assessments in practice: Outcome of a descriptive
field research in the netherlands. International Work-
shop on Privacy Engineering.
Vemou, K. and Karyda, M. (2018). An evaluation frame-
work for privacy impact assessment methods. MCIS
Proceedings. 5.