clear which open data model can be used to reduce 
the risk of open data privacy violations. An open data 
model is needed that helps in making decisions on 
opening data and that provides insight into whether the 
data may violate users’ privacy.  
The objective of this paper is to propose a model 
to analyse privacy violation risks of publishing open 
data. To do so, a new set of what are called open data 
attributes is proposed. Open data attributes reflect 
privacy risks versus benefits trade-offs associated 
with the expected use scenarios of the data to be 
opened. Further, a decision engine evaluates these 
attributes to produce a privacy risk indicator (PRI) and 
a privacy risk mitigation measure (PRMM). In 
particular, this can help to determine whether to open 
data or keep it closed.  
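The idea of mapping attribute values to a PRI and a PRMM can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration only: the attribute names, the weights, the weighted-sum rule, the thresholds and the wording of the mitigation measures are hypothetical and are not taken from the model proposed in this paper.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical attribute names and weights (assumptions for illustration).
# Each attribute is scored in [0, 1], where higher means more privacy risk.
WEIGHTS = {
    "data_criticality": 0.5,
    "scale_of_use": 0.3,
    "user_trust_deficit": 0.2,
}


@dataclass(frozen=True)
class Decision:
    pri: float  # privacy risk indicator in [0, 1]
    prmm: str   # suggested privacy risk mitigation measure


def evaluate(attributes: Mapping[str, float]) -> Decision:
    """Map attribute scores to a PRI and a PRMM using an illustrative rule."""
    for name, score in attributes.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown attribute: {name}")
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name} must lie in [0, 1]")

    # Weighted sum of the scores; attributes that are not given count as 0.
    pri = sum(w * attributes.get(name, 0.0) for name, w in WEIGHTS.items())

    # Hypothetical thresholds that select a mitigation measure.
    if pri < 0.3:
        prmm = "open as is"
    elif pri < 0.7:
        prmm = "open after further anonymization or aggregation"
    else:
        prmm = "keep closed or restrict access"
    return Decision(pri=round(pri, 3), prmm=prmm)
```

For instance, calling `evaluate` with all three hypothetical attributes at their maximum score places the dataset in the highest risk band, which in this sketch corresponds to keeping the data closed or restricting access.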
This paper is organized as follows. Section 2 
discusses related work, while Section 3 presents 
privacy violation risks associated with open data, 
followed by Section 4, which introduces the proposed 
model. The model helps identify the risks and 
highlights possible alternatives to reduce these risks. 
Section 5 exemplifies the model by providing some 
use cases and preliminary results. Section 6 discusses 
the key findings and concludes the paper. 
2  RELATED WORK 
Public bodies are considered the biggest creators of 
data in society; this data is known as public data. 
Public data may range from data on procurement 
opportunities, weather, traffic, tourism, energy 
consumption, and crime statistics to data about policies 
and businesses (Janssen and van den Hoven 2015). 
Data can be classified into different levels of 
confidentiality, including confidential, restricted, 
internal use and public (ISO27001 2013). We 
consider public data that has no relation with data 
about citizens as outside the scope of this work. 
Anonymized data about citizens can be shared to 
understand societal problems, such as crime or 
diseases. An example is the sharing of patient data to 
initiate collaboration among health providers, which is 
expected to be beneficial to patients and researchers. 
The main expected benefit of this data sharing is an 
improved understanding of specific diseases, which in 
turn allows for better treatments. It can also help 
practitioners to become more efficient. For example, 
a general practitioner can quickly diagnose and 
prescribe medicines. Nevertheless, this sharing of 
patients’ information should be done according to 
data protection policies and privacy regulations.  
A variety of Data Protection Directives have been 
created and implemented. Based on the Data 
Protection Directive of 1995 (European Parliament 
and the Council of the European Union 1995), a 
comprehensive reform of data protection rules in the 
European Union was proposed by the European 
Commission (2012). The Organization for 
Economic Co-operation and Development has also 
developed Privacy Principles (OECD, 2008), 
including principles such as “There should be limits 
to the collection of personal data” and “Personal data 
should not be disclosed, made available or otherwise 
used for purposes other than those specified in 
accordance with Paragraph 9 except: a) with the 
consent of the data subject; or b) by the authority of 
law.” In addition, the ISO/IEC 29100 standard has 
defined 11 privacy principles (ISO/IEC-29100 2011). 
Privacy-by-design, a relatively new approach to privacy 
protection, has received the attention of many 
organizations, such as the European Network and 
Information Security Agency (ENISA). 
Privacy-by-Design suggests integrating privacy 
requirements into the design specifications of 
systems, business practices, and physical 
infrastructures (Hustinx 2010). In the ideal situation 
data is collected in such a way that privacy cannot be 
violated.  
The Data Protection Directives are often defined 
at a high level of abstraction, and provide limited 
guidelines for translating the directives into practice. 
Despite the developed Data Protection Directives and 
other data protection policies, organizations still risk 
privacy violations when publishing open data. In the 
following sections we elaborate on the main risks of 
privacy violation associated with open data.  
A number of information security standards have been 
established to achieve effective information security 
governance, among which are ISO (2013), COBIT5 
and NIST (2016). Most work on privacy risk 
assessment uses surveys or questionnaires 
that assess how companies deal with personal 
data in light of regulatory frameworks and moral or 
ethical values. When it comes to open data, such 
frameworks cannot be used to assess privacy risks, 
since the law requires that the data to be published 
contain no identifying information. Conventional 
ways of assessing privacy risks therefore cannot be 
applied, and new ways are needed that weigh the 
benefits of sharing the data against the expected 
privacy risks of leaking personally identifiable 
information. 
Opening More Data - A New Privacy Risk Scoring Model for Open Data