DYNAMIC DISTRIBUTED STORAGE ARCHITECTURE ON SMART GRIDS

Joan Navarro¹, José Enrique Armendáriz-Iñigo² and August Climent¹

¹Distributed Systems Research Group, La Salle, Ramon Llull University, 08022 Barcelona, Spain
²Dpto. Ing. Matemática e Informática, Universidad Pública de Navarra, 31006 Pamplona, Spain
Keywords: Dynamic systems, Eventual consistency, Smart grids, Replication.
Abstract: Most power network services, such as voltage control, asset management, or flow monitoring, are implemented within a centralized paradigm. Novel distributed power generation techniques, such as wind farms or local solar panels, are decentralizing this generation paradigm and forcing companies to change their traditional centralized infrastructure. This implies that intelligence and data sources are now spread over the whole network. Smart grids may enable companies to manage such a change, although no standard architecture to deploy them on a power network exists yet. This challenges researchers to design and implement a new distributed storage system able to offer different levels of consistency and replication depending on the physical location of the smart sensor and according to the network needs. This paper reviews the requirements of smart grids and presents a new dynamic storage architecture in the spirit of cloud computing. This architecture is based on a variant of the primary copy scheme; it is suitable to store all needed data, enables smart grids to compute the required functions in a distributed way, and offers high scalability together with a consistency level similar to the one required by wireless sensor networks.
1 INTRODUCTION
Power networks are required to be highly reliable and available because they have to supply all the infrastructures of a country at any time and anywhere. This prevents power companies from updating and improving their systems, because most changes may seriously affect the critical services they are currently providing, since novel devices might not be as well tested as older ones. This leads to schemes that, due to their centralized nature, are inefficient, expensive, and hard to maintain and scale.
With the growth of renewable energies, the centralized model of the power network not only fails to scale but also cannot work properly, since the aforementioned renewable energy sources behave differently than traditional sources. Moreover, current power networks are not able to remotely monitor power consumption on the low voltage (LV) network, which prevents companies from building new business strategies fitted to the end user needs (Brown, 2008). This situation calls for a substantial change, which consists of decentralizing the power network and building a distributed system able to fulfill current society requirements and technologies.
Recently, this new paradigm has also been referred
to as smart grid (intelligent grid). The goal of a smart
grid is to take advantage of the current digital tech-
nologies and build up an intelligent information sys-
tem over all devices within the power network: from
suppliers to consumers. This might allow companies
to tune the power distribution and route energy where
and when it is needed.
The purpose of this paper is to focus on the computer engineering field and propose a distributed architecture able to efficiently store, and ease the computation of, any data generated by the power network. This distributed storage architecture must be slightly different from the ones used in web services (Paz et al., 2010) or in cloud computing based storage (White, 2009; Palankar et al., 2008), since smart grids demand a set of requirements that have not been explored yet.
2 STORAGE REQUIREMENTS
Smart grids, as opposed to classical power networks, have become data driven applications, since they own a management layer which makes decisions based on
the current status of the network. This forces a redefinition of the whole power network architecture and its specifications, as now there is a need for storing and processing data besides supplying power. Next we review such needs and state the basics of our proposal.
Smart grids demand a trade-off between static (Patiño-Martínez et al., 2005) and dynamic (Aguilera et al., 2009; Das et al., 2010) systems, because they behave as a dynamic system but may have strong consistency (Vogels, 2009) requirements that typical cloud based techniques are currently unable to offer. Hence, our proposal is to build a hybrid system which takes advantage of both distributed system schemes, static and dynamic. Furthermore, there are several applications (also referred to as smart functions) that run over the smart grid, such as power flow monitoring, under/over voltage monitoring, load shedding, or fault analysis. Each application has its own particular requirements, so the proposed architecture must be flexible enough to support such a variety of functions. Thus, the distributed storage architecture must provide the following:
Reliability. It must be fully tested, since major changes on it may imply an eventual denial of service.
Availability. It has to ensure that data are always available, regardless of their consistency level.
Fault Tolerance and Recovery. It has to be able to reconfigure its internal characteristics in order to keep supplying and storing data in case of failure.
Dynamic Consistency. Smart functions require different levels of consistency (see the sketch after this list). For example, on one hand, data needed to perform a load shedding require strong consistency (Vogels, 2009), since this function performs critical operations with the current values of the network. On the other hand, data needed to perform power monitoring might require a weaker consistency level, since this function tolerates some delay.
Minimum Message Exchange. It is important to keep a low network overhead in order to guarantee that there are no bottlenecks and data flow over the network in an efficient way.
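To make the dynamic consistency requirement concrete, the following minimal Python sketch (our own illustration; the function names and the three levels are assumptions, not part of any standard) shows how a storage client could tag each smart function with the consistency level it demands:

    from enum import IntEnum

    class Consistency(IntEnum):
        # Higher values denote stronger guarantees.
        EVENTUAL = 0   # replicas may lag; fine for delay-tolerant functions
        K_WEAK = 1     # staleness bounded along the replication chain (Section 3.3)
        STRONG = 2     # all replicas of the cluster agree before a reply is sent

    # Hypothetical smart functions mapped to the consistency each one demands.
    SMART_FUNCTION_CONSISTENCY = {
        "load_shedding": Consistency.STRONG,            # acts on live network values
        "fault_analysis": Consistency.STRONG,
        "under_over_voltage": Consistency.K_WEAK,
        "power_flow_monitoring": Consistency.EVENTUAL,  # tolerates some delay
    }

    def required_consistency(function_name: str) -> Consistency:
        # Default to the strongest level when a function is unknown.
        return SMART_FUNCTION_CONSISTENCY.get(function_name, Consistency.STRONG)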
To fulfill these requirements, we propose a distributed storage architecture built on top of the power network, able to cope with the dynamic behavior of smart grids (e.g., a solar panel may stop supplying energy).
3 SYSTEM ARCHITECTURE
As depicted in Figure 1, a smart grid is seen as a set of clusters linked by a telecommunications network. A cluster is composed of up to ten devices placed in the same geographical area.
Figure 1: Proposed distributed storage system. (The figure shows clusters of devices P_11..P_13, P_21..P_23 (with PP_23), P_31..P_33 (with PP_33), and P_41..P_43 (with PP_43), plus single-device regions S_51, S_61, S_71, and S_81, linked in a replication chain along which computation is performed; consistency degrades from strongest in the source cluster, through k, k-1, and k-2 consistency, down to weakest.)
Each device has limited storage and computing capabilities and might not be able to solve the whole set of required smart functions on its own. Smart meters are attached to these devices and report their measurements from the smart grid to them. Each device in the cluster is labeled as X_ij, where X corresponds to the device role in the cluster (Primary, Pseudo-Primary, or Secondary), i is the cluster identifier, and j is the device identifier. In the same way, we define the ancestor of a cluster m as the node X_ij (belonging to a cluster i, with m ≠ i) which is updating an arbitrary pseudo-primary k of this cluster (PP_mk). Figure 1 shows an example where region 2 is formed by devices P_2k, k = 1..3, where P_21 is the primary of this region; P_22 is a common device; P_23 is the pseudo-primary (that is why it is named PP_23); and its ancestor is P_11. Likewise, S_61 is the only device in region 6 and its ancestor is PP_33.
Regarding data consistency, we define the replication depth r as the number of different clusters that data are allowed to cross while being replicated. This value might be dynamically tuned according to the computation latency or the system performance.
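As a concrete illustration of this naming scheme and of the replication depth, the following Python sketch (our own model; the class and field names are assumptions) captures the role of a device X_ij and the per-item replication depth r:

    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional

    class Role(Enum):
        PRIMARY = "P"           # primary master of its cluster
        PSEUDO_PRIMARY = "PP"   # repeater updated by an ancestor in another cluster
        SECONDARY = "S"         # device of a single-device region

    @dataclass
    class Device:
        role: Role
        cluster_id: int                      # i in the X_ij notation
        device_id: int                       # j in the X_ij notation
        ancestor: Optional["Device"] = None  # set only for pseudo-primaries

        @property
        def label(self) -> str:
            # E.g., Role.PSEUDO_PRIMARY with i = 2, j = 3 yields "PP23".
            return f"{self.role.value}{self.cluster_id}{self.device_id}"

    # The example of Figure 1: PP23 is updated by its ancestor P11.
    p11 = Device(Role.PRIMARY, 1, 1)
    pp23 = Device(Role.PSEUDO_PRIMARY, 2, 3, ancestor=p11)
    assert pp23.label == "PP23" and pp23.ancestor.label == "P11"

    # Replication depth r, tunable per data item at run time.
    replication_depth = {"data_11": 3}   # this item may cross three clusters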
Next, we describe the proposed architecture and
explain how it solves the replication, consistency, and
fault tolerance issues.
3.1 Architecture Overview
Although the number of smart sensors may substantially increase as time goes by, the number of devices that control them should not grow in the same way. The proposed architecture therefore focuses on the devices instead of the smart meters, which hides the dynamism of the latter and avoids the scalability issues it would otherwise cause.
Definitions:
1. i ≜ current cluster ID
2. j ≜ current device ID
3. d ≜ smart meter ID
4. c ≜ required consistency level
5. r ≜ replication depth

I. Upon Smart_meter_ij(d) generates data_ij(d)
   1. broadcast(cluster_i, j, data_ij(d), d)

II. Broadcast delivery (k, data_kl(d), d)
   1. store_data(data_kl(d), k)
   2. if l = i then
        r := GetRD(data_kl(d), d)
        if r > 0 then
           list := <i, j>
           multicast(neighbors_ij, list, data_kl(d), r - 1)

III. Multicast delivery (list, data_kl(d), r)
   1. store_data(data_kl(d), last_item(list))
   2. if r > 0 then
        destination := neighbors_ij \ list
        list := list ∪ <i, j>
        multicast(destination, list, data_kl(d), r - 1)

IV. Data request (data_kl(d), c) from source
   1. if ∄ data_kl(d) then
        unicast(source, nil, -1)
   2. else if c > GetConsistency(data_kl(d)) then
        unicast(ancestor(data_kl(d)), data_kl(d), c)
   3. else
        unicast(source, data_kl(d), c)

Figure 2: Replication protocol at smart device_ij.
Any device belonging to a cluster may simultaneously adopt different roles according to the current situation: (1) primary master, (2) primary slave, or (3) pseudo-primary. When a device is propagating data from its directly attached smart meters, it acts as a primary master and treats the rest of the devices in its cluster as its primary slaves. When a device receives data from another cluster, it acts as a repeater (pseudo-primary). The blue lines in Figure 1 illustrate the particular case of P_11 broadcasting data.
3.2 Replication
Replication provides availability and fault tolerance. However, it increases the number of messages, since all replicas have to be synchronized, which reduces the system throughput. Regarding the time when updates get propagated to the replicas, there exist two major strategies: (1) eager replication (Bernstein et al., 1987) provides strong consistency but poor scalability, and (2) lazy replication (Wiesmann and Schiper, 2005) provides higher scalability but has more difficulties maintaining consistency, i.e., replicas may diverge. Regarding the number of sites that update data, there exist two major replication strategies: (1) active replication (Amir and Tutu, 2002) provides strong consistency, since all replicas are synchronized, but has low scalability, and (2) passive replication (Pedone et al., 2000) provides higher scalability but has trouble maintaining strong consistency, since replicas might be unsynchronized.
As shown in Figure 2, our proposal is a hybrid solution that performs (1) active and eager replication in the primary-master's cluster, and (2) passive and lazy replication in the other clusters. This improves the scalability of classical architectures (Jiménez-Peris et al., 2002) and defines different consistency regions.
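To make the hybrid scheme concrete, the following sketch (our own; the apply and forward helpers are hypothetical) contrasts the two write paths: a write is acknowledged only after every device of the master's cluster has stored it, while propagation to other clusters is deferred:

    import threading

    def write(item, master_cluster, pseudo_primaries, r):
        # (1) Active + eager: every device of the primary-master's cluster
        # applies the update synchronously before the write is acknowledged.
        for device in master_cluster:
            device.apply(item)               # hypothetical helper; blocks until stored

        # (2) Passive + lazy: pseudo-primaries are updated in the background,
        # so replicas in other clusters may lag behind (k-weak consistency).
        def propagate():
            for pp in pseudo_primaries:
                pp.forward(item, r - 1)      # hypothetical helper; decrements r
        threading.Thread(target=propagate, daemon=True).start()
        return "ack"                         # remote clusters converge later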
3.3 Consistency
Research on consistency protocols has been conducted for many years, and several approaches have been proposed by the community. There are two major alternatives when defining the consistency properties of a system: (1) strong consistency and (2) weak consistency. Regarding our proposal, we take advantage of both strong and weak consistency strategies and propose a hybrid solution inspired by cloud-based storage and data stream warehouses (DeCandia et al., 2007; Golab and Johnson, 2011).
In the master's cluster we implement strong consistency between all replicas. This improves fault tolerance, since another device of the cluster can easily take over after a primary-master's fault; moreover, it avoids the typical single point of failure problem. Once data are strongly consistent in the master's cluster, devices start propagating them, together with a time stamp k, to their pseudo-primaries. Thereby, we effectively implement an eventually consistent system between the pseudo-primaries. To sum up, from the consistency point of view, we have shown how our hybrid architecture uses both strong and weak (actually k-weak) consistency techniques; a minimal staleness check in this spirit is sketched below.
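The following sketch (our own interpretation; the hop-based rule is an assumption suggested by the k-1/k-2 labels of Figure 1, not a formula from the paper) shows how a device could decide whether its copy satisfies a requested consistency level:

    def can_serve_locally(k: int, hops: int, c: int) -> bool:
        # Data stored `hops` cluster hops away from the master cluster carry a
        # time stamp k and offer roughly "k - hops" consistency; a read that
        # demands level c is served locally only if k - hops >= c. Otherwise
        # it escalates to the ancestor, mirroring step IV of Figure 2.
        return (k - hops) >= c

    assert can_serve_locally(k=3, hops=0, c=3)       # master cluster: strongest level
    assert not can_serve_locally(k=3, hops=2, c=2)   # two hops away: only k-2 left

Next we describe how our architecture deals with fault tolerance.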
3.4 Fault Tolerance
Fault tolerance is the ability of the system to recover from a spontaneous site fault. Distributed systems are prone to different types of failures (Cristian, 1991). Since smart grids are heavily dependent on the communication network, we assume that this channel is reliable enough and focus our efforts on the distributed storage architecture. We also assume that any site may fail according to the crash model.
Regarding our proposal, there may exist two different failure cases: (1) the failure of a primary, and (2) the failure of a pseudo-primary. In the former, we inherit the advantages of active replication techniques and can easily recover, since any other primary-slave can immediately take over. In the latter, as soon as the ancestor of the failed node, which belongs to another cluster, detects its unresponsiveness, it selects a new pseudo-primary.
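The following sketch (our own; the heartbeat timeout, the attribute names, and the promotion rule are assumptions, not part of the paper's protocol) illustrates the second case, in which an ancestor replaces an unresponsive pseudo-primary:

    import time

    HEARTBEAT_TIMEOUT = 10.0   # seconds without acks before declaring a crash (assumed)

    def monitor_pseudo_primary(ancestor, cluster):
        # Runs at the ancestor: if the pseudo-primary it updates stops
        # answering, promote any live device of that cluster and resync it.
        pp = cluster.pseudo_primary
        if time.time() - pp.last_ack > HEARTBEAT_TIMEOUT:
            candidates = [d for d in cluster.devices if d.is_alive and d is not pp]
            if candidates:
                cluster.pseudo_primary = candidates[0]    # new PP joins the chain
                ancestor.resync(cluster.pseudo_primary)   # bring it up to date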
4 DISCUSSION & CONCLUSIONS
As shown in the previous sections, our proposal benefits from many techniques used in distributed systems. However, these techniques have never been put together nor tested jointly, so they need to be discussed:
Master Cluster Reduction. Some members of the master region might be excluded from the active replication. Active replication does not scale well (Wiesmann and Schiper, 2005), and with a proper selection of representatives we could speed up this process.
Enhance the Takeover Process. A pseudo-primary could take part in active replication within its cluster. This role is not an exclusive one: a pseudo-primary can also be responsible for several smart meters and thus collaborate in the active replication protocol. Recall that there are not many nodes in a given cluster.
Failure Detection. Active replication within a pseudo-primary's cluster may (1) enhance the failed-node detection process, and (2) speed up the synchronization of the new device in the replication chain.
Distributed Computing. Our proposed architecture allows us to perform distributed computation on the read steps. Since the required data travel across the replication chain, each node might be able to perform a piece of the required computation; a sketch of this idea follows this list.
Dynamic Replication Depth Tuning. If we were able to dynamically adjust this value, our system might adapt better to its requirements. Hence, we could use a cognitive system and apply machine learning techniques (Mitchell, 1997) in order to (1) evaluate the whole system status and (2) predict the optimal value of the replication depth for each data item.
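As an illustration of the distributed computing item above (our own sketch; local_values is a hypothetical accessor and max an arbitrary example aggregate), each node on the read path folds its local share of the data into a partial result instead of shipping raw data to the caller:

    from functools import reduce

    def read_along_chain(chain, key, combine=max):
        # Walk the replication chain and let every node contribute its local
        # share of the computation (e.g., the peak voltage seen for `key`).
        partial = None
        for node in chain:                   # e.g., [S61, PP33, ..., master]
            local = node.local_values(key)   # hypothetical accessor
            if local:
                folded = reduce(combine, local)
                partial = folded if partial is None else combine(partial, folded)
        return partial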
In this paper we have defined a way to distribute and store information across the network so that the computation needed for smart functions can be greatly reduced. This work aims to provide some insight into the world of smart grids from a data perspective. For the sake of simplicity during the presentation of our system, we have outlined only simple scenarios for the replication policy and fault-tolerance issues; these need to be treated in detail in future work.
ACKNOWLEDGEMENTS
The research leading to these results has received funding from the European Union and European Atomic Energy Community Seventh Framework Programmes (FP7/2007-2013 and FP7/2007-2011) under grant agreement no. 247938 for Joan Navarro and August Climent, and from the Spanish National Science Foundation (MEC) (grant TIN2009-14460-C03-02) for José Enrique Armendáriz-Iñigo.
REFERENCES
Aguilera, M. K. et al. (2009). Sinfonia: A new paradigm for building scalable distributed systems. ACM Trans. Comput. Syst., 27(3).
Amir, Y. and Tutu, C. (2002). From total order to database
replication. In ICDCS, pages 494–.
Bernstein, P. A. et al. (1987). Concurrency Control and Recovery in Database Systems. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
Brown, R. E. (2008). Impact of Smart Grid on distribution
system design. In Power and Energy Society General
Meeting - Conversion and Delivery of Electrical En-
ergy in the 21st Century, 2008 IEEE, pages 1–4.
Cristian, F. (1991). Understanding fault-tolerant distributed
systems. Commun. ACM, 34(2):56–78.
Das, S., Agrawal, D., and Abbadi, A. E. (2010). Elastras:
An elastic transactional data store in the cloud. CoRR,
abs/1008.3751.
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., and Vogels, W. (2007). Dynamo: Amazon's highly available key-value store. In SOSP, pages 205–220.
Golab, L. and Johnson, T. (2011). Consistency in a Stream
Warehouse. In CIDR.
Jiménez-Peris, R., Patiño-Martínez, M., Kemme, B., and Alonso, G. (2002). Improving the scalability of fault-tolerant database clusters. In ICDCS, pages 477–484.
Mitchell, M. (1997). An Introduction to Genetic Algo-
rithms. The MIT Press, Cambridge, Massachusetts.
Palankar, M. R. et al. (2008). Amazon S3 for science grids: a viable solution? In DADC '08: Proceedings of the 2008 International Workshop on Data-Aware Distributed Computing, pages 55–64, New York, NY, USA. ACM.
Patiño-Martínez, M. et al. (2005). MIDDLE-R: consistent database replication at the middleware level. ACM Trans. Comput. Syst., 23(4):375–423.
Paz, A., Perez-Sorrosal, F., Patiño-Martínez, M., and Jiménez-Peris, R. (2010). Scalability evaluation of the replication support of JOnAS, an industrial J2EE application server. In EDCC, pages 55–60.
Pedone, F., Wiesmann, M., Schiper, A., Kemme, B., and
Alonso, G. (2000). Understanding replication in
databases and distributed systems. In ICDCS.
Vogels, W. (2009). Eventually consistent. Commun. ACM,
52(1):40–44.
White, T. (2009). Hadoop: The Definitive Guide. O'Reilly Media, 1st edition.
Wiesmann, M. and Schiper, A. (2005). Comparison of
database replication techniques based on total order
broadcast. IEEE Trans. Knowl. Data Eng., 17(4):551–
566.