DYNAMIC DISTRIBUTED STORAGE ARCHITECTURE ON SMART GRIDS

Joan Navarro¹, José Enrique Armendáriz-Iñigo² and August Climent¹

¹Distributed Systems Research Group, La Salle, Ramon Llull University, 08022 Barcelona, Spain
²Dpto. Ing. Matemática e Informática, Universidad Pública de Navarra, 31006 Pamplona, Spain
Keywords: Dynamic systems, Eventual consistency, Smart grids, Replication.
Abstract: Most power network services, such as voltage control, asset management, or flow monitoring, are implemented within a centralized paradigm. Novel distributed power generation techniques, such as wind farms or local solar panels, are decentralizing this generation paradigm and forcing companies to change their traditional centralized infrastructure. This implies that intelligence and data sources are now spread over the whole network. Smart grids may enable companies to manage such a change, although no standard architecture to deploy them on a power network exists yet. This challenges researchers to design and implement a new distributed storage system able to offer different levels of consistency and replication depending on the physical location of the smart sensor and according to the network needs. This paper reviews the requirements of smart grids and presents a new dynamic storage architecture in the spirit of cloud computing. This architecture is based on a variant of the primary copy scheme; it is suitable to store all needed data, enables smart grids to compute the required functions in a distributed way, and offers high scalability together with a consistency level similar to the one required by wireless sensor networks.
1 INTRODUCTION
Power networks are required to be highly reliable and available because they have to supply all the infrastructures of a country at any time and anywhere. This prevents power companies from updating and improving their systems, because most changes may seriously affect the critical services they are currently providing, since novel devices might not be as well tested as older ones. This leads to schemes that, due to their centralized nature, are inefficient, expensive, and hard to maintain and scale.
With the growth of renewable energies, the centralized model of the power network not only fails to scale but also cannot work properly, since the aforementioned renewable energy sources behave differently than traditional sources. Moreover, current power networks are not able to remotely monitor power consumption on the low voltage (LV) network, which prevents companies from building new business strategies fitted to the end user needs (Brown, 2008). This situation calls for a substantial change, which consists of decentralizing the power network and building a distributed system able to fulfill current society requirements and technologies.
Recently, this new paradigm has also been referred
to as smart grid (intelligent grid). The goal of a smart
grid is to take advantage of the current digital tech-
nologies and build up an intelligent information sys-
tem over all devices within the power network: from
suppliers to consumers. This might allow companies
to tune the power distribution and route energy where
and when it is needed.
The purpose of this paper is to focus on the computer engineering field and propose a distributed architecture able to efficiently store, and ease the computation of, any data generated by the power network. This distributed storage architecture must be slightly different from the ones used in web services (Paz et al., 2010) or in cloud computing based storage (White, 2009; Palankar et al., 2008), since smart grids demand a set of requirements that have not been explored yet.
2 STORAGE REQUIREMENTS
Smart grids, as opposed to classical power networks, have become data driven applications, since they own a management layer which makes decisions based on
the current status of the network. This forces a redefinition of the whole power network architecture and its specifications, as now there is a need for storing and processing data besides supplying power. Next we review such needs and state the basics of our proposal.
Smart grids demand a trade-off between static (Patiño-Martínez et al., 2005) and dynamic (Aguilera et al., 2009; Das et al., 2010) systems, because they behave as a dynamic system but may have strong consistency (Vogels, 2009) requirements that typical cloud based techniques are currently unable to offer. Hence, our proposal is to build a hybrid system which takes advantage of both distributed system schemes, static and dynamic. Furthermore, there are several applications (also referred to as smart functions) that run over the smart grid, such as power flow monitoring, under/over voltage monitoring, load shedding, or fault analysis. Each application has its own particular requirements, so the proposed architecture must be flexible enough to support such a variety of functions. Thus, the distributed storage architecture must provide the following:
Reliability. It must be fully tested, since major changes on it may imply an eventual denial of service.
Availability. It has to ensure that data are always available, regardless of their consistency level.
Fault Tolerance and Recovery. It has to be able to reconfigure its internal characteristics in order to keep supplying and storing data in case of failure.
Dynamic Consistency. Smart functions require different levels of consistency (see the sketch after this list). For example, on one hand, data needed to perform a load shedding require strong consistency (Vogels, 2009), since this function performs critical operations with the current values of the network. On the other hand, data needed to perform power monitoring might require a weaker consistency level, since this function tolerates some delay.
Minimum Message Exchange. It is important to keep a low network overhead in order to guarantee that there are no bottlenecks and data flow over the network in an efficient way.
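To make the dynamic consistency requirement concrete, the following minimal Python sketch (our own illustration; the function names and the three levels are assumptions, not part of any standard) shows how a storage client could tag each smart function with the consistency level it demands:

    from enum import IntEnum

    class Consistency(IntEnum):
        # Higher values denote stronger guarantees.
        EVENTUAL = 0   # replicas may lag; fine for delay-tolerant functions
        K_WEAK = 1     # staleness bounded along the replication chain (Section 3.3)
        STRONG = 2     # all replicas of the cluster agree before a reply is sent

    # Hypothetical smart functions mapped to the consistency each one demands.
    SMART_FUNCTION_CONSISTENCY = {
        "load_shedding": Consistency.STRONG,            # acts on live network values
        "fault_analysis": Consistency.STRONG,
        "under_over_voltage": Consistency.K_WEAK,
        "power_flow_monitoring": Consistency.EVENTUAL,  # tolerates some delay
    }

    def required_consistency(function_name: str) -> Consistency:
        # Default to the strongest level when a function is unknown.
        return SMART_FUNCTION_CONSISTENCY.get(function_name, Consistency.STRONG)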
To fulfill these requirements, we propose a distributed storage architecture built on top of the power network, able to cope with the dynamic behavior of smart grids (e.g., a solar panel may stop supplying energy).
3 SYSTEM ARCHITECTURE
As depicted in Figure 1, a smart grid is seen as a set of clusters linked by a telecommunications network. A cluster is composed of up to ten devices placed in the same geographical area.
Figure 1: Proposed distributed storage system. (The figure shows clusters of devices P_11..P_13, P_21..P_23 (with PP_23), P_31..P_33 (with PP_33), and P_41..P_43 (with PP_43), plus single-device regions S_51, S_61, S_71, and S_81, linked in a replication chain along which computation is performed; consistency degrades from strongest in the source cluster, through k, k-1, and k-2 consistency, down to weakest.)
Each device has limited storage and computing capabilities and might not be able to solve the whole set of required smart functions on its own. Smart meters are attached to these devices and report their measurements from the smart grid to them. Each device in the cluster is labeled as X_ij, where X corresponds to the device role in the cluster (Primary, Pseudo-Primary, or Secondary), i is the cluster identifier, and j is the device identifier. In the same way, we define the ancestor of a cluster m as the node X_ij (belonging to a cluster i, with m ≠ i) which is updating an arbitrary pseudo-primary k of this cluster (PP_mk). Figure 1 shows an example where region 2 is formed by devices P_2k, k = 1..3, where P_21 is the primary of this region; P_22 is a common device; P_23 is the pseudo-primary (that is why it is named PP_23); and its ancestor is P_11. Likewise, S_61 is the only device in region 6 and its ancestor is PP_33.
Regarding data consistency, we define the replication depth r as the number of different clusters that data are allowed to cross while being replicated. This value might be dynamically tuned according to the computation latency or the system performance.
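As a concrete illustration of this naming scheme and of the replication depth, the following Python sketch (our own model; the class and field names are assumptions) captures the role of a device X_ij and the per-item replication depth r:

    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional

    class Role(Enum):
        PRIMARY = "P"           # primary master of its cluster
        PSEUDO_PRIMARY = "PP"   # repeater updated by an ancestor in another cluster
        SECONDARY = "S"         # device of a single-device region

    @dataclass
    class Device:
        role: Role
        cluster_id: int                      # i in the X_ij notation
        device_id: int                       # j in the X_ij notation
        ancestor: Optional["Device"] = None  # set only for pseudo-primaries

        @property
        def label(self) -> str:
            # E.g., Role.PSEUDO_PRIMARY with i = 2, j = 3 yields "PP23".
            return f"{self.role.value}{self.cluster_id}{self.device_id}"

    # The example of Figure 1: PP23 is updated by its ancestor P11.
    p11 = Device(Role.PRIMARY, 1, 1)
    pp23 = Device(Role.PSEUDO_PRIMARY, 2, 3, ancestor=p11)
    assert pp23.label == "PP23" and pp23.ancestor.label == "P11"

    # Replication depth r, tunable per data item at run time.
    replication_depth = {"data_11": 3}   # this item may cross three clusters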
Next, we describe the proposed architecture and
explain how it solves the replication, consistency, and
fault tolerance issues.
3.1 Architecture Overview
Although the number of smart sensors may substantially increase as time goes by, the number of devices that control them should not grow in the same way. The proposed architecture therefore focuses on the devices instead of the smart meters, which hides the dynamism of the latter and avoids the scalability issues it would otherwise cause.
Definitions:
1. i ≜ current cluster ID
2. j ≜ current device ID
3. d ≜ smart meter ID
4. c ≜ required consistency level
5. r ≜ replication depth

I. Upon Smart_meter_ij(d) generates data_ij(d)
   1. broadcast(cluster_i, j, data_ij(d), d)

II. Broadcast delivery (k, data_kl(d), d)
   1. store_data(data_kl(d), k)
   2. if l = i then
        r := GetRD(data_kl(d), d)
        if r > 0 then
           list := <i, j>
           multicast(neighbors_ij, list, data_kl(d), r - 1)

III. Multicast delivery (list, data_kl(d), r)
   1. store_data(data_kl(d), last_item(list))
   2. if r > 0 then
        destination := neighbors_ij \ list
        list := list ∪ <i, j>
        multicast(destination, list, data_kl(d), r - 1)

IV. Data request (data_kl(d), c) from source
   1. if ∄ data_kl(d) then
        unicast(source, nil, -1)
   2. else if c > GetConsistency(data_kl(d)) then
        unicast(ancestor(data_kl(d)), data_kl(d), c)
   3. else
        unicast(source, data_kl(d), c)

Figure 2: Replication protocol at smart device_ij.
Any device belonging to a cluster may simultaneously adopt different roles according to the current situation: (1) primary master, (2) primary slave, or (3) pseudo-primary. When a device is propagating data from its directly attached smart meters, it acts as a primary master and treats the rest of the devices in its cluster as its primary slaves. When a device receives data from another cluster, it acts as a repeater (pseudo-primary). The blue lines in Figure 1 illustrate the particular case of P_11 broadcasting data.
3.2 Replication
Replication provides availability and fault tolerance. However, it increases the number of messages, since all replicas have to be synchronized, which reduces the system throughput. Regarding the time when updates get propagated to the replicas, there exist two major strategies: (1) eager replication (Bernstein et al., 1987) provides strong consistency but poor scalability, and (2) lazy replication (Wiesmann and Schiper, 2005) provides higher scalability but has more difficulties maintaining consistency, i.e., replicas may diverge. Regarding the number of sites that update data, there exist two major replication strategies: (1) active replication (Amir and Tutu, 2002) provides strong consistency, since all replicas are synchronized, but has low scalability, and (2) passive replication (Pedone et al., 2000) provides higher scalability but has trouble maintaining strong consistency, since replicas might be unsynchronized.
As shown in Figure 2, our proposal is a hybrid solution that performs (1) active and eager replication in the primary-master's cluster, and (2) passive and lazy replication in the other clusters. This improves the scalability of classical architectures (Jiménez-Peris et al., 2002) and defines different consistency regions.
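To make the hybrid scheme concrete, the following sketch (our own; the apply and forward helpers are hypothetical) contrasts the two write paths: a write is acknowledged only after every device of the master's cluster has stored it, while propagation to other clusters is deferred:

    import threading

    def write(item, master_cluster, pseudo_primaries, r):
        # (1) Active + eager: every device of the primary-master's cluster
        # applies the update synchronously before the write is acknowledged.
        for device in master_cluster:
            device.apply(item)               # hypothetical helper; blocks until stored

        # (2) Passive + lazy: pseudo-primaries are updated in the background,
        # so replicas in other clusters may lag behind (k-weak consistency).
        def propagate():
            for pp in pseudo_primaries:
                pp.forward(item, r - 1)      # hypothetical helper; decrements r
        threading.Thread(target=propagate, daemon=True).start()
        return "ack"                         # remote clusters converge later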
3.3 Consistency
Research on consistency protocols has been conducted for many years, and several approaches have been proposed by the community. There are two major alternatives when defining the consistency properties of a system: (1) strong consistency and (2) weak consistency. Regarding our proposal, we take advantage of both strong and weak consistency strategies and propose a hybrid solution inspired by cloud-based storage and data stream warehouses (DeCandia et al., 2007; Golab and Johnson, 2011).
In the master's cluster we implement strong consistency between all replicas. This improves fault tolerance, since another device of the cluster can easily take over after a primary-master's fault; moreover, it avoids the typical single point of failure problem. Once data are strongly consistent in the master's cluster, devices start propagating them, together with a time stamp k, to their pseudo-primaries. Thereby, we effectively implement an eventually consistent system between the pseudo-primaries. To sum up, from the consistency point of view, we have shown how our hybrid architecture uses both strong and weak (actually k-weak) consistency techniques; a minimal staleness check in this spirit is sketched below.
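The following sketch (our own interpretation; the hop-based rule is an assumption suggested by the k-1/k-2 labels of Figure 1, not a formula from the paper) shows how a device could decide whether its copy satisfies a requested consistency level:

    def can_serve_locally(k: int, hops: int, c: int) -> bool:
        # Data stored `hops` cluster hops away from the master cluster carry a
        # time stamp k and offer roughly "k - hops" consistency; a read that
        # demands level c is served locally only if k - hops >= c. Otherwise
        # it escalates to the ancestor, mirroring step IV of Figure 2.
        return (k - hops) >= c

    assert can_serve_locally(k=3, hops=0, c=3)       # master cluster: strongest level
    assert not can_serve_locally(k=3, hops=2, c=2)   # two hops away: only k-2 left

Next we describe how our architecture deals with fault tolerance.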
3.4 Fault Tolerance
Fault tolerance is the ability of the system to recover from a spontaneous site fault. Distributed systems are prone to different types of failures (Cristian, 1991). Since smart grids are heavily dependent on the communication network, we assume that this channel is reliable enough and focus our efforts on the distributed storage architecture. We also assume that any site may fail according to the crash model.
Regarding our proposal, there may exist two different failure cases: (1) the failure of a primary, and (2) the failure of a pseudo-primary. In the former, we inherit the advantages of active replication techniques and can easily recover, since any other primary-slave can immediately take over. In the latter, as soon as the ancestor of the failed node, which belongs to another cluster, detects its unresponsiveness, it selects a new pseudo-primary.
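The following sketch (our own; the heartbeat timeout, the attribute names, and the promotion rule are assumptions, not part of the paper's protocol) illustrates the second case, in which an ancestor replaces an unresponsive pseudo-primary:

    import time

    HEARTBEAT_TIMEOUT = 10.0   # seconds without acks before declaring a crash (assumed)

    def monitor_pseudo_primary(ancestor, cluster):
        # Runs at the ancestor: if the pseudo-primary it updates stops
        # answering, promote any live device of that cluster and resync it.
        pp = cluster.pseudo_primary
        if time.time() - pp.last_ack > HEARTBEAT_TIMEOUT:
            candidates = [d for d in cluster.devices if d.is_alive and d is not pp]
            if candidates:
                cluster.pseudo_primary = candidates[0]    # new PP joins the chain
                ancestor.resync(cluster.pseudo_primary)   # bring it up to date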
4 DISCUSSION & CONCLUSIONS
As shown in the previous sections, our proposal benefits from many techniques used in distributed systems. However, these techniques have never been put together nor tested jointly, so they need to be discussed:
Master Cluster Reduction. Some members of the master region might be excluded from the active replication. Active replication does not scale well (Wiesmann and Schiper, 2005), and with a proper selection of representatives we could speed up this process.
Enhance the Takeover Process. A pseudo-primary could take part in active replication within its cluster. This role is not an exclusive one: a pseudo-primary can also be responsible for several smart meters and thus collaborate in the active replication protocol. Recall that there are not many nodes in a given cluster.
Failure Detection. Active replication within a pseudo-primary's cluster may (1) enhance the failed-node detection process, and (2) speed up the synchronization of the new device in the replication chain.
Distributed Computing. Our proposed architecture allows us to perform distributed computation on the read steps. Since the required data travel across the replication chain, each node might be able to perform a piece of the required computation; a sketch of this idea follows this list.
Dynamic Replication Depth Tuning. If we were able to dynamically adjust this value, our system might adapt better to its requirements. Hence, we could use a cognitive system and apply machine learning techniques (Mitchell, 1997) in order to (1) evaluate the whole system status and (2) predict the optimal value of the replication depth for each data item.
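As an illustration of the distributed computing item above (our own sketch; local_values is a hypothetical accessor and max an arbitrary example aggregate), each node on the read path folds its local share of the data into a partial result instead of shipping raw data to the caller:

    from functools import reduce

    def read_along_chain(chain, key, combine=max):
        # Walk the replication chain and let every node contribute its local
        # share of the computation (e.g., the peak voltage seen for `key`).
        partial = None
        for node in chain:                   # e.g., [S61, PP33, ..., master]
            local = node.local_values(key)   # hypothetical accessor
            if local:
                folded = reduce(combine, local)
                partial = folded if partial is None else combine(partial, folded)
        return partial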
In this paper we have defined a way to distribute and store information across the network so that the computation needed for smart functions can be greatly reduced. This work aims to provide some insight into the world of smart grids from a data perspective. For the sake of simplicity during the presentation of our system, we have outlined only simple scenarios for the replication policy and fault-tolerance issues; these need to be treated in detail in future work.
ACKNOWLEDGEMENTS
The research leading to these results has received funding from the European Union and European Atomic Energy Community Seventh Framework Programmes (FP7/2007-2013 and FP7/2007-2011) under grant agreement no. 247938 for Joan Navarro and August Climent, and from the Spanish National Science Foundation (MEC) (grant TIN2009-14460-C03-02) for José Enrique Armendáriz-Iñigo.
REFERENCES
Aguilera, M. K. et al. (2009). Sinfonia: A new paradigm for building scalable distributed systems. ACM Trans. Comput. Syst., 27(3).
Amir, Y. and Tutu, C. (2002). From total order to database
replication. In ICDCS, pages 494–.
Bernstein, P. A. et al. (1987). Concurrency Control and Recovery in Database Systems. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
Brown, R. E. (2008). Impact of Smart Grid on distribution
system design. In Power and Energy Society General
Meeting - Conversion and Delivery of Electrical En-
ergy in the 21st Century, 2008 IEEE, pages 1–4.
Cristian, F. (1991). Understanding fault-tolerant distributed
systems. Commun. ACM, 34(2):56–78.
Das, S., Agrawal, D., and Abbadi, A. E. (2010). Elastras:
An elastic transactional data store in the cloud. CoRR,
abs/1008.3751.
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., and Vogels, W. (2007). Dynamo: Amazon's highly available key-value store. In SOSP, pages 205–220.
Golab, L. and Johnson, T. (2011). Consistency in a Stream
Warehouse. In CIDR.
Jiménez-Peris, R., Patiño-Martínez, M., Kemme, B., and Alonso, G. (2002). Improving the scalability of fault-tolerant database clusters. In ICDCS, pages 477–484.
Mitchell, M. (1997). An Introduction to Genetic Algo-
rithms. The MIT Press, Cambridge, Massachusetts.
Palankar, M. R. et al. (2008). Amazon S3 for science grids: a viable solution? In DADC '08: Proceedings of the 2008 International Workshop on Data-Aware Distributed Computing, pages 55–64, New York, NY, USA. ACM.
Patiño-Martínez, M. et al. (2005). MIDDLE-R: consistent database replication at the middleware level. ACM Trans. Comput. Syst., 23(4):375–423.
Paz, A., Perez-Sorrosal, F., Patiño-Martínez, M., and Jiménez-Peris, R. (2010). Scalability evaluation of the replication support of JOnAS, an industrial J2EE application server. In EDCC, pages 55–60.
Pedone, F., Wiesmann, M., Schiper, A., Kemme, B., and
Alonso, G. (2000). Understanding replication in
databases and distributed systems. In ICDCS.
Vogels, W. (2009). Eventually consistent. Commun. ACM,
52(1):40–44.
White, T. (2009). Hadoop: The Definitive Guide. O'Reilly Media, 1st edition.
Wiesmann, M. and Schiper, A. (2005). Comparison of
database replication techniques based on total order
broadcast. IEEE Trans. Knowl. Data Eng., 17(4):551–
566.