Lazy Agents for Large Scale Global Optimization

Joerg Bremer

and Sebastian Lehnhoff

Department of Computing Science, University of Oldenburg, Uhlhornsweg, Oldenburg, Germany

R&D Division Energy, OFFIS – Institute for Information Technology, Escherweg, Oldenburg, Germany

Keywords:

Global Optimization, Distributed Optimization, Multi-agent Systems, Lazy Agents, Coordinate Descent

Optimization.

Abstract:

Optimization problems with rugged, multi-modal fitness landscapes, non-linear problems, and derivative-free optimization pose challenges to heuristics, especially in the high-dimensional case. High dimensionality also aggravates the problem of premature convergence and leads to an exponential increase in search space size. Parallelization for acceleration often requires domain-specific knowledge for data domain partitioning or for functional or algorithmic decomposition. We extend a fully decentralized agent-based approach for a global optimization algorithm based on coordinate descent and gossiping that has no specific decomposition needs and can thus be applied to arbitrary optimization problems. The original agent method is prone to getting stuck in high-dimensional problems. We add a laziness mechanism that lets the agents randomly postpone actions of local optimization and thereby better avoids stagnation in local optima. The extension is tested against the original method as well as against established methods. The lazy agent approach turns out to be competitive and in many cases superior.

1 INTRODUCTION

Global optimization of non-convex, non-linear problems has long been subject to research (Bäck et al., 1997; Horst and Pardalos, 1995). Approaches can

roughly be classiﬁed into deterministic and proba-

bilistic methods. Deterministic approaches like inter-

val methods (Hansen, 1980), Cutting Plane methods

(Tuy et al., 1985), or Lipschitzian methods (Hansen

et al., 1992) often suffer from intractability of the

problem or getting stuck in local optima (Simon,

2013). In case of a rugged ﬁtness landscape of multi-

modal, non-linear functions, probabilistic heuristics

are indispensable. Often derivative free methods are

needed, too.

Many optimization approaches have so far been proposed for solving these problems; among them are evolutionary methods or swarm-based methods (Bäck et al., 1997; Dorigo and Stützle, 2004; Simon, 2013; Hansen, 2006; Kennedy and Eberhart, 1995; Storn and Price, 1997). In (Bremer and Lehnhoff, 2017a), an agent-based method has been proposed with the advantage of good scaling properties, as with each new objective dimension an agent is added that locally searches along the respective dimension. The approach uses the COHDA protocol (Hinrichs et al., 2013). In this approach, the agents perform a decentralized block coordinate descent (Wright, 2015) and aggregate locally found optima to an overall solution in a self-organized way.

In (Hinrichs and Sonnenschein, 2014; Anders

et al., 2012), the effect of communication delays in

message sending and the degree of variation in such

agent systems on the solution quality has been scru-

tinized. Increasing variation (agents with different

knowledge interact) leads to better results. An in-

crease in inter-agent variation can also be achieved

by letting agents delay individual decisions. Hence,

we combine the ideas from (Bremer and Lehnhoff,

2017a) and (Hinrichs and Sonnenschein, 2014) and

extend the agent approach to global optimization by

integrating a decision delay into the agents. In this way, the agents behave lazily with regard to their decision duty.

Agents in the COHDA protocol act according to the receive-decide-act metaphor (Hinrichs et al., 2013). When applied to local optimization, the decide process decides locally on the best parameter position with regard to just one respective dimension of the objective function. Thus, the agent performs a 1-dimensional optimization along a cross-section of the objective function and takes the other dimensions (its

Bremer, J. and Lehnhoff, S.

Lazy Agents for Large Scale Global Optimization.

DOI: 10.5220/0007571600720079

In Proceedings of the 11th International Conference on Agents and Artiﬁcial Intelligence (ICAART 2019), pages 72-79

ISBN: 978-989-758-350-6

belief about the other agents' local optimizations) as fixed for the moment. We extend this approach by a mechanism that postpones the decision process. Thus, the agent gathers more information from other agents (including transient information relayed over more communication hops) and may decide on a more solid basis.

The rest of the paper is organized as follows. Af-

ter a brief recap of (large scale) global optimization,

heuristics, and the agent approach for solving, the ex-

tension of laziness to the agents is explained. The

effectiveness is demonstrated by comparing with the

original approach and with standard algorithms.

2 RELATED WORK

Global optimization comprises many problems in practice as well as in the scientific community. These problems are often characterized by a rugged fitness landscape with many local optima and by non-linearity. Thus, optimization algorithms are likely to become stuck in local optima, and guaranteeing the exact optimum is often intractable, which leads to the use of heuristics.

Evolution Strategies (Rechenberg, 1965) for ex-

ample have shown excellent performance in global

optimization especially when it comes to complex

multi-modal, high-dimensional, real valued problems

(Kramer, 2010; Ulmer et al., 2003). Each of these

strategies has its own characteristics, strengths and

weaknesses. A common characteristic is the genera-

tion of an offspring solution set by exploring the char-

acteristics of the objective function in the immediate

neighborhood of an existing set of solutions. When

the solution space is hard to explore or objective eval-

uations are costly, computational effort is a common

drawback for all population-based schemes. Real

world problems often face additional computational

efforts for ﬁtness evaluations; e. g. in Smart Grid load

planning scenarios, ﬁtness evaluation involves simu-

lating a large number of energy resources and their

behaviour (Bremer and Sonnenschein, 2014).

Especially in high-dimensional problems, premature convergence (Leung et al., 1997; Trelea, 2003; Rudolph, 2001) poses additional challenges for the optimization method used. Heuristics often converge too early towards a sub-optimal solution and then get stuck in this local optimum. This might for instance happen if an adaptation strategy decreases the mutation range, and thus the range of the currently searched surrounding sub-region, so that possible ways out of the current trough are no longer explored.

On the other hand, much effort has been spent to accelerate convergence of these methods. Example techniques are improved population initialization (Rahnamayan et al., 2007), adaptive population sizes (Ahrari and Shariat-Panahi, 2015), or exploiting sub-populations (Rigling and Moore, 1999). Sometimes a surrogate model is used in case of computationally expensive objective functions (Loshchilov et al., 2012) to substitute a share of objective function evaluations with cheap surrogate model evaluations. The surrogate model represents a learned model of the original objective function. Recent approaches use Radial Basis Functions, Polynomial Regression, Support Vector Regression, Artificial Neural Networks, or Kriging (Gano et al., 2006), each with individual advantages and drawbacks.

Recently, the number of large scale global optimization problems has been growing as technology advances (Li et al., 2013). Large scale problems are difficult to solve for several reasons (Weise et al., 2012). The main reasons are the exponentially growing search space and a potential change of an objective function's properties (Li et al., 2013; Weise et al., 2012; Shang and Qiu, 2006). Moreover, evaluating large scale objectives is expensive, especially in real world problems (Sobieszczanski-Sobieski and Haftka, 1997). Growing non-separability or variable interaction sometimes entails further challenges (Li et al., 2013).

For faster execution, different approaches for parallel problem solving have been studied in the past, partly with a need for problem-specific adaptation for distribution. Four main questions define the design decisions for distributing a heuristic: which information to exchange, when to communicate, who communicates, and how to integrate received information (Nieße, 2015; Talbi, 2009). Examples of traditional meta-heuristics that are available as distributed versions are particle swarm (Vanneschi et al., 2011), ant colony (Colorni et al., 1991), and parallel tempering (Li et al., 2009). Distribution for gaining higher solution accuracy is a rather rare use case. An example is given in (Bremer and Lehnhoff, 2016).

Another class of algorithms for global optimization that has been popular for many years among practitioners rather than scientists (Wright, 2015) is that of coordinate descent algorithms (Ortega and Rheinboldt, 1970). Coordinate descent algorithms iteratively search for the optimum in high-dimensional problems by fixing most of the parameters (components of the variable vector $\mathbf{x}$) and doing a line search along a single free coordinate axis. Usually, all components of $\mathbf{x}$ are cyclically chosen for approximating the objective with respect to the (fixed) other components (Wright, 2015). In each iteration, only a lower-dimensional or even scalar sub-problem has to be solved. The multi-variable objective $f(\mathbf{x})$ is solved by looking for the minimum in one direction at a time. There are several approaches for choosing the step size for the step towards the local minimum, but as long as the sequence $f(\mathbf{x}^{0}), f(\mathbf{x}^{1}), \dots, f(\mathbf{x}^{n})$ is monotonically decreasing, the method converges to at least a local optimum. Like any other gradient-based method, this approach easily gets stuck in case of a non-convex objective function.
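The cyclic scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: the names `golden_section_minimize` and `coordinate_descent` are hypothetical, a plain golden-section search stands in for the line search, and a common box `[lo, hi]` is assumed for every coordinate.

```python
import math

def golden_section_minimize(g, lo, hi, tol=1e-8):
    """Minimize a 1-D function g on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    gc, gd = g(c), g(d)
    while b - a > tol:
        if gc < gd:
            # minimum lies in [a, d]; the old c becomes the new d
            b, d, gd = d, c, gc
            c = b - inv_phi * (b - a)
            gc = g(c)
        else:
            # minimum lies in [c, b]; the old d becomes the new c
            a, c, gc = c, d, gd
            d = a + inv_phi * (b - a)
            gd = g(d)
    return (a + b) / 2.0

def coordinate_descent(f, x0, lo, hi, sweeps=50):
    """Cyclic coordinate descent: optimize one coordinate, keep the others fixed."""
    x = list(x0)
    history = [f(x)]
    for _ in range(sweeps):
        for j in range(len(x)):
            def g(t, j=j):
                # objective along coordinate j with all other components fixed
                return f(x[:j] + [t] + x[j + 1:])
            t = golden_section_minimize(g, lo, hi)
            value = g(t)
            if value <= history[-1]:
                # accept only non-worsening steps, so the sequence never increases
                x[j] = t
                history.append(value)
            else:
                history.append(history[-1])
    return x, history
```

On a convex, separable objective such as a shifted sphere, one sweep already lands close to the optimum; on a multi-modal objective the same loop can stall in a local optimum, which is the weakness the text refers to.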

In (Hinrichs et al., 2013) an agent based approach

has been proposed as an algorithmic level decomposi-

tion scheme for decentralized problem solving (Talbi,

2009; Hinrichs et al., 2011), making it especially suit-

able for large scale problems.

Each agent is responsible for one dimension of

the objective function. The intermediate solutions for

other dimensions (represented by decisions published

by other agents) are regarded as temporarily ﬁxed.

Thus, each agent only searches along a 1-dimensional

cross-section of the objective and thus has to solve

merely a simpliﬁed sub-problem. Nevertheless, for

evaluation of the solution, the full objective function

is used. In this way, the approach achieves an asyn-

chronous coordinate descent with the ability to escape

local minima by parallel searching different regions of

the search space. The approach uses as basis a proto-

col from (Hinrichs et al., 2013).

In (Hinrichs et al., 2013) a fully decentralized

agent-based approach for combinatorial optimization

problems has been introduced. Originally, the combi-

natorial optimization heuristics for distributed agents

(COHDA) had been invented to solve the problem of

predictive scheduling (Sonnenschein et al., 2014) in

the Smart Grid.

The key concept of COHDA is an asynchronous iterative approximate best-response behavior, where each participating agent, originally representing a decentralized energy unit, reacts to updated information from other agents by adapting its own action (selecting an energy production scheme that enables a group of energy generators to fulfil an energy product from the market as well as possible). All agents $a_i \in A$ initially only know their own respective search space $S_i$ of feasible energy schedules that can be operated by their own energy resource. From an algorithmic point of view, the difficulty of the problem is given by the distributed nature of the system in contrast to the task of finding a common allocation of schedules for a global target power profile.

Thus, the agents coordinate by updating and ex-

changing information about each other. For privacy

and communication overhead reasons, the potential

ﬂexibility (alternative actions) is not communicated

as a whole by an agent. Instead, the agents communi-

cate single selected local solutions (energy production

schedules in the Smart Grid case) within the approach

as described in the following.

First of all, the agents are placed in an artificial communication topology (e.g. a small-world topology (Watts and Strogatz, 1998)) such that each agent is connected to a non-empty subset of other agents. This overlay topology might be a ring in the least connected variant.
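As an illustration of such an overlay, the following sketch builds the ring as the least connected variant and optionally adds random shortcut links. The function names are hypothetical, and adding shortcuts is a simplification chosen here; it is not the rewiring procedure of Watts and Strogatz and not necessarily the construction used by the authors.

```python
import random

def ring_topology(n):
    """Return a ring overlay: agent i is linked to its two ring neighbors."""
    if n < 2:
        raise ValueError("a ring needs at least two agents")
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def add_shortcuts(neighbors, prob, seed=0):
    """Add symmetric random shortcut links on top of an existing overlay.

    Assumption: each agent gets, with probability `prob`, one extra link
    to a uniformly chosen other agent.
    """
    rng = random.Random(seed)
    n = len(neighbors)
    result = {i: set(links) for i, links in neighbors.items()}
    for i in range(n):
        if rng.random() < prob:
            j = rng.randrange(n)
            if j != i:
                result[i].add(j)
                result[j].add(i)
    return result
```

Every agent keeps a non-empty neighborhood in both variants, which is the only property the text requires of the overlay.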

Each agent collects two distinct sets of information: on the one hand the believed current configuration $\gamma_i$ of the system (that is, the most up-to-date information $a_i$ has about currently selected schedules of all agents), and on the other hand the best known combination $\gamma_i^*$ of schedules with respect to the global objective function it has encountered so far.

Beginning with an arbitrarily chosen agent by

passing it a message containing only the global ob-

jective (i. e. the target power proﬁle), each agent re-

peatedly executes the three steps perceive, decide, act

(cf. (Nieße et al., 2014)):

Algorithm 1: Basic scheme of an agent's decision on local optima in the extension of COHDA to global optimization.

// let $\mathbf{x} \in \mathbb{R}^d$ be an intermediate solution
$x_k \leftarrow \begin{cases} x_k & \text{if } x_k \in K_j \\ x \sim U(x_{min}, x_{max}) & \text{else} \end{cases} \quad \forall k \neq j$
// solve with Brent optimizer:
$x_j \leftarrow \arg\min f_j(x) = f(x, \mathbf{x}) = f(x_1, \dots, x_{j-1}, x, x_{j+1}, \dots, x_d)$
if $f(\mathbf{x}) < f(\mathbf{x}_{old})$ then
    update workspace $K_j$
end if

1. perceive: When an agent $a_i$ receives a message $\kappa_p$ from one of its neighbors (say, $a_p$), it imports the contents of this message into its own memory.

2. decide: The agent then searches $S_i$ for the best own local solution regarding the updated system state $\gamma_i$ and the global objective function. Local constraints are taken into account in advance if applicable. Details regarding this procedure have been presented in (Nieße et al., 2016). If a local solution can be found that satisfies the objective, a new solution selection is created. For the following comparison, only the global objective function must be taken into account: If the resulting modified system state $\gamma_i$ yields a better rating than the current solution candidate $\gamma_i^*$, a new solution candidate is created based on $\gamma_i$. Otherwise the old solution candidate still reflects the best combination regarding the global objective, so the agent reverts to its old selection stored in $\gamma_i^*$.

3. act: If $\gamma_i$ or $\gamma_i^*$ has been modified in one of the previous steps, the agent finally broadcasts these to its immediate neighbors in the communication topology.

During this process, for each agent $a_i$, its observed system configuration $\gamma_i$ as well as its solution candidate $\gamma_i^*$ are filled successively. After producing some intermediate solutions, the heuristic eventually terminates in a state where for all agents $\gamma_i$ as well as $\gamma_i^*$ are identical, and no more messages are produced by the agents. At this point, $\gamma_i^*$ is the final solution of the heuristic and contains exactly one schedule selection for each agent.

The COHDA protocol has meanwhile been ap-

plied to many different optimization problems (Bre-

mer and Lehnhoff, 2017b; Bremer and Lehnhoff,

2017c). In (Bremer and Lehnhoff, 2017a) COHDA

has also been applied to the continuous problem of

global optimization.

3 LAZY COHDAGO

In (Bremer and Lehnhoff, 2017a) the COHDA protocol has been applied to global optimization (COHDAgo). Each agent is responsible for solving one dimension $x_i$ of a high-dimensional function $f(\mathbf{x})$ as global objective. Each time an agent receives a message from one of its neighbors, its own knowledge base with assumptions about the optimal coordinates $\mathbf{x}^*$ of the optimum of $f$ (with $\mathbf{x}^* = \arg\min f(\mathbf{x})$) is updated. Let $a_j$ be the agent that has just received a message from agent $a_i$. Then, the workspace $K_j$ of agent $a_j$ is merged with information from the received workspace $K_i$. Each workspace $K$ of an agent contains a set of coordinates $x_k$ such that $x_k$ reflects the $k$th coordinate of the current solution $\mathbf{x}$ found so far by agent $a_k$. Additionally, information about other coordinates $x_{k_1}, \dots, x_{k_n}$ reflecting local decisions of $a_{k_1}, \dots, a_{k_n}$ that $a_i$ has received messages from is also integrated into $K_j$ if the information is newer or outdates the already known information. Thus, each agent also gathers transient information, eventually about all local decisions.
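The merge of a received workspace into the own workspace can be sketched as follows. How the age of an entry is tracked is not specified in this section, so the sketch assumes a per-coordinate update counter; the function name `merge_workspace` and the `(value, counter)` representation are hypothetical.

```python
def merge_workspace(own, received):
    """Merge a received workspace into the own workspace.

    Both workspaces map a coordinate index k to a pair (value, counter).
    Assumption: a larger counter marks newer information.
    Returns the merged workspace and a flag telling whether anything changed.
    """
    merged = dict(own)
    changed = False
    for k, (value, counter) in received.items():
        if k not in merged or counter > merged[k][1]:
            # unknown coordinate or newer information: take it over
            merged[k] = (value, counter)
            changed = True
    return merged, changed
```

Because entries about third agents are taken over as well, information spreads along the overlay even between agents that are not directly connected.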

In general, each coordinate $x_k$ that is not yet in $K_j$ is temporarily set to a random value $x_k \sim U(x_{min}, x_{max})$ for objective evaluation. W.l.o.g. all unknown values could also be set to zero. But, as many of the standard benchmark objective functions have their optimum at zero, this would result in an unfair comparison, as such behavior would unintentionally introduce some a priori knowledge. Thus, we have chosen to initialize unknown values with a random value.

[Figure: agent stages perceive (update knowledge), decide (optimize sub-problem), and act (send workspace); received subsets of 1-dimensional solutions are integrated.]

Figure 1: Internal receive-decide-act architecture of an agent with decision process. The agent receives a set of optimum coordinates from another agent, decides on the best coordinate for the dimensions the agent accounts for and sends the updated information to all neighbors; cf. (Bremer and Lehnhoff, 2017a).

[Figure: agent stages perceive (update knowledge), decide (optimize sub-problem), and act (send workspace), with an additional postpone stage; received subsets of 1-dimensional solutions are integrated.]

Figure 2: Extended agent protocol for integrating laziness into the protocol from Figure 1.

After the update procedure, agent $a_j$ takes all elements $x_k \in \mathbf{x}$ with $k \neq j$ as temporarily fixed and starts solving a 1-dimensional sub-problem: $x_j = \arg\min f(x, \mathbf{x})$, where $f$ is the objective function with all values except element $x_j$ fixed. This problem with only $x$ as the single degree of freedom is solved using Brent's method (Brent, 1971). Algorithm 1 summarizes this approach.
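The decision step can be illustrated with a short Python sketch. It is a simplified reading of the procedure and not the authors' code: the names `decide` and `golden_section_minimize` are hypothetical, the workspace is assumed to be a plain mapping from coordinate index to value, and a golden-section search stands in for Brent's method to keep the example free of external libraries.

```python
import math
import random

def golden_section_minimize(g, lo, hi, tol=1e-8):
    """1-D minimizer used here as a stand-in for Brent's method."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    gc, gd = g(c), g(d)
    while b - a > tol:
        if gc < gd:
            b, d, gd = d, c, gc
            c = b - inv_phi * (b - a)
            gc = g(c)
        else:
            a, c, gc = c, d, gd
            d = a + inv_phi * (b - a)
            gd = g(d)
    return (a + b) / 2.0

def decide(f, j, workspace, dim, lo, hi, best_value, rng):
    """Decision step of agent j on its own coordinate.

    workspace: dict mapping known coordinate indices to values.
    Unknown coordinates are filled with random values for this evaluation
    only; they are not stored in the workspace.
    """
    x = [workspace[k] if k in workspace else rng.uniform(lo, hi)
         for k in range(dim)]

    def f_j(t):
        # full objective with only coordinate j free
        return f(x[:j] + [t] + x[j + 1:])

    x[j] = golden_section_minimize(f_j, lo, hi)
    value = f(x)
    if value < best_value:
        # improvement: publish the new own coordinate
        updated = dict(workspace)
        updated[j] = x[j]
        return updated, value, True
    return workspace, best_value, False
```

The returned flag corresponds to the condition under which an agent would send its workspace to its neighbors; the full objective is evaluated even though only one coordinate is free.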

Brent's method originally is a root finding procedure that combines the previously known bisection method and the secant method with an inverse quadratic interpolation. Whereas the latter are known for fast convergence, bisection provides more reliability. By combining these methods (a first step was already undertaken by (Dekker, 1969)) convergence can be guaranteed with at most $O(n^2)$ iterations (with $n$ iterations for the bisection method). In case of a well-behaved function the method even converges superlinearly (Brent, 1971). We used an evaluated implementation from Apache Commons Math after a reference implementation from (Brent, 1973).

After $x_j$ has been determined with Brent's method, $x_j$ is communicated (along with all $x_k$ previously received from agent $a_i$) to all neighbors if $f(\mathbf{x}^*)$ with $x_j$ gains a better result than the previous solution candidate. Figure 1 summarizes this procedure.

Into this agent process, we integrated laziness. Figure 2 shows the idea. As an additional stage in the receive-decide-act protocol, a random decision is made whether to postpone a decision on local optimality based on aggregated information. In contrast to the approach of (Anders et al., 2012), aggregation is nevertheless done with this additional stage. Only after information aggregation, and thus after the belief update, is it randomly decided whether to continue with the decision process based on the current belief (local optimization of the respective objective dimension) or to postpone this process. In the meantime, additional information (either updated information from the same agent or additional information from other agents) may arrive and be aggregated. The delay is realized by putting the trigger message on a holding stack and resubmitting it later. Figure 3 shows the relative frequencies of delays (additional aggregation steps) that occur when a uniform distribution U(0, 1) is used for deciding on postponement. The likelihood of being postponed is denoted by λ. In this way, delayed information may also overtake newer information and thus may trigger a resumption at an older search branch that led to a dead end. In general, the disturbance within the system increases, and thus premature convergence is better prevented. We denote this extension lazyCOHDAgo.
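The postponement decision can be simulated in isolation. The sketch below assumes that every resubmitted trigger message faces a fresh, independent draw from U(0, 1) and is postponed again whenever the draw falls below λ; under that assumption the number of postponements follows a geometric distribution with $P(k) = \lambda^k (1 - \lambda)$. The function names are hypothetical, and the holding stack itself is not modeled.

```python
import random

def postponements(lam, rng):
    """Count how often a decision is postponed before it is finally taken."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("laziness must satisfy 0 <= lam < 1")
    count = 0
    while rng.random() < lam:
        # draw fell below the laziness factor: postpone once more
        count += 1
    return count

def delay_frequencies(lam, samples=10000, seed=0):
    """Empirical relative frequencies of the number of postponements."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(samples):
        k = postponements(lam, rng)
        counts[k] = counts.get(k, 0) + 1
    return {k: c / samples for k, c in sorted(counts.items())}
```

With λ = 0 the decision is never postponed, which corresponds to the original protocol; the closer λ gets to 1, the longer the tail of the delay distribution becomes.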

4 RESULTS

For evaluation, we used a set of well-known test func-

tions that have been developed for benchmarking op-

timization methods: Elliptic, Ackley (Ulmer et al.,

2003), Egg Holder (Jamil and Yang, 2013), Rastri-

gin (Aggarwal and Goswami, 2014), Griewank (Lo-

catelli, 2003), Quadric (Jamil and Yang, 2013), and

examples from the CEC ’13 Workshop on Large Scale

Optimization (Li et al., 2013).
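For orientation, the textbook forms of four of these benchmark functions are given below in Python. These are the common unshifted definitions with the optimum at the origin; the exact variants, shifts, and search bounds used in the experiments are not stated in this section, so the definitions are an assumption.

```python
import math

def elliptic(x):
    """High-conditioned elliptic function; the weights span six orders of magnitude."""
    n = len(x)
    if n == 1:
        return x[0] ** 2
    return sum((10.0 ** 6) ** (i / (n - 1)) * xi ** 2 for i, xi in enumerate(x))

def ackley(x):
    """Ackley function with the usual constants 20, 0.2 and 2*pi."""
    n = len(x)
    mean_sq = sum(xi ** 2 for xi in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * xi) for xi in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e

def rastrigin(x):
    """Rastrigin function: a sphere overlaid with a cosine modulation."""
    return 10.0 * len(x) + sum(xi ** 2 - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x)

def griewank(x):
    """Griewank function: quadratic term minus a product of cosines."""
    total = sum(xi ** 2 for xi in x) / 4000.0
    product = 1.0
    for i, xi in enumerate(x, start=1):
        product *= math.cos(xi / math.sqrt(i))
    return 1.0 + total - product
```

All four functions evaluate to zero at the origin, which is why initializing unknown coordinates with zero would leak prior knowledge into the search.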

In a ﬁrst experiment, we tested the effect of lazy

Figure 3: Probability density of postponement delay for different laziness factors λ denoting the probability of postponing an agent's decision process (x-axis: delay; y-axis: rel. frequency; curves for λ = 0.3, λ = 0.6, and λ = 0.9).

agents and solved a set of test functions with agents of different laziness λ. Table 1 shows the results for 50-dimensional versions of the test functions. In these rather low-dimensional cases the effect is visible, but not that prominent. In most cases a slight improvement can be seen with growing laziness factor (λ = 0 denotes no laziness at all and thus corresponds to the original COHDAgo). The Elliptic function, for example, shows no improvement. In some cases, e.g. for the Quadric function, the result quality deteriorates. An overshoot can also be observed with the Griewank function, where the best result is obtained with a laziness of λ = 0.3. When applied to more complex and higher-dimensional objective functions, the effect is far more prominent, as can be seen in Table 2. The CEC $f_1$ function (Li et al., 2013) is a shifted elliptic function which is ill-conditioned with condition number $\approx 10^{6}$ in the 1000-dimensional case. Due to the dimensionality, these results have also been obtained with a laziness of λ = 0.99. From the wide range of solution qualities for λ = 0.9 (the achieved minimum result out of 20 runs in the 200-dimensional case was $3.40 \times 10^{-19}$, which is almost as good as the result for λ = 0.99) it can be concluded that the agent system is less susceptible to premature convergence and thus yields better mean results. The Rosenbrock* function is an asymmetrically, non-linearly shifted version of (Rosenbrock, 1960) multiplied by the Alpine function.

Finally, we compared the results of the lazy agent approach with other established meta-heuristics for functions where the agent approach was successful. Please note that the results for some functions (e.g. the results in Table 1) were not that promising. For comparison we

used the co-variance matrix adaption evolution strat-


Table 1: Performance of the lazy agent approach on different 50-dimensional test functions for different laziness factors λ.

| function | λ = 0.0 | λ = 0.3 | λ = 0.6 | λ = 0.9 |
|---|---|---|---|---|
| Elliptic | 1.527 × 10^−21 ± 2.876 × 10^−28 | 1.527 × 10^−21 ± 1.976 × 10^−28 | 1.527 × 10^−1 ± 2.594 × 10^−29 | 1.527 × 10^−21 ± 7.48 × 10^−28 |
| Ackley | 1.306 × 10 ± 2.988 × 10^−1 | 1.217 × 10 ± 1.665 × 10^−1 | 1.205 × 10 ± 1.86 × 10^−1 | 1.124 × 10 ± 2.088 × 10^−1 |
| EggHolder | 1.453 × 10 ± 8.639 × 10 | 1.423 × 10 ± 8.81 × 10 | 1.384 × 10 ± 9.119 × 10 | 1.345 × 10 ± 9.441 × 10 |
| Rastrigin | 2.868 × 10 ± 2.493 × 10 | 2.87 × 10 ± 1.569 × 10 | 2.868 × 10 ± 2.427 × 10 | 2.858 × 10 ± 3.088 × 10 |
| Griewank | 2.95 × 10^−3 ± 9.328 × 10^−3 | 1.478 × 10^−3 ± 4.674 × 10^−3 | 1.59 × 10^−3 ± 3.219 × 10^−2 | 3.07 × 10^−2 ± 4.132 × 10^−2 |
| Quadric | 6.51 × 10^−26 ± 6.525 × 10^−26 | 1.196 × 10^−25 ± 8.128 × 10^−26 | 3.65 × 10^−5 ± 5.141 × 10^−25 | 4.43 × 10^−15 ± 1.40 × 10^−14 |

Table 2: Performance of the lazy agent approach on different high-dimensional, ill-conditioned test functions for different laziness factors λ.

| function | λ = 0.0 | λ = 0.9 | λ = 0.99 |
|---|---|---|---|
| CEC f_1, d = 200 | 1.81 × 10 ± 5.78 × 10 | 2.20 × 10 ± 4.11 × 10 | 3.40 × 10^−19 ± 1.54 × 10^−23 |
| CEC f_1, d = 500 | 4.28 × 10 ± 8.28 × 10 | 6.55 × 10 ± 1.85 × 10 | 3.760 × 10^−19 ± 1.31 × 10^−21 |
| Rosenbrock*, d = 250 | 1.01 × 10^−5 ± 1.71 × 10^−5 | 2.41 × 10^−7 ± 5.37 × 10^−7 | 5.68 × 10^−8 ± 1.60 × 10^−7 |

Table 3: Comparison of the lazy agent approach with different established meta-heuristics.

| f | CMA-ES | DE | PSO | lazy COHDAgo |
|---|---|---|---|---|
| Elliptic | 3.41 × 10^−5 ± 7.47 × 10^−5 | 4.48 × 10^−9 ± 2.24 × 10^−9 | 2.65 × 10 ± 8.37 × 10 | 1.14 × 10^−21 ± 2.64 × 10^−27 |
| Ackley | 1.02 × 10 ± 7.07 × 10 | 4.73 × 10^−2 ± 7.06 × 10^−5 | 2.0 × 10 ± 0.0 × 10 | 1.54 × 10 ± 1.01 × 10^−1 |
| Alpine | 4.2 × 10 ± 3.96 × 10 | 2.82 × 10^−3 ± 9.18 × 10^−5 | 6.61 × 10^−9 ± 1.29 × 10^−8 | 4.51 × 10^−12 ± 1.32 × 10^−13 |
| Griewank | 9.99 × 10^−4 ± 3.11 × 10^−3 | 4.41 × 10^−4 ± 1.51 × 10^−5 | 8.92 × 10^−3 ± 4.54 × 10^−3 | 5.11 × 10^−16 ± 9.2 × 10^−16 |

Table 4: Respective best results (residual error) out of 20 runs each for the comparison from Table 3.

| f | CMA-ES | DE | PSO | lazy COHDAgo |
|---|---|---|---|---|
| Elliptic | 9.28 × 10^−7 | 1.89 × 10^−9 | 1.04 × 10^−4 | 1.14 × 10^−21 |
| Ackley | 6.02 × 10^−6 | 4.73 × 10^−2 | 2.0 × 10 | 1.52 × 10 |
| Alpine | 1.28 × 10^−1 | 2.7 × 10^−3 | 1.2 × 10^−15 | 4.3 × 10^−12 |
| Griewank | 3.57 × 10^−6 | 4.23 × 10^−4 | 1.51 × 10^−5 | 0.0 × 10 |

egy (CMA-ES) from (Hansen and Ostermeier, 2001)

with a parametrization after (Hansen, 2011), Differ-

ential Evolution (Storn and Price, 1997) and Particle

Swarm Optimization (Kennedy and Eberhart, 1995).

The lazy COHDAgo approach has been parametrized

with a laziness of λ = 0.9. Table 3 shows the result.

As the agent approach terminates by itself if no further solution improvement can be made by any agent and no further stopping criterion is meaningful in an asynchronously working decentralized system, we simply logged the number of function evaluations used and gave this number as evaluation budget to the other heuristics. In this way we ensured that every heuristic uses the same maximum budget of objective evaluations. As CMA-ES was not able to succeed for some high-dimensional functions with this limited budget, this evolution strategy was given a 100-fold budget.
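Logging the number of evaluations and handing it to another heuristic as a budget can be done with a thin wrapper around the objective. The class and exception names below are hypothetical; this is one possible way to set up such a comparison, not a description of the authors' tooling.

```python
class BudgetExhausted(Exception):
    """Raised when an objective is called more often than its budget allows."""

class CountingObjective:
    """Wrap an objective function, count its evaluations, and enforce a budget."""

    def __init__(self, f, budget=None):
        self.f = f
        self.budget = budget      # None means unlimited
        self.evaluations = 0

    def __call__(self, x):
        if self.budget is not None and self.evaluations >= self.budget:
            raise BudgetExhausted(f"budget of {self.budget} evaluations used up")
        self.evaluations += 1
        return self.f(x)
```

A run of the agent system would use an unlimited wrapper and read `evaluations` afterwards; the competing heuristics would then receive a wrapper whose `budget` is set to that number (or a multiple of it).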

The agent approach is competitive for the Ackley function. In most of the cases lazyCOHDAgo succeeds in terms of residual error, and the lazy agent approach is also successful when looking at the absolute best solution out of 20 runs each (Table 4).

5 CONCLUSION

Large scale global optimization is a crucial task for many real world applications in industry and engineering. Most meta-heuristics deteriorate rapidly with growing problem dimensionality. We proposed a laziness extension to an agent-based algorithm for global optimization and achieved a considerably better performance when applied to large scale problems. Randomly postponing the agents' decisions on local optimization leads to less vulnerability to premature convergence, presumably due to an increasing inter-agent variation (Anders et al., 2012) and thus to the incorporation of past (outdated) information. This may re-stimulate search in already abandoned paths. Delaying the reaction of the agents in COHDA is known to increase the diversity in the population and thus to lead to at least equally good results, but with a larger number of steps (Hinrichs and Sonnenschein, 2014); for some use cases, like large scale global optimization, it also leads to better results.

The lazy COHDAgo approach has shown good and sometimes superior performance, especially regarding solution quality. In future work, it may also be promising to further scrutinize the impact of the communication topology as a design parameter.

REFERENCES

Aggarwal, S. and Goswami, P. (2014). Implementation of

dejong function by various selection method and ana-

lyze their performance. IJRCCT, 3(6).

Ahrari, A. and Shariat-Panahi, M. (2015). An improved

evolution strategy with adaptive population size. Op-

timization, 64(12):2567–2586.

Anders, G., Hinrichs, C., Siefert, F., Behrmann, P., Reif,

W., and Sonnenschein, M. (2012). On the Inﬂuence

of Inter-Agent Variation on Multi-Agent Algorithms

Solving a Dynamic Task Allocation Problem under

Uncertainty. In Sixth IEEE International Conference

on Self-Adaptive and Self-Organizing Systems (SASO

2012), pages 29–38, Lyon, France. IEEE Computer

Society. (Best Paper Award).

Bäck, T., Fogel, D. B., and Michalewicz, Z., editors (1997). Handbook of Evolutionary Computation. IOP Publishing Ltd., Bristol, UK, 1st edition.

Bremer, J. and Lehnhoff, S. (2016). A decentralized

PSO with decoder for scheduling distributed electric-

ity generation. In Squillero, G. and Burelli, P., editors,

Applications of Evolutionary Computation: 19th Eu-

ropean Conference EvoApplications (1), volume 9597

of Lecture Notes in Computer Science, pages 427–

442, Porto, Portugal. Springer.

Bremer, J. and Lehnhoff, S. (2017a). An agent-based ap-

proach to decentralized global optimization: Adapt-

ing cohda to coordinate descent. In van den Herik,

J., Rocha, A., and Filipe, J., editors, ICAART 2017

- Proceedings of the 9th International Conference on

Agents and Artiﬁcial Intelligence, volume 1, pages

129–136, Porto, Portugal. SciTePress, Science and

Technology Publications, Lda.

Bremer, J. and Lehnhoff, S. (2017b). Decentralized coali-

tion formation with agent-based combinatorial heuris-

tics. ADCAIJ: Advances in Distributed Computing

and Artiﬁcial Intelligence Journal, 6(3).

Bremer, J. and Lehnhoff, S. (2017c). Hybrid Multi-

ensemble Scheduling, pages 342–358. Springer Inter-

national Publishing, Cham.

Bremer, J. and Sonnenschein, M. (2014). Parallel tempering

for constrained many criteria optimization in dynamic

virtual power plants. In Computational Intelligence

Applications in Smart Grid (CIASG), 2014 IEEE Sym-

posium on, pages 1–8.

Brent, R. (1973). Algorithms for Minimization Without

Derivatives. Dover Books on Mathematics. Dover

Publications.

Brent, R. P. (1971). An algorithm with guaranteed conver-

gence for ﬁnding a zero of a function. Comput. J.,

14(4):422–425.

Colorni, A., Dorigo, M., Maniezzo, V., et al. (1991). Dis-

tributed optimization by ant colonies. In Proceedings

of the ﬁrst European conference on artiﬁcial life, vol-

ume 142, pages 134–142. Paris, France.

Dekker, T. (1969). Finding a zero by means of successive

linear interpolation. Constructive aspects of the fun-

damental theorem of algebra, pages 37–51.

Dorigo, M. and Stützle, T. (2004). Ant Colony Optimization. Bradford Company, Scituate, MA, USA.

Gano, S. E., Kim, H., and Brown II, D. E. (2006). Compar-

ison of three surrogate modeling techniques: Datas-

cape, kriging, and second order regression. In Pro-

ceedings of the 11th AIAA/ISSMO Multidisciplinary

Analysis and Optimization Conference, AIAA-2006-

7048, Portsmouth, Virginia.

Hansen, E. (1980). Global optimization using interval anal-

ysis – the multi-dimensional case. Numer. Math.,

34(3):247–270.

Hansen, N. (2006). The CMA evolution strategy: a compar-

ing review. In Lozano, J., Larranaga, P., Inza, I., and

Bengoetxea, E., editors, Towards a new evolutionary

computation. Advances on estimation of distribution

algorithms, pages 75–102. Springer.

Hansen, N. (2011). The CMA Evolution Strategy: A Tuto-

rial. Technical report.

Hansen, N. and Ostermeier, A. (2001). Completely deran-

domized self-adaptation in evolution strategies. Evol.

Comput., 9(2):159–195.

Hansen, P., Jaumard, B., and Lu, S.-H. (1992). Global op-

timization of univariate lipschitz functions ii: New al-

gorithms and computational comparison. Math. Pro-

gram., 55(3):273–292.

Hinrichs, C. and Sonnenschein, M. (2014). The Effects

of Variation on Solving a Combinatorial Optimiza-

tion Problem in Collaborative Multi-Agent Systems.

In Müller, J. P., Weyrich, M., and Bazzan, A. L., editors, Multiagent System Technologies, volume 8732 of

Lecture Notes in Computer Science, pages 170–187.

Springer International Publishing.

Hinrichs, C., Sonnenschein, M., and Lehnhoff, S. (2013).

Evaluation of a Self-Organizing Heuristic for Inter-

dependent Distributed Search Spaces. In Filipe, J.

and Fred, A. L. N., editors, International Conference

on Agents and Artificial Intelligence (ICAART 2013),

volume 1 – Agents, pages 25–34. SciTePress.

Hinrichs, C., Vogel, U., and Sonnenschein, M. (2011). Ap-

proaching decentralized demand side management via

self-organizing agents. Workshop.

Horst, R. and Pardalos, P. M., editors (1995). Handbook of

Global Optimization. Kluwer Academic Publishers,

Dordrecht, Netherlands.

Jamil, M. and Yang, X. (2013). A literature survey of

benchmark functions for global optimization prob-

lems. CoRR, abs/1308.4008.

Kennedy, J. and Eberhart, R. (1995). Particle swarm op-

timization. In Neural Networks, 1995. Proceedings.,

IEEE International Conference on, volume 4, pages

1942–1948. IEEE.

Kramer, O. (2010). A review of constraint-handling tech-

niques for evolution strategies. Appl. Comp. Intell.

Soft Comput., 2010:1–19.

Leung, Y., Gao, Y., and Xu, Z.-B. (1997). Degree of pop-

ulation diversity - a perspective on premature con-

vergence in genetic algorithms and its markov chain

analysis. IEEE Transactions on Neural Networks,

8(5):1165–1176.

Li, X., Tang, K., Omidvar, M. N., Yang, Z., and Qin, K.

(2013). Benchmark functions for the CEC 2013 special session and competition on large-scale global optimization. Technical report.

Li, Y., Mascagni, M., and Gorin, A. (2009). A decentralized

parallel implementation for parallel tempering algo-

rithm. Parallel Computing, 35(5):269–283.

Locatelli, M. (2003). A note on the griewank test function.

Journal of Global Optimization, 25(2):169–174.

Loshchilov, I., Schoenauer, M., and Sebag, M. (2012). Self-

adaptive surrogate-assisted covariance matrix adapta-

tion evolution strategy. CoRR, abs/1204.2356.

Nieße, A. (2015). Verteilte kontinuierliche Einsatzplanung

in Dynamischen Virtuellen Kraftwerken. PhD thesis.

Nieße, A., Beer, S., Bremer, J., Hinrichs, C., Lünsdorf, O.,

and Sonnenschein, M. (2014). Conjoint Dynamic Ag-

gregation and Scheduling Methods for Dynamic Vir-

tual Power Plants. In Ganzha, M., Maciaszek, L. A.,

and Paprzycki, M., editors, Proceedings of the 2014

Federated Conference on Computer Science and In-

formation Systems, volume 2 of Annals of Computer

Science and Information Systems, pages 1505–1514.

IEEE.

Nieße, A., Bremer, J., Hinrichs, C., and Sonnenschein, M.

(2016). Local Soft Constraints in Distributed En-

ergy Scheduling. In Proceedings of the 2016 Feder-

ated Conference on Computer Science and Informa-

tion Systems (FEDCSIS), pages 1517–1525. IEEE.

Ortega, J. M. and Rheinboldt, W. C. (1970). Iterative solu-

tion of nonlinear equations in several variables.

Rahnamayan, S., Tizhoosh, H. R., and Salama, M. M.

(2007). A novel population initialization method for

accelerating evolutionary algorithms. Computers &

Mathematics with Applications, 53(10):1605–1614.

Rechenberg, I. (1965). Cybernetic solution path of an exper-

imental problem. Technical report, Royal Air Force

Establishment.

Rigling, B. D. and Moore, F. W. (1999). Exploitation of

sub-populations in evolution strategies for improved

numerical optimization. Ann Arbor, 1001:48105.

Rosenbrock, H. H. (1960). An Automatic Method for Find-

ing the Greatest or Least Value of a Function. The

Computer Journal, 3(3):175–184.

Rudolph, G. (2001). Self-adaptive mutations may lead to

premature convergence. IEEE Transactions on Evolu-

tionary Computation, 5(4):410–414.

Shang, Y.-W. and Qiu, Y.-H. (2006). A note on the extended

rosenbrock function. Evol. Comput., 14(1):119–126.

Simon, D. (2013). Evolutionary Optimization Algorithms.

Wiley.

Sobieszczanski-Sobieski, J. and Haftka, R. T. (1997). Mul-

tidisciplinary aerospace design optimization: survey

of recent developments. Structural optimization,

14(1):1–23.

Sonnenschein, M., Lünsdorf, O., Bremer, J., and Tröschel,

M. (2014). Decentralized control of units in smart

grids for the support of renewable energy supply. Environmental Impact Assessment Review. In press.

Storn, R. and Price, K. (1997). Differential evolution – a

simple and efficient heuristic for global optimization

over continuous spaces. Journal of Global Optimiza-

tion, 11(4):341–359.

Talbi, E. (2009). Metaheuristics: From Design to Imple-

mentation. Wiley Series on Parallel and Distributed

Computing. Wiley.

Trelea, I. C. (2003). The particle swarm optimization algo-

rithm: convergence analysis and parameter selection.

Information Processing Letters, 85(6):317–325.

Tuy, H., Thieu, T., and Thai, N. (1985). A conical algo-

rithm for globally minimizing a concave function over

a closed convex set. Math. Oper. Res., 10(3):498–514.

Ulmer, H., Streichert, F., and Zell, A. (2003). Evolution strategies assisted by Gaussian processes with improved pre-selection criterion. In IEEE Congress on Evolutionary Computation, CEC 2003, pages 692–699.

Vanneschi, L., Codecasa, D., and Mauri, G. (2011). A

comparative study of four parallel and distributed pso

methods. New Generation Computing, 29(2):129–

161.

Watts, D. and Strogatz, S. (1998). Collective dynamics of

‘small-world’ networks. Nature, 393:440–442.

Weise, T., Chiong, R., and Tang, K. (2012). Evolution-

ary optimization: Pitfalls and booby traps. Journal

of Computer Science and Technology, 27(5):907–936.

Wright, S. J. (2015). Coordinate descent algorithms. Math-

ematical Programming, 151(1):3–34.
