DEALING WITH COALITION FORMATION IN THE ROBOCUP RESCUE
An Heuristic Approach
Daniel Epstein and Ana L. C. Bazzan
PPGC, UFRGS, Caixa Postal 15064, CEP 91501-970, Porto Alegre, RS, Brazil
Keywords:
RoboCup Rescue, Coalitions, Multiagent systems.
Abstract:
Finding an optimal coalition structure to divide agents into groups is equivalent to the set-partitioning problem.
Several algorithms have been proposed. However, even to find a sub-optimal solution they have to search within
an exponential number of coalition structures. Therefore, in this paper, we use one of the proposed algorithms,
which is anytime and hence suits environments such as the RoboCup Rescue, where a response is needed in
a short time frame. Moreover, we propose to combine this algorithm with heuristics that reduce and constrain
the number of agents and tasks that are allowed to participate in a coalition. We discuss the
application of such an approach in a complex task allocation scenario, the RoboCup Rescue.
1 INTRODUCTION
In multiagent encounters, a coalition can be defined
as a group of agents which decide to cooperate in or-
der to achieve a common goal. A coalitional game
(CG) is a model of interacting decision-makers that
focuses on the behavior of groups of individuals. An
outcome of a CG is a partition of the set of all players
into coalitions, together with an action for each coalition.
Unfortunately, finding the best such partition is
equivalent to the set-partitioning problem.
Generating the coalition structure and finding the
optimal one is a hard problem. The first algorithm to
establish a bound within a minimal amount of search
was given in (Sandholm et al., 1999). However, despite
the fact that the bound can be established in linear
time in the size of the input, the bad news remains
that $2^{a-1}$ (where $a$ is the number of agents) nodes of
the coalition structure graph have to be searched in
order to guarantee the worst case bound from optimum.
This prevents the use of that kind of algorithm
in multi-agent systems with a high number of agents.
Sandholm and colleagues themselves showed that it
is possible to lower the bound with further and/or
smarter search (Sandholm et al., 1999). For example,
Rahwan and colleagues (Rahwan et al., 2007) have
proposed a near-optimal anytime algorithm for coali-
tion structure generation, which partitions the space
in terms of coalitions of particular sizes.
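As a rough illustration of what grouping the search space by coalition sizes means, the sketch below enumerates the integer partitions of the number of agents a. Each partition (e.g. [2, 1, 1] for a = 4) stands for the sub-space of all coalition structures whose coalitions have exactly those sizes. The function name is hypothetical, and this is only the size-based grouping, not the algorithm of (Rahwan et al., 2007) itself.

```python
def size_configurations(a, largest=None):
    """Yield the integer partitions of a as non-increasing lists of coalition sizes."""
    if largest is None:
        largest = a
    if a == 0:
        yield []
        return
    # choose the size of the largest remaining coalition, then recurse
    for first in range(min(a, largest), 0, -1):
        for rest in size_configurations(a - first, first):
            yield [first] + rest


configs = list(size_configurations(4))
```

For four agents this gives five size configurations, far fewer than the number of coalition structures they contain, which is what makes the sub-spaces a convenient unit for bounding and pruning.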
Here, we approach the problem by taking advantage
of the fact that particular CGs have coalitions and
components that are meaningless in a given scenario.
By discarding them we expect to decrease the amount of search.
2 BACKGROUND
Due to space limitations we cannot fully describe the
RoboCup Rescue simulator. Interested readers are
referred to (Kitano et al., 1999; Skinner and Barley,
2006) for more details.
Currently the simulator tries to reproduce condi-
tions that arise after the occurrence of an earthquake
in an urban area, such as the collapsing of buildings,
road blockages, fire spreading, buried and/or injured
civilians. In the RoboCup Rescue simulator the main
agents are fire brigades, police forces, and ambulance
teams. These have limited perception of their surroundings;
they can communicate, but are limited in the
number and size of messages they can exchange.
Information available to the agents contains attributes
related to buildings, civilians, and blockages,
for instance. Regarding the civilians, these attributes
are the location, health degree (called HP), damage
(how much the HP of those civilians dwindles every
time step), and buriedness (a measure of how difficult
it is to rescue those particular civilians). The attributes
related to buildings are location, whether or not it is
burning, how damaged it is, and the type of material
used in the construction. Finally, the attributes of
blockages are location in the street, cost to unblock,
and their size.
Epstein D. and L. C. Bazzan A..
DEALING WITH COALITION FORMATION IN THE ROBOCUP RESCUE - An Heuristic Approach.
DOI: 10.5220/0003294607170720
In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence (ICAART-2011), pages 717-720
ISBN: 978-989-8425-40-9
Copyright © 2011 SCITEPRESS (Science and Technology Publications, Lda.)
To measure the performance of the agents, the rescue
simulator defines a score which is computed, at
the end of the simulation, as shown in Equation 1,
where $P$ is the number of civilians alive, $H$ is a measure
of their health condition, $H_{init}$ is a measure of the
health condition of all civilians at the beginning of the
simulation, $B$ is the building area left undamaged, and
$B_{max}$ is the initial building area.

$$\mathrm{Score} = \left(P + \frac{H}{H_{init}}\right) \times \sqrt{\frac{B}{B_{max}}} \qquad (1)$$
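Reading Equation 1 as (P + H / H_init) × sqrt(B / B_max), the score can be computed as in the short sketch below. The function and argument names are hypothetical and the numbers in the example are made up purely to show the arithmetic.

```python
import math


def rescue_score(alive, health, health_init, area, area_max):
    """Score = (P + H / H_init) * sqrt(B / B_max), following Equation 1."""
    if health_init <= 0 or area_max <= 0:
        raise ValueError("initial health and initial building area must be positive")
    return (alive + health / health_init) * math.sqrt(area / area_max)


# 50 civilians alive with half of the initial health, 64% of the area undamaged
example = rescue_score(alive=50, health=300_000, health_init=600_000,
                       area=640, area_max=1000)
```

The square root makes the score less than proportional to the surviving building area, while every civilian kept alive adds a full point, which is why the heuristics weigh rescue and fire-fighting tasks differently.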
Regarding related work, the reader is referred to
(Ferreira et al., 2010), which provides an extensive review.
Here we only mention the works against which we
compare our results, as well as previous work on
generating coalition structures. In the particular scenario
of the RoboCup Rescue, agents may form coalitions
to solve tasks posed by the environment. We
start with the latter.
The generation of all possible coalition structures
is exponential in the number of agents. This is an im-
portant issue because it was demonstrated that finding
the optimal coalition structure is NP-complete (Sand-
holm et al., 1999).
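To give a feeling for this growth, the number of coalition structures over a agents is the a-th Bell number, i.e. the number of ways to partition a set of a elements. The sketch below computes it with the Bell triangle; the function name is hypothetical.

```python
def count_coalition_structures(a):
    """Number of ways to partition a agents into non-empty coalitions (Bell number)."""
    row = [1]  # first row of the Bell triangle
    for _ in range(a):
        nxt = [row[-1]]  # each row starts with the last entry of the previous row
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return row[0]


counts = {a: count_coalition_structures(a) for a in (6, 8, 10)}
```

Even ten agents already admit more than a hundred thousand coalition structures, which is why removing agents and tasks before the search pays off.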
Another form of task allocation in the RoboCup
Rescue, i.e. a non-coalition-based one, is by means of
task assignment. The generalized assignment problem
(GAP) is one possible model used to formalize a
task allocation problem. It deals with the assignment
of tasks to agents, constrained by the agents' available
resources, and aims at maximizing the total reward.
The GAP model was extended by Scerri et al.
(Scerri et al., 2005) to incorporate two features: scenario
dynamics and inter-task constraints. This extended
model was called extended generalized assignment
problem (E-GAP).
An E-GAP can be converted to a distributed con-
straint optimization problem (DCOP) taking agents as
variables and tasks as the domain of values. However,
this modeling leads to dense constraint graphs. In or-
der to deal with these issues, Scerri et al. present an
approximate algorithm called Low-communication
Approximate DCOP (LA-DCOP) (Scerri et al., 2005)
that uses a token-based protocol; agents perceive a
task in the environment and create a token to repre-
sent it, or they receive a token from another agent. An
agent decides whether or not to perform a task based
both on its capability and on a threshold.
If an agent in LA-DCOP is able to perform more
than one task, then it must select those that maximize
its capability given its resources. This maximization
problem can be reduced to a binary knapsack prob-
lem (BKP), which suggests that the complexity of
LA-DCOP depends on the complexity of the function
implemented to deal with the BKP.
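A minimal sketch of such a function, assuming integer resource costs, is the textbook dynamic program for the binary knapsack problem shown below. The task names, costs, and capabilities are hypothetical, and this is not claimed to be the routine used in LA-DCOP.

```python
def select_tasks(tasks, capacity):
    """0/1 knapsack by dynamic programming.

    tasks: list of (name, resource_cost, capability) with integer costs.
    Returns (best total capability, tuple of chosen task names).
    """
    # best[c] = (total capability, chosen names) using at most c resource units
    best = [(0.0, ())] * (capacity + 1)
    for name, cost, capability in tasks:
        # iterate capacities downwards so that each task is used at most once
        for c in range(capacity, cost - 1, -1):
            candidate = best[c - cost][0] + capability
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - cost][1] + (name,))
    return best[capacity]


total, chosen = select_tasks(
    [("fire_a", 3, 0.9), ("fire_b", 2, 0.6), ("fire_c", 2, 0.5)], capacity=4
)
```

The running time is proportional to the number of tasks times the resource capacity, so the cost of each agent's decision grows with both.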
Swarm-GAP (Ferreira et al., 2010) resembles LA-DCOP
in the sense that it is also GAP-based, is approximate,
uses tokens for communication, and deals
with extreme teams. An agent in Swarm-GAP decides
whether or not to perform a task based on the model
of division of labor used by social insect colonies,
which has low communication and computation effort.
This avoids the complexity of the maximization
function used in LA-DCOP. A key parameter in the
approach is the stimulus s agents have towards tasks,
as it is responsible for the selectiveness of the agent
regarding perceived tasks.
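The role of the stimulus can be sketched with the response-threshold form commonly used to model division of labor in social insects, T = s² / (s² + θ²). Treating this as the decision rule is an assumption made here for illustration; the threshold θ and both function names are hypothetical, and the details of Swarm-GAP are in (Ferreira et al., 2010).

```python
import random


def tendency(stimulus, threshold):
    """Response-threshold tendency T = s^2 / (s^2 + theta^2), in [0, 1]."""
    if stimulus == 0 and threshold == 0:
        return 0.0
    return stimulus ** 2 / (stimulus ** 2 + threshold ** 2)


def accepts_task(stimulus, threshold, rng=random):
    """Probabilistic decision: accept the task with probability equal to the tendency."""
    return rng.random() < tendency(stimulus, threshold)
```

With a fixed stimulus, a lower threshold (an agent better suited to the task) yields a higher tendency, so the decision needs only local information and a single random draw.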
In order to deal with inter-task constraints, agents
in Swarm-GAP just increase the tendency to perform
related tasks by a factor called execution coefficient
(details in (Ferreira et al., 2010)).
3 HEURISTICS FOR COALITION
FORMATION IN THE RESCUE
In the RoboCup Rescue the number of coalition structures
is affected not only by the number of agents but
also by the number of tasks. Low-priority tasks are
not considered in the set of coalition structures. From
the point of view of the agent, some tasks may also be
discarded because the agent either has no resources to
perform them, or is located far away from them. This way,
agents that cannot be allocated to important tasks are
removed from the set of agents to form coalitions, further
reducing the number of possible coalition structures.
Once we have a reduced number of agents and
tasks, the search for the actual optimal coalition structure
is performed by the anytime algorithm proposed
in (Rahwan et al., 2007). In what follows we analyze
the characteristics of tasks and agents as well as how
they influence the quality of the simulation.
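The final search step can be pictured with an exhaustive stand-in: once the heuristics have left only a handful of agents, every coalition structure can be enumerated and scored. The sketch below is a brute-force illustration, not the anytime algorithm of (Rahwan et al., 2007); the coalition value function is a hypothetical placeholder.

```python
def partitions(agents):
    """Yield every partition of the list of agents into non-empty coalitions."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for smaller in partitions(rest):
        # put `first` into each existing coalition in turn ...
        for i, coalition in enumerate(smaller):
            yield smaller[:i] + [[first] + coalition] + smaller[i + 1:]
        # ... or let it form a coalition of its own
        yield [[first]] + smaller


def best_structure(agents, value):
    """Return the coalition structure with the largest sum of coalition values."""
    return max(partitions(agents), key=lambda cs: sum(value(c) for c in cs))


# Hypothetical value: pairs are worth more than two singletons, triples add little
value_by_size = {1: 1.0, 2: 3.0, 3: 3.5}
best = best_structure(["a", "b", "c"], lambda c: value_by_size[len(c)])
```

Because the number of structures explodes with the number of agents, this exhaustive search is only viable after pruning, which is the motivation for the heuristics that follow.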
In order to rank tasks that are related to fire spots,
one must examine how the final score (Eq. 1) is af-
fected by such tasks. Buildings that have larger areas
are more important than those with small ones. Fur-
ther, the type of building (regarding construction ma-
terial) is an important factor. Finally, the area of the
neighboring buildings must equally be considered in
order to try to prevent the propagation of fire.
To appropriately rank tasks related to rescuing
civilians, one must consider their HP, their location,
and how difficult the rescue is. The heuristics that are
considered here are as follows. Civilians with higher
priority are those that have a high HP but that can still
be saved, i.e. there is enough time to arrive at their
location and rescue them (considering how buried they are).
Coalitions are important here because the presence of more
than one ambulance is key to rescuing a civilian faster
(one civilian can be saved in 4 time units by one ambulance
or in 2 time units by 2 ambulances), up to a
limit. As mentioned, an ambulance must be able to
arrive at the location of a civilian within a time frame
that is compatible with its rescue. Ambulances that do
not fulfill such constraints must be excluded from the
corresponding coalitions, thus reducing the number of
coalition structures.
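The speed-up from teaming ambulances can be sketched as follows, assuming the work is shared equally among the ambulances present. The cap max_useful stands for the limit mentioned above; its default value and the function name are hypothetical.

```python
import math


def rescue_time(single_ambulance_time, ambulances, max_useful=2):
    """Time steps needed to rescue one civilian with a team of ambulances.

    The work is assumed to be shared equally, and ambulances beyond
    max_useful (a hypothetical cap) bring no further speed-up.
    """
    if ambulances < 1:
        raise ValueError("at least one ambulance is needed")
    effective = min(ambulances, max_useful)
    return math.ceil(single_ambulance_time / effective)
```

With the figures quoted in the text, rescue_time(4, 1) is 4 and rescue_time(4, 2) is 2, so a coalition of two ambulances can reach civilians that a single ambulance would not save in time.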
Contrary to the two other kinds of tasks, removing
blockages does not have a direct impact on the
score. However, blockages hinder the traffic of the
ambulances and firefighters, and thus have a great
indirect impact. The attributes of this kind of task are
the road type (arterial, secondary, and so on), the number of
blocked lanes, and how difficult the removal is.
Regarding the first two, priority is given to arterials
and/or roads with many blocked lanes. Roads that are
only partially blocked have low priority because traffic
is still possible there. Regarding the last attribute,
priority is given to roads with fewer blockages so that
the number of roads where traffic is at least partially
possible is maximized.
The main heuristic for deciding whether to consider a
police force in a coalition is whether it is located very
close to the blockage. If it is not within three time
units of the location of a task, a police force is not
considered for this task. This number was selected
experimentally.
For ambulances one has to consider the trip time
between their current location and the location of a civilian.
Civilians that are not reachable in a time frame
that is compatible with the time to unbury and rescue
them are not considered. This time is called here
c and is the sum of the time to arrive at the civilian
location plus the remaining life time of the civilian,
computed based on its HP and level of buriedness.
Teams of ambulances are not taken into account when
deciding whether to include an ambulance in a coalition, but of
course the value of coalitions with different numbers
of ambulances is different, as mentioned before.
Fire fighters also have distance as an important attribute,
i.e. the agent must be able to reach the task
location in time to be valuable. However, another attribute
is equally important: the level of water each has. Only
tasks for which the agent has a good water-to-distance
ratio will be considered. Therefore agents with a low
level of water must consider only nearby tasks. Since
on average a building takes 5 time steps to lose half of
its value, no agent that is more than 5 time steps away
from the building will be considered.
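The distance filters described above can be sketched as a pruning pass over agent-task pairs. The limits of 3 and 5 time steps come from the text; the data layout, the per-civilian deadline used for ambulances, and all names are hypothetical, and the water-to-distance ratio of fire fighters is left out for brevity.

```python
# Travel-time limits quoted in the text; ambulances use a per-civilian deadline instead.
MAX_TRAVEL = {"police": 3, "fire": 5}


def is_candidate(kind, travel, deadline=None):
    """Return True if an agent of this kind may be considered for the task."""
    if kind == "ambulance":
        # deadline: time steps left to reach the civilian and still rescue it
        return deadline is not None and travel <= deadline
    return travel <= MAX_TRAVEL[kind]


def prune(agents, tasks, travel_time):
    """agents: {name: kind}; tasks: {name: (kind, deadline or None)};
    travel_time: {(agent, task): time steps}. Returns the surviving pairs
    and the set of agents that still have at least one candidate task."""
    pairs = [
        (agent, task)
        for agent, a_kind in agents.items()
        for task, (t_kind, deadline) in tasks.items()
        if a_kind == t_kind
        and is_candidate(a_kind, travel_time[(agent, task)], deadline)
    ]
    return pairs, {agent for agent, _ in pairs}


agents = {"p1": "police", "f1": "fire", "f2": "fire", "a1": "ambulance"}
tasks = {"block": ("police", None), "blaze": ("fire", None), "civ": ("ambulance", 6)}
travel_time = {("p1", "block"): 4, ("f1", "blaze"): 2,
               ("f2", "blaze"): 7, ("a1", "civ"): 5}
pairs, active = prune(agents, tasks, travel_time)
```

In this toy instance the police force and one fire brigade are too far away, so only two of the four agents remain for coalition formation.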
We are now in a position to formalize the heuristics
related to the agents regarding the tasks.
Heuristic for Firefighters
Let $A_j$ be the area of building $j$, $n_j$ the number of
buildings adjacent to $j$, $A_j^k$ the area of the $k$-th building
around $j$, and $F_j$ the influence of $A_j$ in the score
(Eq. 1), i.e. how destroyed building $j$ is. The value
of a task (building) $j$ (irrespective of any agent $i$) is
given by

$$V_j = A_j \times F_j + \frac{\sum_{k=1}^{n_j} \left(A_j^k \times F_j^k\right)}{A_j}.$$

If we also include $D_{ij}$, the distance between $i$ and $j$, in the value
of task $j$, now for each agent $i$, this equation becomes

$$V_j^i = \left(A_j \times F_j + \frac{\sum_{k=1}^{n_j} \left(A_j^k \times F_j^k\right)}{A_j}\right) \times \frac{5}{2 \times D_{ij}}.$$
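A small sketch of the firefighter heuristic follows, assuming the reading in which only the sum over neighboring buildings is divided by the area of building j, and the whole value is then scaled by 5 / (2 × D_ij). Function names and the sample numbers are hypothetical.

```python
def building_value(area, influence, neighbours):
    """V_j = A_j * F_j + sum_k(A_j^k * F_j^k) / A_j.

    neighbours: list of (area, influence) pairs for the adjacent buildings.
    """
    spread = sum(a_k * f_k for a_k, f_k in neighbours)
    return area * influence + spread / area


def firefighter_task_value(area, influence, neighbours, distance):
    """V_j^i = V_j * 5 / (2 * D_ij); distance is in time steps and must be positive."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return building_value(area, influence, neighbours) * 5.0 / (2.0 * distance)


value = firefighter_task_value(100.0, 0.5, [(50.0, 0.2), (150.0, 0.4)], distance=5)
```

The factor 5 / (2 × D_ij) halves the value of a building at the 5-step limit and increases it for closer agents, so nearby brigades are favored for the same fire.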
Heuristic for Police Forces
Similarly to firefighters, here we quantify the importance
of tasks, this time for police forces. As mentioned,
the main idea is to free the highest possible
number of roads, even if partially. Relevant attributes
here are: the number of lanes in a road, $P^t$; the number
of free lanes, $P^l$; and the cost to unblock the lane,
$C^b$. Thus, the value $V_j^i$ of a task (removal of blockage $j$),
already considering the distance $D_{ij}$ to agent $i$, is:

$$V_j^i = \left(\frac{P_j^t - 2 \times P_j^l}{C_j^b} + \frac{1}{P_j^t}\right) \times \frac{3}{2 \times D_{ij}}.$$
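The police heuristic can be sketched in the same way, assuming the value is read as ((P^t − 2 × P^l) / C^b + 1 / P^t) × 3 / (2 × D_ij). The function name and the sample roads are hypothetical.

```python
def police_task_value(total_lanes, free_lanes, unblock_cost, distance):
    """Value of clearing a blockage for one police force.

    total_lanes: P^t, free_lanes: P^l, unblock_cost: C^b, distance: D_ij (time steps).
    """
    if total_lanes <= 0 or unblock_cost <= 0 or distance <= 0:
        raise ValueError("lanes, cost and distance must be positive")
    base = (total_lanes - 2 * free_lanes) / unblock_cost + 1.0 / total_lanes
    return base * 3.0 / (2.0 * distance)


fully_blocked = police_task_value(total_lanes=4, free_lanes=0, unblock_cost=2, distance=1.5)
half_blocked = police_task_value(total_lanes=4, free_lanes=2, unblock_cost=2, distance=1.5)
```

Under this reading a fully blocked four-lane road is worth far more than the same road with two free lanes, which matches the stated preference for roads where no traffic is possible.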
Heuristic for Ambulances
To compute the value of rescue tasks, we must compute
the life expectancy of each civilian, $E_j$. Let $HP_j$
be the hit points of a civilian $j$ (roughly a measure of
how alive it is), and $B_j$ a measure of how buried $j$ is.
Given that $B_j$ is reduced by 200 at each time step, we
can compute $E_j$:

$$E_j = HP_j - B_j \times 200.$$

The value of $j$ for agent $i$, $E_j^i$, is:

$$E_j^i = HP_j - B_j \times 200 - D_{ij} \times 200.$$
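A sketch of the ambulance heuristic, assuming the two expressions are read as E_j = HP_j − B_j × 200 and E_j^i = HP_j − B_j × 200 − D_ij × 200, is given below. The constant name, function names, and sample values are hypothetical.

```python
STEP_COST = 200  # constant appearing in the expressions for E_j and E_j^i


def life_expectancy(hp, buriedness):
    """E_j = HP_j - B_j * 200."""
    return hp - buriedness * STEP_COST


def ambulance_task_value(hp, buriedness, distance):
    """E_j^i = HP_j - B_j * 200 - D_ij * 200, with distance in time steps."""
    return life_expectancy(hp, buriedness) - distance * STEP_COST


value = ambulance_task_value(hp=8000, buriedness=10, distance=5)
```

A non-positive value indicates, under this reading, that the civilian cannot be reached and unburied in time by that ambulance, so the pair can be dropped before coalitions are formed.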
4 EXPERIMENTS AND RESULTS
We have run our experiments in two maps that are
widely used by the RoboCup Rescue community,
namely Kobe and Kobe4. Due to space limitations
we restrict the discussion of results to the first
map. We have used version 0.49.9 of the simulator.
We remark that version 0.50 has several bugs.
Also, there is a completely new version of the simulator,
which we have not used so far.
Table 1: Scores for non-coalition-based approaches.
LA-DCOP         Swarm-GAP       Greedy
49.69 ± 6.31    44.97 ± 1.76    43.78 ± 7.19
Table 2: Scores for the coalition-based approach, for various values of ρ.
ρ = 0.3         ρ = 0.5         ρ = 0.7
70.28 ± 6.94    67.63 ± 9.68    65.34 ± 7.22
In this map there are 6 ambulance teams, 10 fire
brigades, and 8 police forces. There are also 72 civilians,
734 buildings, and 820 roads. The dynamics of
the rescue scenario means that the number and type of
tasks change, which is a problem for the grouping of
the agents and the consequent re-computation of the
possible coalitions. Agents must be re-grouped either
periodically or in an event-based fashion, as also done in (Santos and
Bazzan, 2010). We have tried both approaches but the
former does not perform well because different tasks
have different execution times. Therefore we only
discuss the latter. For the event-based re-computation,
the triggering event is reaching a certain rate ρ of
ungrouped agents, i.e. agents that have finished performing
their previously assigned tasks and that are selecting
tasks in a greedy way or not at all. Tested
values were ρ ∈ {0.3, 0.5, 0.7}, i.e. new coalitions are
formed when 30%, 50%, or 70% of the agents of a
given type are no longer in coalitions (because their
assigned tasks are over).
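The event-based trigger can be sketched as a simple check on the share of ungrouped agents of one type. The function name and the boolean representation of agents are hypothetical.

```python
def should_regroup(in_coalition, rho):
    """in_coalition: one boolean per agent of a given type
    (True while the agent is still working within a coalition)."""
    if not in_coalition:
        return False
    ungrouped = sum(1 for busy in in_coalition if not busy)
    return ungrouped / len(in_coalition) >= rho


fire_brigades = [True] * 7 + [False] * 3  # 3 of 10 brigades are ungrouped
```

With 3 of 10 fire brigades ungrouped, the check fires for ρ = 0.3 but not for ρ = 0.5, so a smaller ρ leads to more frequent re-computation of the coalitions.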
In order to compare the results with approaches
that are not based on coalition formation,
we use LA-DCOP, Swarm-GAP, and a greedy strategy.
The latter is equivalent to the so-called sample
agents. We have performed 20 repetitions of each
simulation. For LA-DCOP, the threshold used was
T = 0.2, while Swarm-GAP was tested with stimulus
s = 0.1. These values were selected after calibration
in (Ferreira et al., 2010). Results appear in Table 1,
where we give the scores (Eq. 1) at the end of the
simulation. The same setting was then used to test
the coalition-based approach, for various values of ρ.
Results appear in Table 2.
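The use of coalitions represents an increase in performance:
the coalition-based approach scores between 65.34 and 70.28
(Table 2), whereas the best non-coalition-based approach,
LA-DCOP, scores 49.69 (Table 1). The best score
is achieved when the re-grouping of agents (i.e. the
re-evaluation of the coalition formation) is done as soon as
30% of the agents have finished their tasks (ρ = 0.3),
although the intervals reported for the three values of ρ overlap.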
REFERENCES
Ferreira, Jr., P. R., dos Santos, F., Bazzan, A. L. C., Epstein,
D., and Waskow, S. J. (2010). Robocup rescue as
multiagent task allocation among teams: experiments
with task interdependencies. Journal of Autonomous
Agents and Multiagent Systems, 20(3):421–443.
Kitano, H., Tadokoro, S., Noda, I., Matsubara, H., Takahashi,
T., Shinjou, A., and Shimada, S. (1999).
Robocup rescue: search and rescue in large-scale disasters
as a domain for autonomous agents research.
In Proceedings of the IEEE International Conference
on Systems, Man, and Cybernetics (SMC), volume 6,
pages 739–743, Tokyo, Japan. IEEE.
Rahwan, T., Ramchurn, S. D., Dang, V. D., and Jennings,
N. R. (2007). Near-optimal anytime coalition structure
generation. In Proc. of the Int. Joint Conf. on Art.
Intelligence (IJCAI 07), pages 2365–2371. Available
at http://ijcai.org/proceedings07.php.
Sandholm, T., Larson, K., Andersson, M., Shehory, O., and
Tohmé, F. (1999). Coalition structure generation with
worst case guarantees. Artificial Intelligence,
111(1–2):209–238.
Santos, D. S. d. and Bazzan, A. L. C. (2010). Distributed
clustering for group formation and task allocation.
In van der Hoek, W., Kaminka, G. A., Lespérance,
Y., Luck, M., and Sen, S., editors, Proceedings of
the Ninth International Conference on Autonomous
Agents and Multi-Agent Systems, pages 1429–1430,
Toronto. IFAAMAS: International Foundation for
Autonomous Agents and Multiagent Systems.
Scerri, P., Farinelli, A., Okamoto, S., and Tambe, M. (2005).
Allocating tasks in extreme teams. In Dignum, F.,
Dignum, V., Koenig, S., Kraus, S., Singh, M. P., and
Wooldridge, M., editors, Proc. of the Fourth International
Joint Conference on Autonomous Agents and
Multiagent Systems, pages 727–734, New York, USA.
ACM Press.
Skinner, C. and Barley, M. (2006). Robocup rescue simulation
competition: Status report. In Bredenfeld, A., Jacoff,
A., Noda, I., and Takahashi, Y., editors, RoboCup
2005: Robot Soccer World Cup IX, volume 4020 of
Lecture Notes in Computer Science, pages 632–639.
Springer-Verlag, Berlin.