Effective Area Partitioning in a Multi-Agent Patrolling Domain for Better Efficiency

Katsuya Hattori and Toshiharu Sugawara

Department of Computer Science and Communications Engineering, Waseda University, Shinjuku, Tokyo 1698555, Japan

Keywords:

Multi-Agent System, Patrolling Problem, Division of Labor, Negotiation, Cooperative Agent.

Abstract:

This study proposes a cooperative method for a multi-agent continuous cooperative patrolling problem by par-

titioning the environment into a number of subareas so that the workload is balanced among multiple agents

by allocating subareas to individual agents. Owing to the advancement in robotics and information technology

over the years, robots are being utilized in many applications. As environments are usually vast and compli-

cated, a single robot (agent) cannot supervise the entire work. Thus, cooperative work by multiple agents,

even though complicated, is indispensable. This study focuses on cooperation in a bottom-up manner by fairly

partitioning the environment into subareas, and employing each agent to work on them as its responsibility.

However, as the agents do not monitor the entire environment, the decentralized control may generate unreasonable

shapes of subareas; the areas are often unnecessarily divided into fragmented enclaves, resulting in

inefficiency. Our proposed method reduces the number of small and isolated enclaves through negotiation. Our

experimental results indicated that our method eliminated the minute/unnecessary fragmented enclaves and

improved performance when compared with the results obtained by conventional methods.

1 INTRODUCTION

In recent years, robotic applications have attracted

attention in many ﬁelds due to the development of

advanced hardware, such as high-functional sensors

and actuators related to robot technology and infor-

mation technology. Robots are particularly required

to play an active role in ﬁelds that entail repetitive

tasks or operations in inaccessible areas. However,

if the workspace is vast and/or complicated and re-

quires various abilities, it is not realistic to work only

with a single robot owing to physical and performance

limits, such as battery capacity, movement speed, and

limited work capability. The advancement in mobile

wireless communication technology enables efﬁcient

real-time communication among robots and coordina-

tion and cooperation among multiple robots.

To control the collaborative activities of multiple

agents, which are the abstraction of robots in a gen-

eral framework, we consider the multi-agent contin-

uous cooperative patrolling problem. Possible appli-

cations of this problem are area cleaning and secu-

rity/surveillance patrolling by multiple agents. In this

problem, agents are required to divide the given task


so that the burden on each agent is as fair as possi-

ble; this also improves overall efﬁciency and results

in uniform quality of task outputs. To achieve fair

and effective division of labor, we consider a method

of explicitly partitioning a working responsible area

(RA) to a number of smaller areas and assigning

each agent to the partitioned area through communi-

cation/negotiation between agents. One of the difﬁ-

culties in fair partition of areas is that simple parti-

tion into equal sizes may not be appropriate because,

for example, (1) some partitioned area/room is dis-

tant from the charging/storage locations of agents, (2)

some rooms are more important than others so agents

have to visit them more frequently than others, (3)

some areas contain obstacles/slopes that make patrolling

inefficient, and (4) the shape of a partitioned

area is complicated so it takes a longer time to cover

the area. Nevertheless, agents have to consider these

factors to fairly partition the working area in a decen-

tralized manner.

Several studies have attempted to achieve collab-

orative work by dividing work areas (Kato and Sug-

awara, 2013; Ahmadi and Stone, 2006; Elor and

Bruckstein, 2009; Nasir et al., 2016). One disadvan-

tage of such distributed methods is that there is no

agent that manages the entire area; hence, the shapes

Hattori, K. and Sugawara, T.

Effective Area Partitioning in a Multi-Agent Patrolling Domain for Better Efﬁciency.

DOI: 10.5220/0010241102810288

In Proceedings of the 13th International Conference on Agents and Artiﬁcial Intelligence (ICAART 2021) - Volume 1, pages 281-288

ISBN: 978-989-758-484-8


of the divided areas are left unattended. For exam-

ple, agents proposed in Kato and Sugawara (Kato

and Sugawara, 2013) may generate split (and often

chopped) RAs due to the local decision. This kind of

inefﬁcient division leads to a decrease in overall per-

formance. Conversely, agents often have to generate

disconnected RAs. For example, if rooms have nar-

row entrance doors or are led by narrow passages that

must be allocated to only one agent (an actual exam-

ple is shown later), the splitting of RAs is mandatory.

Therefore, in such an environment, if we introduce

a constraint that each RA must be connected, the di-

vided working areas cannot be balanced.

Therefore, we propose a method for partition-

ing the entire area into a number of RAs for indi-

vidual agents to obtain a balanced workload with-

out unnecessarily splitting the areas. We experimen-

tally demonstrated that our proposed method exhib-

ited better performance than that by the conventional

method (Kato and Sugawara, 2013). We also investigated

the performance when the environment had

many obstacles and when the environment characteristics

were not uniform, and showed that in all such cases,

our method outperformed the conventional method.

2 RELATED WORK

There are a number of studies on multi-agent contin-

uous patrolling problems (Huang et al., 2019). We

can classify them roughly into two based on the meth-

ods for solving them. In the ﬁrst type of method, the

agents share the entire area to move around without

dividing the area into smaller subareas. Carrillo and

Rapp (Carrillo and Rapp, 2020) proposed a method

to identify patrolling policies for multiple agents with

limited visibility regions and non-deterministic pa-

trolling paths. Yoneda et al. (Yoneda et al., 2013)

propose a method in which agents individually de-

termine their exploration algorithms using reinforce-

ment learning to contribute more toward the shared

goal. They also assumed that agents’ intermittent be-

haviors depend on the battery charge and agents de-

cide to explore the environment further or return to

their charging base depending on their battery capac-

ity. Our study also assumes the charging base, and

that agents have to return to the base before run-out.

Sugiyama et al. (Sugiyama et al., 2019), who also

used the model of cyclic charging activities, consider

the cycle of agent patrolling while shifting the time

phase to visit each location. However, because environments

are not partitioned, there is a possibility

that two or more agents patrol the same area, which

is redundant and hence increases unnecessary actions.

In the second type of method for solving continu-

ous patrolling problems, the environment is fairly di-

vided into a number of subareas and each of them is

allocated to an agent to move around in a balanced

manner. In the method proposed by Nasir et al. (Nasir

et al., 2016), the leader agent divides the environment

and determines and allocates the exploring area to in-

dividual member agents. However, this requires cen-

tralized control by the leader agent, and hence, its fail-

ure affects the entire system. Ahmadi and Stone (Ah-

madi and Stone, 2006) introduced area division based

on boundary relationships between agents. If there

were overlapping locations, they were transferred to

the agents which frequently visited those areas. Elor

and Bruckstein (Elor and Bruckstein, 2009) proposed

a model based on balloon expansion so that individual

agents can fairly divide the environment into subareas

of the same size in a bottom-up manner. However,

given the obstacles and non-uniform structures in the

environment, divisions of the same size are not always

fair from the viewpoint of agents’ workload. More-

over, these methods did not consider the constraints

due to battery capacity.

By contrast, Kato and Sugawara (Kato and Sugawara, 2013) introduced a constraint on the battery capacity and proposed a partitioning method for fair workload among agents. Their method for area partitioning is based on the expansion power, as in the current study, which reflects the degree to which the agent has completed the work in its RA. However, because their method generated many fragments of RAs (Kato and Sugawara, 2013), the performance often decreased, or the method could not be applied in a complicated environment. In this study, agents divide their RAs according to the shape of the environment so that the method can be applied even to complex environments. We also reduce the unnecessary fragments of RAs to improve the efficiency of event collection/observation in the multi-agent patrolling problem.

3 BACKGROUND AND PROBLEM

3.1 Environment

We introduce discrete time $t \geq 0$, whose unit is a step. The multi-agent continuous cooperative patrolling problem can be expressed by $(G, A, P)$, where $A = \{1, \dots, N\}$ is a set of agents, $G = (V, E)$ is a connected graph embeddable into two-dimensional Euclidean space, $V = \{v_1, \dots, v_{|V|}\}$ is the set of nodes, $E$ is the set of edges $e_{u,v}$ connecting two nodes $u, v \in V$, and $P = \{p_v\}_{v \in V}$ ($0 \leq p_v \leq 1$) is the distribution of event probability, i.e., a collection of probabilities of event occurrence at $v \in V$. An event at node $v$ must be observed or monitored by one of the agents, and thus, agents need to visit these nodes in proportion to the values of these probabilities. For example, nodes where high security levels are required have higher event probabilities in a security patrolling domain, and nodes with higher probabilities are more likely to get dirty in a vacuum cleaning domain. We assume that $P$ is given to all agents.

We assume that events accumulate if no agent has monitored/observed them. Therefore, for $\forall v \in V$, the amount of accumulated events $L_t(v)$ is incremented by one with probability $p_v$ at $t$, that is,

\[
L_{t+1}(v) =
\begin{cases}
L_t(v) + 1 & \text{with probability } p_v \\
L_t(v) & \text{otherwise}
\end{cases}
\tag{1}
\]

If an agent visits node $v$ at time $t$, it has observed or monitored $v$, and so the amount of accumulated events at $v$ is cleared ($L_{t+1}(v) = 0$). However, agents cannot know the actual values of $L_t(v)$; hence, they estimate $L_t(v)$ by the expected value $E(L_t(v))$ calculated from $p_v \in P$.
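As an illustrative sketch of Eq. (1), the following Python snippet simulates event accumulation on a few nodes and computes an estimate that an agent could use. The closed form for the expected value, $p_v$ multiplied by the number of steps since the last visit, follows from summing independent Bernoulli increments; the text only states that the expected value is used, so this formula and all names (`step_events`, `expected_events`, `last_visit`) are assumptions made for illustration.

```python
import random

def step_events(L, p, visited, rng):
    """Advance Eq. (1) by one step: visited nodes are cleared,
    every other node gains one event with probability p[v]."""
    nxt = {}
    for v, count in L.items():
        if v in visited:
            nxt[v] = 0
        else:
            nxt[v] = count + (1 if rng.random() < p[v] else 0)
    return nxt

def expected_events(p, last_visit, t):
    """Assumed estimate: E(L_t(v)) = p_v * (steps since v was last cleared)."""
    return {v: p[v] * (t - last_visit[v]) for v in p}

# Hypothetical three-node example with no visits for ten steps.
rng = random.Random(0)
p = {"a": 0.0, "b": 1.0, "c": 0.5}
L = {v: 0 for v in p}
for t in range(10):
    L = step_events(L, p, visited=set(), rng=rng)
estimate = expected_events(p, last_visit={v: 0 for v in p}, t=10)
```

The agent never sees `L` itself; only `estimate` would be available to it under the stated assumption.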

We can assume that the length of every edge is 1 by adding dummy nodes whose event probabilities are zero. Therefore, an agent at any node can move to an adjacent node in one step. Agent $i$ has a charging base $v^i_{\mathrm{base}}$ ($\in V$) that is specified initially and to which it has to return before battery run-out. Agent $i$ has its own RA $V^i_t$ at time $t$, which is a subset of $V$ including $v^i_{\mathrm{base}}$, and which it must move around to maintain. For a fair workload, $i$ has to adjust its own RA by negotiation with its collaborative agents.

3.2 Model of Agent

We introduced three assumptions in the model of

agents. First, agents have the map where they patrol.

We believe that in many applications, the map of the

environment is often known; hence, this assumption

is plausible. Furthermore, many map creation algo-

rithms have been proposed so far. As we focus on

the fair workload by appropriate area partitioning, we

assume that maps can be made by using these algo-

rithms if necessary. Second, multiple agents can exist

in the same node and collision is not possible. In re-

ality, a collision may occur when multiple agents try

to move to the same node if it is narrow. However

again, because we focus on fair workload, we assume

that collision avoidance algorithms will be used for

actual applications. Finally, each agent has a battery

with a ﬁnite capacity. Therefore, it cannot continue

patrolling forever, and alternately repeats the explor-

ing state and the charging state.

Agents have two states: the active state and the charging state. Agents in the active state move around the environment (or their own RA) to find unobserved events in accordance with their exploring strategies, whereas agents in the charging state charge at their charging base. The state of the battery in agent $i$ is specified by parameters $B^i_{\max}$, $B^i_{\mathrm{drain}}$, and $B^i(t)$, where $B^i_{\max}$ is the maximal charge capacity, $B^i_{\mathrm{drain}}$ is the battery consumption per step when $i$ is moving, and $B^i(t)$ is the remaining battery capacity. The charging time $t^i_c$ needed to make the battery full is proportional to the consumed energy; hence, it is

\[
t^i_c = k^i_c \left( B^i_{\max} - B^i(t) \right), \tag{2}
\]

where $k^i_c > 0$ is the charging constant. For simplicity, agents start from their base after a full charge.

For $\forall v \in V$, we define the potential $\mathrm{Pot}^i(v)$ of $v$ as the amount of energy required to reach the charging base. Let $\mathrm{Len}(v, u)$ be the shortest distance from node $v$ to node $u$, i.e., the number of edges in the shortest path between them. The potential can then be defined by

\[
\mathrm{Pot}^i(v) = \mathrm{Len}(v, v^i_{\mathrm{base}}) \times B^i_{\mathrm{drain}}. \tag{3}
\]

When $i$ attempts to move from node $v_s$ to node $v_d$, it compares $B^i(t)$ and $\mathrm{Pot}^i(v_d)$, and if

\[
B^i(t) < \mathrm{Pot}^i(v_d) + \mathrm{Len}(v_s, v_d) \times B^i_{\mathrm{drain}} \tag{4}
\]

is satisfied, $i$ gives up on $v_d$ and returns to its charging base $v^i_{\mathrm{base}}$ to prevent battery run-out.
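The return check of Eqs. (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration under the unit-edge assumption of Section 3.1, using breadth-first search for Len; the adjacency-dictionary representation and the names `shortest_lengths` and `must_return` are hypothetical and not taken from the original implementation.

```python
from collections import deque

def shortest_lengths(adj, source):
    """Breadth-first search: Len(source, v) for every reachable v (unit edges)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def must_return(adj, base, current, target, battery, drain):
    """Condition (4): True if the remaining battery cannot cover the move
    to `target` plus the trip from `target` back to the base."""
    # The graph is undirected, so Len(base, target) == Len(target, base).
    potential = shortest_lengths(adj, base)[target] * drain   # Eq. (3)
    travel = shortest_lengths(adj, current)[target] * drain
    return battery < potential + travel

# Hypothetical path graph 0-1-2-3-4 with the base at node 0.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

For an agent at node 1 heading to node 4, the move costs 3 and the way back costs 4, so with a drain of 1 per step a battery of 6 triggers the return while a battery of 7 does not.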

The purpose of agents is to visit nodes in their RA $V^i_t$ as frequently as possible by considering the event probability of nodes using a certain exploring algorithm. In this paper, because we focus on the method of area partitioning, we used a simple algorithm for exploring, the directed depth-first search (DDFE) (Kato and Sugawara, 2013), which is briefly described as follows.

When agent $i$ leaves the charging base at $t$, it sets as its target the node $v^*$ that has the largest expected value $E(L_t(v))$ in $V^i_t$ and moves to $v^*$ along the shortest route. When $i$ reaches $v^*$, $v^*$ is pushed onto the internal stack and its unvisited adjacent nodes in $V^i_t$ are added to the open list. Then, $i$ selects one of them randomly and moves there. Agent $i$ pushes the current node onto the stack, and the adjacent nodes that are unvisited and are not in the open list are added to the open list. This is repeated as long as $i$ can select unvisited adjacent nodes. When $i$ cannot select an unvisited node, it pops the top node from the stack and moves back to that node, and then $i$ selects an unvisited node from the open list. After repeating this operation, if $i$ returns to $v^*$ and cannot select another unvisited node, it returns to the charging base along the shortest path. Note that when it moves along the shortest path, not all nodes in this path may be in $V^i_t$; therefore, $i$ may pass through the areas that are the responsibility of other agents.
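The exploration described above is essentially a randomized depth-first sweep with explicit backtracking. The following Python sketch is a simplified illustration only: it omits the open list, the choice of the first target by expected value, and the battery check, and it covers only the part of the RA connected to the start node. The function name `ddfe_route` and the adjacency-dictionary representation are assumptions.

```python
import random

def ddfe_route(adj, ra, start, rng):
    """Return the node sequence of a randomized depth-first sweep of `ra`
    that starts at `start` and backtracks via an explicit stack."""
    visited = {start}
    stack = [start]
    route = [start]
    while stack:
        current = stack[-1]
        candidates = [w for w in adj[current] if w in ra and w not in visited]
        if candidates:
            nxt = rng.choice(candidates)    # random unvisited neighbour
            visited.add(nxt)
            stack.append(nxt)
            route.append(nxt)
        else:
            stack.pop()                     # dead end: step back one node
            if stack:
                route.append(stack[-1])
    return route

# Hypothetical 2x2 grid; the whole grid is the agent's RA.
grid = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
route = ddfe_route(grid, {0, 1, 2, 3}, 0, random.Random(1))
```

Every consecutive pair in `route` is an edge of the graph, and the sweep ends where it started, from where the agent would head back to its charging base.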

3.3 Deciding RA

We will briefly explain how agents coordinate with each other to decide their RAs in the environment using the conventional method. Please see (Kato and Sugawara, 2013) for details. Initially, agent $i \in A$ has a small area within the distance $d_{\mathrm{init}}$ from its charging base $v^i_{\mathrm{base}}$ as an initial RA $V^i_0 = \{v \in V \mid \mathrm{Len}(v, v^i_{\mathrm{base}}) \leq d_{\mathrm{init}}\}$. We assume that $v^i_{\mathrm{base}} \in V^i_t$ for $\forall t$.

First, agent $i$ calculates the sum of all expected values

\[
E(L_t(V^i_t)) = \sum_{v \in V^i_t} E(L_t(v)).
\]

This value also indirectly shows how completely an agent has patrolled its area of responsibility. The reciprocal of this value, $\varepsilon(i,t)$, is called the expansion power of $i$ at time $t$. Agents calculate the expansion power whenever they return to the charging base, and retain the value until the next calculation.

When agents expand their RAs, they attempt to include some boundary nodes $\{u_1, \dots, u_k\}$ of $V^i_t$ (where $u_l \notin V^i_t$) and nodes that are not too far from the base $v^i_{\mathrm{base}}$. Then, if $u_l$ is not included in other RAs, $u_l$ is added to $V^i_t$. If $u_l \in V^j_t$ for $j \in A$ ($i \neq j$), they begin to negotiate, and the agent whose expansion power is larger includes the node in its RA.
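A minimal Python sketch of the expansion power and the pairwise negotiation is given below. The handling of a zero sum (returning infinity) and the names `expansion_power` and `negotiate` are assumptions for illustration; the source does not say how ties or a zero sum are treated, and here the current owner keeps the node on a tie.

```python
def expansion_power(expected, ra):
    """Reciprocal of the summed expected events over the RA."""
    total = sum(expected[v] for v in ra)
    # Assumption: an RA with no expected events gets unbounded power.
    return float("inf") if total == 0 else 1.0 / total

def negotiate(challenger, owner, power):
    """Return the agent that holds the contested node after negotiation:
    the challenger wins only with a strictly larger expansion power."""
    return challenger if power[challenger] > power[owner] else owner

# Hypothetical example: agent 1 has little work left, agent 2 has a lot.
expected = {"a": 0.5, "b": 0.5, "c": 2.0, "d": 2.0}
ras = {1: {"a", "b"}, 2: {"c", "d"}}
power = {i: expansion_power(expected, ra) for i, ra in ras.items()}
```

Here agent 1 has the larger expansion power, so it would win a contested node of agent 2, while agent 2 could not take a node from agent 1.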

3.4 Isolated Enclaves

One signiﬁcant drawback of the conventional method

is that agents will have their RAs unnecessarily split

by others’ expansion behaviors; there will be many

isolated enclaves. We deﬁne an enclave as follows.

For the current RA $V^i_t$, let us consider $G^i_t = (V^i_t, E^i_t)$, where $E^i_t = \{e_{u,v} \in E \mid u, v \in V^i_t\}$. Then, for $v \in V^i_t$, the connected nodes $V^i_{\mathrm{conn}}(v)$ of $i$ is the set of nodes in $V^i_t$ reachable from $v$ only along the edges in $E^i_t$. Similarly, we can define the connected component $G^i_{\mathrm{conn}}(v) = (V^i_{\mathrm{conn}}(v), E^i_{\mathrm{conn}}(v))$, where $E^i_{\mathrm{conn}}(v) = \{e_{u,w} \in E \mid u, w \in V^i_{\mathrm{conn}}(v)\}$. Let $\mathcal{G}^i_{\mathrm{conn}}$ be the set of all connected components of $i$. Then, $G^i_{\mathrm{conn}}(v)$ is an enclave if it does not include $i$'s base ($v^i_{\mathrm{base}} \notin V^i_{\mathrm{conn}}(v)$). Obviously, connected components are exclusive and their union is equal to $V^i_t$. As mentioned before, some enclaves are indispensable to cover the entire environment, but unnecessary enclaves, especially small and scattered enclaves, will significantly reduce the system's performance.
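The definition of an enclave can be made concrete with a short Python sketch that splits an RA into connected components using only edges inside the RA, and then keeps the components that do not contain the base. The function names and the adjacency-dictionary representation are assumptions for illustration.

```python
def connected_components(adj, ra):
    """Split the node set `ra` into components using only edges inside `ra`."""
    remaining = set(ra)
    components = []
    while remaining:
        seed = remaining.pop()
        component = {seed}
        frontier = [seed]
        while frontier:
            u = frontier.pop()
            for w in adj[u]:
                if w in remaining:
                    remaining.remove(w)
                    component.add(w)
                    frontier.append(w)
        components.append(component)
    return components

def enclaves(adj, ra, base):
    """Components of the RA that do not contain the agent's base."""
    return [c for c in connected_components(adj, ra) if base not in c]

# Hypothetical path graph 0-1-2-3-4; the RA {0, 1, 3, 4} is split by node 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
found = enclaves(adj, {0, 1, 3, 4}, base=0)
```

In this example the component {0, 1} contains the base, so the only enclave is {3, 4}.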

3.5 System Evaluation Criteria

The purpose of this research is to allow agents to patrol the environment and more frequently visit important nodes by appropriately partitioning the environment into RAs, so that their workload is fair and balanced. Because our target applications are, for example, patrolling for security surveillance, large area cleaning, and environmental/sensor data collection, agents should visit the nodes in their RA without leaving any nodes unattended for a long time. For this purpose, we evaluate the system's performance using the average value $D_{t_s,t_e}(V)$ of the number of remaining events of all nodes in period $[t_s, t_e]$, which is defined as

\[
D_{t_s,t_e}(V) = \sum_{t=t_s}^{t_e} L_t(V) / (t_e - t_s + 1), \tag{5}
\]

where $L_t(V) = \sum_{v \in V} L_t(v)$. Therefore, the smaller the value of $D_{t_s,t_e}(V)$, the better the performance of the method.

The target of the proposed method is efficient patrolling by reducing the number of enclaves, and so, we also investigate the number of enclaves of the agents' RAs. Note that $t_s$ and $t_e$ in $D_{t_s,t_e}(V)$ are often omitted (so $D(V)$).
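As a small worked example of Eq. (5), the following Python sketch averages the total number of remaining events over a period. It assumes that the per-step quantity for the whole environment is the sum of the per-node counts; the name `average_remaining_events` and the `history` layout are hypothetical.

```python
def average_remaining_events(history, t_start, t_end):
    """Eq. (5): mean over t in [t_start, t_end] of the total events left
    in the environment; history[t] maps each node to its count at step t."""
    totals = [sum(history[t].values()) for t in range(t_start, t_end + 1)]
    return sum(totals) / (t_end - t_start + 1)

# Hypothetical three-step record for two nodes.
history = [{"a": 0, "b": 1}, {"a": 1, "b": 1}, {"a": 0, "b": 3}]
score = average_remaining_events(history, 0, 2)
```

The per-step totals are 1, 2, and 3, so the average over the three steps is 2.0.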

4 PROPOSED METHOD

We propose a method for multiple autonomous agents

to individually decide their own RA by partitioning

the environment without small isolated enclaves. This

method consists of the determination of expansion

nodes, which is almost identical to the conventional

method (Kato and Sugawara, 2013), and two negoti-

ation phases for the arrangement of overlapped area

assignments to maintain a balanced workload and del-

egate isolated enclaves to reduce small and isolated

areas.

4.1 Determination of Expansion Nodes

Agent $i \in A$ attempts to gradually expand the RA by including a number of nodes that are not in but at the boundaries of $V^i_t$, when it has almost finished the work in the RA and still has enough battery capacity to move more. First, when $i$ leaves the charging base $v^i_{\mathrm{base}}$ at time $t_0$, it calculates the estimated value $E(L_{t_0+\gamma}(V^i_t))$ of the remaining amount of events after $\gamma$ ($> 0$) steps from $t_0$. The reason for calculating the estimated number of unobserved events at time $t_0 + \gamma$ instead of at $t_0$ is that (1) $i$ tries to expand the RA at a certain time after leaving the base and (2) it wants to compare the state of unobserved events if it does not move with the state after it actually works.

After agent $i$ starts from $v^i_{\mathrm{base}}$ at $t_0$, it records $N^i_{\mathrm{vis}}(t)$, which is the number of nodes visited by $t$ ($> t_0$), and $N^i_e(t)$, which is the sum of the observed events. Then, when conditions (6) and (7) are satisfied, the agent tries to expand the RA.

\[
N^i_{\mathrm{vis}}(t) \geq R_1 \cdot |V^i_t| \tag{6}
\]

\[
N^i_e(t) \geq R_2 \cdot E(L_{t_0+\gamma}(V^i_t)), \tag{7}
\]

where $R_1$ and $R_2$ are the parameters to adjust expansion activity ($0 \leq R_1, R_2 \leq 1$). To avoid excessive area expansion, agents attempt to expand the RA only once after leaving their bases.
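Conditions (6) and (7) amount to two threshold tests, which can be written as a short Python sketch. The function name `should_expand`, the argument names, and the numbers in the usage note are illustrative assumptions, not values confirmed by the experiments.

```python
def should_expand(n_visited, n_observed, ra_size, expected_total, r1, r2):
    """Return True when both expansion conditions hold:
    (6) enough of the RA has been visited since leaving the base, and
    (7) enough events were observed relative to the estimate at t0 + gamma."""
    covered_enough = n_visited >= r1 * ra_size
    observed_enough = n_observed >= r2 * expected_total
    return covered_enough and observed_enough
```

For instance, with both ratios set to 0.4, an agent that has visited 50 of 100 nodes and observed 4 events against an estimate of 8.0 would attempt an expansion, whereas an agent that has visited only 30 nodes would not.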

When agent $i$ tries to expand its own area, it determines the set of nodes to be included, $I^i$, as follows.

(1) Agent $i$ selects a set of nodes $B$ that is adjacent to its own RA.

(2) $i$ selects $k_{\mathrm{inc}}$ ($> 0$) nodes from $B$ that are not included in $I^i_{\mathrm{exp}}$ (explained later) and have the shortest distance from its charging base $v^i_{\mathrm{base}}$, and sets them as $I^i_{\mathrm{inc}}$, where $k_{\mathrm{inc}}$ is an integer.

(3) Nodes in $I^i_{\mathrm{inc}}$ and nodes adjacent to an element in $I^i_{\mathrm{inc}} \setminus I^i_{\mathrm{exp}}$ are defined as the set $I^i$.

Thus, if $I^i = \emptyset$, $i$ does not expand the area.
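Steps (1) to (3) can be sketched in Python as follows. This is one possible reading rather than the authors' implementation: in step (3) the sketch adds the neighbours of the chosen nodes that lie outside the RA and are not in the exclusion set, and ties in distance are broken by node identifier. The names `bfs_lengths` and `expansion_nodes` are hypothetical.

```python
from collections import deque

def bfs_lengths(adj, source):
    """Shortest path lengths (unit edges) from `source` to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def expansion_nodes(adj, ra, base, excluded, k_inc):
    """Pick the k_inc boundary nodes nearest to the base, then add their
    neighbours outside the RA, skipping nodes in the exclusion set."""
    boundary = {w for u in ra for w in adj[u] if w not in ra}        # step (1)
    dist = bfs_lengths(adj, base)
    allowed = sorted((w for w in boundary if w not in excluded),
                     key=lambda w: (dist[w], w))
    chosen = set(allowed[:k_inc])                                    # step (2)
    around = {w for u in chosen for w in adj[u]
              if w not in ra and w not in excluded}
    return chosen | around                                           # step (3)

# Hypothetical path graph 0-1-2-3-4 with RA {0, 1} and the base at node 0.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

With an empty exclusion set and `k_inc = 1`, the sketch proposes nodes 2 and 3; if node 2 is excluded, nothing is proposed and the agent does not expand.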

4.2 Arrangement of Overlapped Areas

After determining the expansion nodes, agents decide which agent should work for individual nodes in $I^i$. For every $v \in I^i$, when $v \in V^j_t$, $i$ requests $\varepsilon(j,t)$ from $j$, and $i$ includes $v$ in $V^i_{t+1}$ only when $\varepsilon(j,t) < \varepsilon(i,t)$. If $\varepsilon(j,t) \geq \varepsilon(i,t)$, then $v$ is added to $I^i_{\mathrm{exp}}$.

$I^i_{\mathrm{exp}}$ is the set of nodes that should not be included in $I^i$ for a while to prevent frequent challenges and failures. Once a node is included in $I^i_{\mathrm{exp}}$, it is excluded from $I^i_{\mathrm{exp}}$ after the area expansion is performed $k_{\mathrm{avoid}}$ ($> 0$) times. Furthermore, in Step (2) of the expansion node determination, $i$ adds all nodes that are not included in other agents' RAs, and these are omitted from counting $k_{\mathrm{avoid}}$ nodes.
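The arrangement of overlapped nodes and the temporary exclusion set can be sketched as below. The countdown representation of the exclusion set (a dictionary from node to remaining expansions) is an assumption, since the text only states that a node leaves the set after a fixed number of expansions; the names `resolve_overlaps` and `age_avoid_list` are hypothetical.

```python
def resolve_overlaps(me, candidates, owner_of, power, avoid, k_avoid):
    """Return the nodes agent `me` takes over. Nodes lost in negotiation are
    recorded in `avoid` with a countdown of k_avoid expansions."""
    gained = set()
    for v in sorted(candidates):
        owner = owner_of.get(v)
        if owner is None or power[owner] < power[me]:
            gained.add(v)          # unowned, or the owner has weaker power
        else:
            avoid[v] = k_avoid     # lost: do not challenge again for a while
    return gained

def age_avoid_list(avoid):
    """Call once per performed expansion; drops entries whose countdown ends."""
    for v in list(avoid):
        avoid[v] -= 1
        if avoid[v] <= 0:
            del avoid[v]

# Hypothetical situation: agent 1 challenges nodes owned by agents 2 and 3.
power = {1: 0.5, 2: 0.2, 3: 0.9}
owner_of = {"x": 2, "y": 3}
avoid = {}
gained = resolve_overlaps(1, {"x", "y", "z"}, owner_of, power, avoid, k_avoid=2)
```

Agent 1 obtains the unowned node and the node of the weaker agent 2, while the node held by the stronger agent 3 is placed in the exclusion set for two further expansions.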

4.3 Reducing Isolated Small Enclaves

To reduce the occurrence of unnecessary enclaves,

agent i negotiates with other agents to delegate iso-

lated enclaves to more appropriate agents. To decide

an enclave that should be delegated to other agents, i

will select small enclaves distant from the base.

For $\forall G^i_{\mathrm{conn},k} \in \mathcal{G}^i_{\mathrm{conn}}$, we calculate the mean distance from the base $v^i_{\mathrm{base}}$:

\[
\mathrm{dis}(G^i_{\mathrm{conn},k}) = \frac{1}{|G^i_{\mathrm{conn},k}|} \sum_{v \in G^i_{\mathrm{conn},k}} \mathrm{Len}(v, v^i_{\mathrm{base}}). \tag{8}
\]

Then, the enclaves, which are the connected components that do not contain $v^i_{\mathrm{base}}$, and satisfy the following condition become the candidates to be delegated.

\[
\mathrm{dis}(G^i_{\mathrm{conn},k}) > R_3 \cdot |G^i_{\mathrm{conn},k}| \cdot |V^i_t|, \tag{9}
\]

where $R_3$ is a small positive number to decide the balance between the size and distance of the connected components. An enclave $G^i_{\mathrm{conn},k}$ will be delegated to the agent $j$ whose RA has the largest number of nodes adjacent to $G^i_{\mathrm{conn},k}$. This calculation can be done while $i$ is charging.
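The selection of enclaves to delegate and of the receiving agent can be illustrated with the following Python sketch. The exact form of condition (9) is an assumed reading (mean distance greater than the ratio times the enclave size times the RA size), and tie-breaking by the smallest agent identifier is also an assumption; all function names are hypothetical.

```python
def mean_distance(component, dist_from_base):
    """Eq. (8): average shortest-path length from the base to the component."""
    return sum(dist_from_base[v] for v in component) / len(component)

def is_delegation_candidate(component, dist_from_base, ra_size, ratio):
    """Assumed reading of condition (9): the enclave is far from the base
    relative to its own size and the size of the whole RA."""
    threshold = ratio * len(component) * ra_size
    return mean_distance(component, dist_from_base) > threshold

def pick_recipient(adj, component, owner_of, me):
    """Agent (other than `me`) owning the most nodes adjacent to the enclave."""
    counts = {}
    for w in {w for u in component for w in adj[u]} - set(component):
        owner = owner_of.get(w)
        if owner is not None and owner != me:
            counts[owner] = counts.get(owner, 0) + 1
    # Ties go to the smallest agent id (an arbitrary choice for this sketch).
    return max(sorted(counts), key=counts.get) if counts else None

# Hypothetical graph: agent 1 owns {0, 1, 4}; node 4 is an enclave.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5, 7],
       5: [4, 6], 6: [5], 7: [4]}
owner_of = {0: 1, 1: 1, 2: 2, 3: 2, 4: 1, 5: 3, 6: 3, 7: 3}
dist_from_base = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 5}
enclave = {4}
```

In this example the enclave has two neighbours owned by agent 3 and one owned by agent 2, so agent 3 would receive it.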

Then, agent $i$ communicates with agent $j$ so that $i$ delegates $G^i_{\mathrm{conn},k}$ to $j$, without considering their expansion powers. Therefore, $j$ may temporarily have a large RA. However, its expansion power decreases so that $j$'s RA is gradually diminished by other agents; eventually, the overall load will become fair. Note that not all enclaves are deleted; only small enclaves far from $v^i_{\mathrm{base}}$ are delegated. Finally, when agent $i$ moves from enclave $G^i_{\mathrm{conn},k}$ to another enclave $G^i_{\mathrm{conn},k'}$, it takes the shortest path between the two enclaves. Therefore, $i$ finds the pair $(v, v') \in G^i_{\mathrm{conn},k} \times G^i_{\mathrm{conn},k'}$ s.t. $\mathrm{Len}(v, v')$ is the smallest (if there are two or more such pairs, one of them is selected randomly), and $i$ follows the shortest path between $v$ and $v'$. Note that the length $\mathrm{Len}(v, v')$ of this pair is defined as the distance between the two areas and denoted by $\mathrm{dist}(G^i_{\mathrm{conn},k}, G^i_{\mathrm{conn},k'})$. We can clearly define the distance between any subareas in the same way.
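The distance between two subareas can be computed with a multi-source breadth-first search, as in the following Python sketch. Unlike the description above, which selects randomly among equally close pairs, the sketch returns the first closest pair it encounters; the function name `area_distance` is hypothetical.

```python
from collections import deque

def area_distance(adj, area_a, area_b):
    """Smallest Len(v, v') over v in area_a and v' in area_b, together with
    one pair attaining it, found by multi-source breadth-first search."""
    origin = {v: v for v in area_a}   # which start node each node was reached from
    dist = {v: 0 for v in area_a}
    queue = deque(sorted(area_a))
    while queue:
        u = queue.popleft()
        if u in area_b:
            return dist[u], (origin[u], u)
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                origin[w] = origin[u]
                queue.append(w)
    return None                       # the areas are not connected in the graph

# Hypothetical path graph 0-1-2-3-4-5-6.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
result = area_distance(adj, {0, 1}, {4, 5})
```

For the areas {0, 1} and {4, 5} on this path graph, the closest pair is (1, 4) at distance 3.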

5 EXPERIMENTAL RESULTS

We conducted three experiments using three differ-

ent environments (Fig. 1) to verify that our proposed

method does not generate unnecessary fragments of

RAs in various environments. We set A = {1,2,3,4}

(agent i is indicated as Agent+i in all ﬁgures below).

Other initial parameters in this experiment are listed

in Table 1. All data shown below are averages for

every 3600 steps (because the maximum cycle of op-

eration and charging is 3600 steps) taken from 100

independent experimental runs.

Table 1: Agent's Parameters.

Description                      Parameter                Value
Parameters for area expansion    $R_1$, $R_2$             0.4
                                 $\gamma$                 300
                                 $k_{\mathrm{avoid}}$
Ratio for enclave negotiation    $R_3$                    0.005
Battery consumption per step     $B^i_{\mathrm{drain}}$
Battery capacity                 $B^i_{\max}$             900
Charging constant                $k^i_c$
Radius of initial RA             $d_{\mathrm{init}}$

Figure 1: Three Environments. (a) Uniform environment; (b) Environment with obstacles.

To compare the performance with that of the conventional method (Kato and Sugawara, 2013), we adopted the DDFE method to explore. Because the DDFE assumes that the RA is connected (its performance considerably decreases if not), we connected the fragmented RAs with paths between the nearest areas; then, the agent moves in its RA with the connecting paths. Let us set $\mathcal{G}_{\mathrm{temp}} = \mathcal{G}^i_{\mathrm{conn}}$. Agent $i$ selects two connected components $G_1$ and $G_2$ from $\mathcal{G}_{\mathrm{temp}}$ whose distance is the smallest and connects them with one of the shortest paths. Then, $G_1$ and $G_2$ are removed from $\mathcal{G}_{\mathrm{temp}}$ and instead, the union $G_1 \cup G_2 \cup \{n \in V \mid n \text{ is on the connecting path}\}$ is added to $\mathcal{G}_{\mathrm{temp}}$. Agent $i$ repeats this process until $\mathcal{G}_{\mathrm{temp}}$ becomes a singleton. The generated area consisting of the element of $\mathcal{G}_{\mathrm{temp}}$ is used as the connected RA of $i$.
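The merging procedure described above can be sketched in Python as a repeated "join the two closest components" loop. This is an illustrative sketch under the assumption that the whole graph is connected (as stated in Section 3.1), so a connecting path always exists; the names `connecting_path` and `connect_fragments` are hypothetical.

```python
from collections import deque

def connecting_path(adj, area_a, area_b):
    """Shortest path (list of nodes) from area_a to area_b via multi-source BFS."""
    parent = {v: None for v in area_a}
    queue = deque(sorted(area_a))
    while queue:
        u = queue.popleft()
        if u in area_b:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return None

def connect_fragments(adj, components):
    """Repeatedly join the two closest components until one area remains."""
    pool = [set(c) for c in components]
    while len(pool) > 1:
        best = None
        for a in range(len(pool)):
            for b in range(a + 1, len(pool)):
                path = connecting_path(adj, pool[a], pool[b])
                if path is not None and (best is None or len(path) < len(best[2])):
                    best = (a, b, path)
        a, b, path = best   # assumes a path exists because the graph is connected
        merged = pool[a] | pool[b] | set(path)
        pool = [c for k, c in enumerate(pool) if k not in (a, b)] + [merged]
    return pool[0]

# Hypothetical path graph 0-1-2-3-4-5-6 with three fragments of one RA.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
connected_ra = connect_fragments(adj, [{0, 1}, {3}, {6}])
```

The fragments {0, 1} and {3} are joined first through node 2, and the result is then joined with {6} through nodes 4 and 5, so the connected area covers all seven nodes.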

5.1 Uniform Environment

In the first experiment, we evaluate the proposed method by comparing its performance with that of the conventional method in the uniform environment, where the event probability is identical in all nodes and defined as $p_v = 2 \times 10^{-4}$ for $\forall v \in V$. The structure of the environment is shown in Fig. 1a, in which there is only one connection node between the corridor and each of the six rooms. Therefore, assuming that the RA must be connected, each room must be patrolled by one agent, whereas the number of agents is four, making fair assignment of responsibilities impossible. Conversely, if we exclude this assumption, many enclaves are generated because no agent is looking at the whole state of the RAs, resulting in undesirable division in the conventional method. The locations of the battery charging bases for individual agents are in one of the wide spaces at the ends of the corridor, as shown in Fig. 1a.

Figure 2: Remaining events D(V ) in uniform environment.

Figure 3: Number of enclaves (uniform environment).

Figure 2 plots the number of remaining events in

the environment every 3,600 steps over time. Note

that the smaller the number of events, the better is

the method. This ﬁgure indicates that the proposed

method exhibited slightly better performance than the

conventional method. In earlier steps, the perfor-

mance of the proposed method seems lower, but this

is caused by reallocating the enclaves to other agents,

so agents' workloads are temporarily unbalanced.

We counted the number of enclaves (including the connected component containing $v^i_{\mathrm{base}}$) of the RA of each agent; this is shown in Fig. 3. This figure indicates that the number of enclaves of the conventional method gradually increased over time because, in the area expansion, agents tried to include nodes that were decided only from the local viewpoint; thus, this expansion process may split the RAs of other agents. Conversely, agents with the proposed method delegated the fragmented enclaves to other agents, even if such areas were generated, and thus, they could suppress the increase in the number of fragmented RAs. It should be noted that because $|A| = 4$, at least four connected components are necessary. Furthermore, in the experimental environment, disconnected enclaves are mandatory to cover the entire environment.

Finally, we investigated how the environment is divided into the RAs and how they are allocated to agents over time. The distribution maps of the enclaves of the RAs of all agents are shown in Fig. 4. Figure 4a indicates that RAs were fragmented in the whole environment, especially in spaces near the battery bases, throughout the experiment. This was probably because competition for RAs occurred frequently there, even if the sizes of the allocated areas were almost identical. By contrast, Fig. 4b shows that the RAs were not unnecessarily divided, and thus, no fragmented areas seemed to exist. Note that the RAs were always varying. Of course, we can suspend the area expansion behavior at a suitable point if the environment is static. However, if the environment is dynamic and unexpected events, such as failure of an agent, introduction of new agents, and deployment of obstacles, occur, agents should continue to perform the area expansion behavior to adapt to the changes.

Figure 4: Distribution map of RAs (uniform environment). (a) Conventional method; (b) Proposed method.

Figure 5: Number of remaining events (D(V )). (a) Biased event probabilities; (b) With obstacles.

Figure 6: Number of enclaves. (a) Biased event probabilities; (b) With obstacles.

5.2 Non-uniform Environment

We conducted the same experiments in two different types of environments. The first is the biased environment (Fig. 1c), where there are a number of specific regions where more events are likely to occur or which are more important regions that agents must visit more frequently. The second environment (Fig. 1b) has a number of walls and obstacles in all rooms. The purpose of these experiments is to check if the proposed method can avoid unnecessary, small, and disconnected RAs even in more complicated environments. Note that the event probabilities of white, orange, and red nodes are $p_v = 2.0 \times 10^{-4}$, $p_v = 2.0 \times 10^{-3}$, and $p_v = 2.0 \times 10^{-2}$, respectively, in Fig. 1.

Their performances, i.e., the values of D(V ), are

plotted in Fig. 5. This ﬁgure shows that the pro-

posed method outperformed the conventional method

in both environments by suppressing unnecessary

fragmented enclaves; we can also see this fact in both

cases from Fig. 6. Actually, the number of enclaves

was stable in the proposed method whereas it gradu-

ally increased over time in the conventional methods.

Figure 5 also indicates that the convergence speed of agents with the proposed method was slightly slower; this is also the result of the negotiation to allocate enclaves to more appropriate agents because, in earlier stages, agents aggressively generated more enclaves by their area expansion processes.

Finally, we generated maps to see how all RAs

changed over time in these experiments; these maps

are shown in Figs. 7 and 8. They show that the pro-

posed method considerably reduces the number of en-

claves by eliminating unnecessary fragments even if

there are no global observers that monitor the entire

environment. We have to note that because the proposed method suppressed the number of unnecessary enclaves, the degree of balance in the sizes of RAs of the four agents slightly decreased; however, this decrease is quite small and the resulting efficiency was improved, so we believe that it is negligible.

Figure 7: Distribution map of RAs (biased environment). (a) Conventional method; (b) Proposed method.

Figure 8: Distribution map of RAs (biased environment). (a) Conventional method; (b) Proposed method.

6 CONCLUSION

In this paper, we discussed a method to cover a large environment using multiple agents by partitioning it in a bottom-up manner to achieve fair and efficient cooperative execution of the continuous multi-agent patrolling problem. Although there are some studies that achieve balanced collaboration by area division, they often generate fragmented RAs due to the decentralized control wherein no single agent can see the entire situation, thus resulting in inefficient cooperative work. We proposed a method in which agents do not generate unnecessary enclaves of RAs by allocating fragmented parts of the RA to more appropriate agents through communication. The results indicated that our proposed method could reduce the unnecessary enclaves of the RAs and thus could achieve efficient cooperative work in the various environments.

We would like to examine more complicated environments to apply our method to more realistic domains. We also plan to further improve the shape of RAs, especially by eliminating enclaves or filling recesses in concave areas, for greater efficiency.

ACKNOWLEDGEMENT

This paper is partly supported by JSPS KAKENHI

grant number 17KT0044.

REFERENCES

Ahmadi, M. and Stone, P. (2005). Continuous area sweep-

ing: A task deﬁnition and initial approach. In Proc.

of 12th Int. Conf. on Advanced Robotics (ICAR 2005),

IEEE, pages 316–323.

Ahmadi, M. and Stone, P. (2006). A multi-robot system

for continuous area sweeping tasks. In Proc. of 2006

IEEE Int. Conf. on Robotics and Automation (ICRA

2006), pages 1724–1729.

Carrillo, P. and Rapp, B. (2020). Stochastic multi-robot pa-

trolling with limited visibility. Journal of Intelligent

& Robotic Systems, 97(2):411–429.

Elor, Y. and Bruckstein, A. (2009). Multi-a(ge)nt graph

patrolling and partitioning. In Proc. of the 2009

IEEE/WIC/ACM Int. Joint Conf. on Web Intelligence

and Intelligent Agent Technology, Vol. 2, IEEE Com-

puter Society, pages 52–57.

Huang, L., Zhou, M., Hao, K., and Hou, E. (2019). A sur-

vey of multi-robot regular and adversarial patrolling.

IEEE/CAA Journal of Automatica Sinica, 6(4):894–

903.

Kato, C. and Sugawara, T. (2013). Decentralized area parti-

tioning for a cooperative cleaning task. In Proc. of the

16th Int. Conf. on Principles and Practice of Multi-

Agent Systems (PRIMA-2013), pages 470–477.

Nasir, A., Salam, Y., and Saleem, Y. (2016). Multi-

level decision making in hierarchical multi-agent

robotic search teams. The Journal of Engineering,

2016(11):378–385.

Sugiyama, A., Wu, L., and Sugawara, T. (2019). Improve-

ment of multi-agent continuous cooperative patrolling

with learning of activity length. Agents and Artiﬁcial

Intelligence, pages 270–292, Cham. Springer Int. Pub-

lishing.

Yoneda, K., Kato, C., and Sugawara, T. (2013). Au-

tonomous learning of target decision strategies

without communications for continuous coordinated

cleaning tasks. In IEEE/WIC/ACM Int. Confs. on Web

Intelligence and Intelligent Agent Technology, pages

216–223.
