Predicting the State of Agents Using Motion and

Occupancy Grids

David Ball

1

and Gordon Wyeth

1

1

School of Information Technology and Electrical Engineering, The University of

Queensland, Australia

Abstract. This paper presents an approach to predicting the future state of dy-

namic agents using an extension to occupancy grids that we have called motion

and occupancy grids. These grids capture both the location and motion of an

agent. The basis of the prediction is to model the behaviour of an agent as the

probability of transitions between the states represented in the grid. Prediction

can then be made using a Markov chain. The potential state explosion inherent

in this method is overcome by dynamically building the state transition matrix

in a tree structure. The results show that the system is able to rapidly develop

probabilistic models of complex agents and predict their future state.

1 Introduction

Autonomous systems have the potential to improve their effectiveness by modifying

their behaviour to adapt to the environment. One resource that would be of use is a

prediction of the future state of the dynamic elements in the environment. There has

been significant successful work in regard to modelling static elements. This paper

explores the more difficult problem of modelling dynamic agents and how to predict

their future state. These predictions are to be used to improve the performance of a

system over time. The key to predicting the motion of an agent is to look for patterns

and tendencies in its behaviour and to be able to model these in an appropriate form

for prediction. This research is targeted at domains where the static features of the

environment are relatively simple, but the dynamic features of the environment form

the greatest obstacles to success.

The agent modelling and prediction approach detailed in the paper is intended to

be part of a larger adaptive system [1] that will enable agents to improve their per-

formance against other agents. This architecture has three steps; classify the behav-

iour of the moving elements of the environment using predefined models, produce

models and predict the behaviour of the moving elements for each class, and exploit

by integrating the models and predictions into the existing planning system to im-

prove its effectiveness. This paper addresses the second step – modelling and predict-

ing the behaviour of moving elements.

The approach is to model the behaviour of an agent’s motion in a discrete form us-

ing a motion and occupancy grid where the connections between states are repre-

sented using a probabilistic state transition matrix. Using the Markov assumption, the

future state is predicted using the state transition matrix and the current state. With

Ball D. and Wyeth G. (2005).

Predicting the State of Agents Using Motion and Occupancy Grids.

In Proceedings of the 1st International Workshop on Multi-Agent Robotic Systems, pages 112-121

DOI: 10.5220/0001192301120121

Copyright

c

SciTePress

the motion and occupancy grid method the state vector and the state transition matrix

can be large. This paper presents a method where the state transition matrix is repre-

sented by a dynamically built tree that is efficient to search and use.

The paper is structured as follows. The following section provides relevant back-

ground and presents existing approaches to agent behaviour prediction. Section three

describes the modelling and prediction system. Section four describes the experimen-

tal setup. Section five presents the results of the experiments and discusses them.

Section six presents early work with using the predictions to adapt the behaviour of

an agent. Lastly section seven concludes the paper.

2 Background

This section provides background on the Markov approach to modelling and predict-

ing the future behaviour for agents. It also briefly describes the occupancy grids that

form the basis of the motion and occupancy grids used in this work.

2.1 Markov Chains

A process is said to be Markov if the future state distribution depends only on the

present state, in other words, the future is conditionally independent of the past. A

Markov Chain consists of a finite number of states and a state transition matrix which

contains probabilities of moving between states [2]. So:

)|(),...,,|(

11111100 −−−−

=

=

=

=

=

==

tttttt

iXjXPiXiXiXjXP

(1)

The state of the process can therefore be predicted forward t time steps by multi-

plying the state transition matrix t times. So if X represents the state of the process

and A is the state transition matrix (a stochastic matrix) then to predict the state for-

ward one time step:

tt

XAX

⋅

=

+1

(2)

If there are n states in X then A is of size n × n.

The Markov assumption allows for simple state transition model approaches but

predictions cannot be guaranteed if the assumption is invalid. Typically as the

agent’s architecture becomes more reactive (and hence carries less internal state) a

Markov predictor will increase its performance. Predicting the future state using

Markov Chains is not new. Improvements such as semi-Markov processes (SMP) [2]

have allowed arbitrary state duration and the augmented Markov Models (AMM) [3]

that allows additional statistics in links and nodes so that it can restructure a model.

The AMM is dynamically constructed by adding state nodes as the agent enters pre-

viously unseen states. This differs from the method in this paper where the entire state

is predefined and the state transition matrix is dynamically built. This is useful where

the state space is large as it can be implemented for efficient searching.

2.2 Occupancy Grids

The occupancy grid method [4, 5] provides a probabilistic approach to representing

how a region of space is occupied. It has been successfully used for representing

obstacles for robot navigation. The method does not capture the motion of the obsta-

cles. The method in this paper extends this method so that it can capture and represent

the motion of occupied space or in this case, dynamic agents.

3 Prediction Approach

This section describes the extensions to an occupancy grid required to deal with mov-

ing elements. It explains the problem of state explosion inherent in this method and

how it can be addressed using dynamic state transition trees.

3.1 Motion and Occupancy Grid Method

Fig. 1 shows a representation of the motion and occupancy grid cells. The occupancy

grid is suitable for representing the probability distribution of the future occupied

space for an agent. The environment is divided into grid cells that represent the agent

occupying particular locations. The future state of the agent can be predicted by de-

termining the probability of transitioning between grid cell states. However, this ig-

nores the agent’s motion within the grid cell and different paths the agent may take

through the same grid cell.

Fig. 1. This figure shows how the state of an agent is represented using the motion and occu-

pancy grid. The centre circle represents stationary (the robot staying in same grid cell) and the

cells surrounding the circle represent the possible directions of motion.

The motion grid captures the agent’s current direction of motion (also referred to

as its heading direction) so that it can distinguish between different motion types

through the grid cell. For example, an agent may have two motion paths through a

cell, one from bottom to top and the other from left to right. If the motion of an agent

is not captured the probability distribution will show the predicted motion to the right

and the top. The motion grid allows for the prediction for the two motion types

through the cell to be separated. The centre circle represents an agent that is station-

ary which is an agent that will maintain the same occupancy grid state. The other

cells represent directions of motion for the agent, dividing the domain of possible

directions into equal amounts.

Motion and occupancy grids are fully connected between states, so that each state

has probabilities associated with transitioning to all other states, even if they are zero.

The state transition probabilities are histogram based. The probability update equation

is:

1

1],[

],[

1

1

+

+

×

=

+

+

Xt

Xtttcurrent

ttnew

c

cXXA

XXA

(3)

A[X

t

, X

t+1

] is the element in the state transition matrix that represents the probabil-

ity of transitioning from state X

t

to X

t+1

. c

Xt

is the count of the number of times the

state X

t

has been initial state in a transition. To decrease computation time, the num-

ber of transitions between each state is recorded c

Xt->Xt+1

and so the probability update

equation becomes:

1

1

],[

1

1

+

+

=

+>−

+

Xt

XtXt

ttnew

C

C

XXA

(4)

3.2 Extending the State Vector

Typically an agent’s choice of action is dependent on other agents or other dynamic

features of the environment. For the Markov assumption to hold true, these features

must also be represented in the state vector. This dramatically increases the size of the

state transition matrix. If the predicted agent’s behaviour is dependent on the behav-

iour of another agent, then the state vector would hold not only the predicted agent’s

occupancy and motion grid, but also that of the other agent.

A motion and occupancy grid state vector has four state variables {x, y, v

x

, v

y

}. In

the testing environment in this paper the x direction is divided into 15 columns and

the y direction into 12 rows. Velocity is coarsely binned as being left, stationary or

right for v

x

, and up, stationary or down for v

y

. Consequently there are a total of 15 ×

12 × 3 × 3 = 1620 states that represent the position and motion of the agent. If the

agent is dependent on a dynamic element that uses the same motion and occupancy

grid this becomes 2,624,400 states to represent the two agents.

The state transition matrix is in a squared relationship to the size of the state vec-

tor. As the state vector increases the required number of bytes for a fully defined state

transition matrix quickly becomes larger that is possible to store in physical memory.

Not only can the state transition matrix be too large to store on a typical PC but also

multiplying all the entries (for prediction) can have a large computational cost.

3.3 Dynamic Tree State Transition Matrix

To overcome the limitation associated with the typically large motion and occupancy

grids this approach dynamically populates the state transition matrix. Each new entry

increases the size of the state transition matrix in memory in a linear fashion. A

straightforward approach to populating the matrix would be to dynamically build a

one dimensional list where previously unseen entries are added. However, this would

not be computationally effective as the entire array would need to be searched to

determine if an entry was new. Also, during the prediction phase, the entire array

would need to be searched to find corresponding entries in the state matrix.

This papers approach uses a dynamically constructed tree to represent the state

transition matrix. Each level of the tree represents an index to the state vector based

on one of the state variables. For example, consider the simplified case of an occu-

pancy grid where the grid may be indexed by the x and y variables representing the

location of an agent.

The state transition tree for this agent is shown in Fig. 2. The first level of the tree

represents the x index for all locations the agent has visited so far. As the agent visits

new columns, new x nodes will be created at this level. The second level is indexed

on the y location, again only creating nodes as the agent visits new rows in the grid.

Each node in this level stores the number of times this node has been visited for the

probability update equation (Equation 4). The tree structure is then repeated below to

show the transition probabilities to other grid cells. Again, storage elements are only

created as needed, with the lowest layer of the tree storing the probability of the tran-

sition A[X

t

, X

t+1

] as well as the count of the number of transitions between states C

Xt-

>Xt+1

.

A tree is also dynamically constructed to store the predictions for each time step.

This allows for the state of the agent to be predicted forward computationally effec-

tively. This tree contains branches to represent all states that have been the results of

a transition. This tree has half as many levels as the state transition tree.

Fig. 2. The state transition tree represents the state transition matrix and is dynamically built.

Each level of the tree represents a vector of state variables. The depth is fixed and the prob-

abilities are stored in the lowest level of the tree. Except for the probability layer the nodes

(shown as white squares) contain pointers to variables within the next layers state vector. This

tree is a two state vector tree but the tree can be any even number of levels. The number of

transitions from each state is recorded where it branches from the current state to the next state.

Current

State

Next

State

x

t

x

t+1

y

t

y

t+1

A

[(2,3),(2,1)]

c

(2,3) -> (2,1)

c

(2,3)

c

(2,4)

2

3

43

4

22

4

4

1 4 5

1

A

[(2,3),(2,4)]

c

(2,3) -> (2,4)

A

[(2,4),(2,4)]

c

(2,4) -> (2,4)

A

[(2,4),(4,5)]

c

(2,4) -> (4,5)

4 Experimental Setup

This section describes the testing environment and experimental setup to the test the

performance of the proposed prediction system.

The testing environment is the RoboCup Small Size League [6] and the research

platform is the RoboRoos. In this league both teams have five robots. The rules are

similar to the human version of the game (FIFA). There are two 10 minute halves.

The robots are fully autonomous in the sense that no strategy or control input is al-

lowed by the human operators during play.

The prediction method is tested within the existing robot soccer system. The simu-

lator simulates two teams of robots in line with the 2003 RoboCup Small Size League

rules and regulations. This includes simulating some of the important human referee

tasks such detection of when a goal is scored and restarting play. The simulator has a

very high fidelity with a one millisecond resolution in the simulation of all robot and

ball kinematics, dynamics and collisions. It closely simulates the performance of the

real robots. The vision and intelligence systems are simulated at 60 Hz. Simulations

including the prediction algorithm ran faster than real time on a Pentium-M laptop

computer. The time between the current and next state, one time step, is 250 millisec-

onds (15 frames). The model is updated at every visual update (60 Hz). The agents

are predicted every 83.3 milliseconds which is 5 frames. The agents are predicted

forward 1 second into the future (a significant time in robot soccer) representing 4

steps in the Markov chain. At the agent’s top speed of 1.5 metres/second they can

move approximately 8 grid squares in the predicted time.

The field is divided up so that each occupancy grid cell is approximately the same

size as an agent. This gives 15 x 12 grid squares. The current heading direction is

broken into 8 directions and stationary to form the motion grid. This is converted into

a 3 x 3 matrix where the centre cell (1,1) is for stationary and the other cells hold

appropriate heading directions. An agent is classified as stationary if its velocity is

below a threshold. This threshold is the velocity where an agent is just as likely to

maintain its current occupancy state as it is to change state. This is half of the cell

width divided by the time step time. This gives a threshold velocity of 0.36 me-

tres/second. When the system is unable to make a prediction (the agent is in a previ-

ously unseen state) the system predicts that the agent will continue with its current

motion. If the agent is stationary then it is predicted to maintain its current location.

The system was tested in three different situations of increasing difficulty.

• Figure of Eight Path – This involved an agent tracing out a figure of eight path.

This has no relationship to robot soccer but is used to demonstrate the approach.

• Goal Shooting - This involved one agent interacting with a ball against five oppo-

sition agents that were held stationary in a defensive formation. The player’s ac-

tions were to acquire, move and kick the ball into the opponent’s goal. Results

were recorded for a state transition tree that modelled the dependency on the mo-

tion of the ball and one that doesn’t. This experiment lasted for twenty minutes.

• Full Game – This was a normal game lasting twenty minutes with most of the

rules of the SSL. This is the most difficult domain for the predictor. Each player on

the opposing team was modelled and predicted relative to the motion of the ball.

5 Results

This section first demonstrates the proper operation of the system and highlights an

important benefit of the motion grid. Next the results comparing the predicted and

actual positions are presented and discussed.

Two instances of the agent following the figure of eight path and its predicted fu-

ture probability distribution are shown in Fig. 3. The time between the left figure and

the right figure is equal t the prediction time (1 second), which demonstrates that the

system predicted accurately in this simple case. The right-hand diagram in the figure

demonstrates an important ability of this approach. The agent’s state has been pre-

dicted correctly through the intersection of the figure eight. This is only possible

because the agent’s motion is modelled along with its position. In a system where

only the occupancy is modelled the probability distribution would extend in two di-

rections, left and up.

Fig. 3. This figure shows the agent and its predicted probability distribution for two instances

of the agent traversing the figure of eight path. The path is shown as a dashed line and the

agent is represented by the polygon. The line from the agent shows its current navigation

command. The darkened grid cells show the prediction of the agent’s position, with a darker

cell representing a greater likelihood.

Fig. 4 shows the quantitative results for the figure of eight experiment, comparing

the predicted position of the agent with its true position after the prediction time has

elapsed. The predicted position is a weighted average of the probability distribution

that represents the future occupancy of an agent. The error is the Euclidean distance

between the centres of the predicted and the true cell. The error rapidly decreases

after one complete cycle is made. After two cycles the error stabilises at approxi-

mately half a grid cell representing the quantisation noise of the grid system.

Prediction Error for the Figure of Eight Experiment

0

0.5

1

1.5

2

2.5

3

3.5

4

10 15 20 25 30 35 40 45 50 55 60

Time (Seconds)

Error (Grid Cells)

Fig. 4. Figure of Eight prediction error. The error rapidly decreases after the agent has started

the second cycle and by the third the error has stabilised at approximately half a grid cell.

Fig. 5 shows the prediction error for the Goal Shooting experiment. In the experi-

ment where the agent’s motion is assumed to be independent from the motion of the

ball, the error is generally constant. In the experiment where dependence on the ball

is assumed the error drops through the experiment, ending with a lower average error

than the independence experiment. This demonstrates the importance of modelling

the relationship between dynamic elements when there is a dependence relationship.

Fig. 5. Goal Shooting prediction error. The upper error line shows the effect of assuming that

the prediction is independent of the ball’s motion. A simple, but not very accurate, model of

the motion is quickly learnt. When the ball’s motion is included in the state vector, the learning

is a little slower, but the end result is more accurate.

The ability to predict the motion of opposing agents in a game of robot soccer is a

highly challenging task. Fig. 6 shows the results for the prediction system when com-

pared to a naïve predictor that assumes that the robot will continue move in the same

direction if it is moving or will remain still if it is currently still. This graph shows

that the prediction system consistently outperforms the naïve predictor, with steadily

improving performance through out the match.

Fig. 6. Full Game prediction error. This graph compares the error naively assuming constant

motion with the prediction system error. This shows the prediction systems error decreasing

throughout the experiment.

These results demonstrate that the prediction approach is able to learn and predict the

opponent’s motion. In the Figure of Eight experiment the model size stabilises at

approximately 30 Kilobytes. The Goal Shooting experiment model reaches a size of

4.8 Megabytes and the Full Game experiment model reaches a size of 12.2 Mega-

bytes. These storage sizes are significantly less than for a fully defined state transition

matrix. The models were still significantly growing at the end of the Goal Shooting

and Full Game experiments indicating that the agents were still traversing previously

unseen states. Potentially the error would continue to decrease if the experiments

were run for a longer period of time.

6 Applying the Predictions

While the above results are a measure of the quality of the predictions, they do not

reflect one of the strengths of this approach. The prediction process ends with a prob-

ability distribution representing the agent’s most likely future occupancy as shown in

Fig. 7. This probability distribution is an example from the Goal Shooting experiment

and is just after the ball has been kicked at the opponent’s goal. The grey dotted path

shows the agent’s actual path during the predicted time. Interpreting the distribution,

the agent is most likely to travel from its current position along the dotted path but

may also travel towards the opponent’s goal. There is also a small probability of the

agent maintaining its current position. Measuring the weighted average to the actual

position (as was done in the results in Section 5) does not reflect the usefulness of

knowing the shape of the likely paths that the agent will take. This can only be as-

sessed by exploiting the predictions.

Fig. 7. Typical prediction probability distribution. The darker square cells represent a higher

probability value. There are five opponent agents. The dotted grey line shows the agents mo-

tion during the predicted time. This figure demonstrates a strength of this approach that is not

reflected in the Euclidean distance error results.

The predictions were applied to improving the ability of the agents to score goals.

In robot soccer, one of the important choices for the planner is to determine the loca-

tion that is the best for scoring goals. The MAPS planner [7] used by the RoboRoos

considers the current position of the opposing agents and the clear paths to goal when

choosing a shooting location. Instead of treating the opponents as crisp positions, the

probability distributions developed from the predictions of opposing agents were

used.

When using offence strategies involving passing to the shooting player, or creating

opportunities for the shooting player to move into space the adapting team was more

likely to win the game. However, when using strategies that were designed to disrupt

the defence to create goal scoring opportunities the adapting team was more likely to

lose the game. This initial exploitation work has raised the following issues:

• The choice of overall strategy to combine with the predictions is important. For

example using strategies designed to disrupt the regular motion of the agents under

prediction will cause poor modelling and prediction performance.

• There is an inconsistency in basing the choice of action on a prediction that will be

invalid if the choice of action does not lead to the prediction.

• While the opponents may not be adapting their strategy using an agent model, their

motion is likely to change as the choice of action of the predicting agent changes.

7 Conclusion

This paper has presented an approach to predicting the future state of agents using

motion and occupancy grids and the Markov Chain method. It was shown that this

method is able to separate different motion paths. A method of storing the typically

large state transition matrices by dynamically building a tree representation was pre-

sented that enables the typically large state spaces. The results demonstrate that the

approach is able to separate different motion paths and they show the importance of

modelling the agent’s dependencies. The error from the prediction system decreases

as the experiments continue. Future work will further explore methods of using the

predictions to improve the performance of an autonomous planning system.

References

1. Ball, D. and G. Wyeth. Modeling and Exploiting Behavior Patterns in Dynamic Environ-

ments. in 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems

(IROS). 2004. Sendai, Japan.

2. Ross, S.M., Applied Probability Models with Optimization Applications. 1992, New York:

Dover Publications, Inc.

3. Goldberg, D. and M.J. Mataric. Reward Maximization in a Non-Stationary Mobile Robot

Environment. in The Fourth International Conference on Autonomous Agents. 2000. Barce-

lona, Spain.

4. Elfes, A.E. Occupancy grids: A stochastic spatial representation for active robot perception.

in Sixth Conference on Uncertainty in AI. 1990.

5. Moravec, H.P. and A.E. Elfes. High resolution maps from wide angle sonar. in Proceedings

of the 1985 IEEE International Conference on Robotics and Automation. 1985. Washington,

D.C, USA.

6. Kitano, H., et al. RoboCup: The Robot World Cup Initiative. in IJCAI-95 Workshop on

Entertainment and AI/ALife. 1995.

7. Ball, D. and G. Wyeth. Multi-Robot Control in Highly Dynamic, Competitive Environ-

ments. in RoboCup 2003. 2003. Padua, Italy: Springer Verlag.