TOWARDS A FRAMEWORK FOR MANAGEMENT OF STRATEGIC

INTERACTION

Rustam Tagiew

Institute for Computer Science of TU Bergakademie Freiberg, Germany

Keywords:

Game theory, Multi-agent architecture, Human behavior, Domain-speciﬁc languages.

Abstract:

Our research aim is to construct a software framework and associated language for deﬁnition, providing and

recording of strategic interactions between real-world agents, human and artiﬁcial respectively. In this paper

we present an example of such interaction, which is used to show designed and partially implemented concepts.

We use FIPA based framework for our multi-agent system. The investigated scenario is a repeated two player

zero sum symmetric matrix game. We also conducted a study and analyzed the data.

1 INTRODUCTION

A strategic interaction (SI) are events, which happen,

if participants are interested in achieving something

and their success is depending on decisions of other

participants. The specialty of SI (called also game)

in real-world domain is the partial non-appearance

of game theoretical predictions (Tagiew, 2008). Es-

pecially in SI of humans, researchers use special

heuristics to model this deviation. This heuristics

are needed for programing advanced artiﬁcial agents,

who interact with humans. Our view is to construct

a prototype system, which helps to improve and stan-

dardize this heuristics from AI point of view. The sys-

tem is expected to provide and record human-human-

interaction through allocation of proxy agents for hu-

mans, which communicate over network. The behav-

ior of the system is based on the deﬁnition of the in-

teraction rules in special language and it also provides

an ability for deﬁnition of artiﬁcial agents. Artiﬁcial

agents can be tested against humans or each other. A

state-of-art multi-agent architecture is considered for

implementation. Assembling of concepts for build-

ing such a system on a concrete example is the aim

of this paper. Further we assume that the reader is fa-

miliar with game theory. After talking about games

and related works, we present a concept of using of a

multi-agent system (ch. 4) and a concept for deﬁning

SI (ch. 5). Then, we present the study.

2 GAMES

Our concrete example is using of MSE by humans.

For humans, there are two problems to use mixed

strategy equilibrium (MSE) in repeated matrix games.

They are exact calculation of MSE and production

of an identical and independent distributed (iid) se-

quence (F.Camerer, 2003). Humans use a distribution

that is near to MSE and an equal distribution and they

avoid repetitions in the sequence. It is also interesting

to record human behavior in games with not only one

MSE. A set of games similar to Roshambo is chosen,

because they are fast to explain for test persons, fast to

play and have a simple structure for analyze of basic

concepts. These games are symmetric, zero sum, two

player,and the entries of the payoff matrix can be only

1(win), 0(draw) and −1(loss). The only difference to

simple Roshambo is that we use more than three ges-

tures. The payoff matrices of these games can be also

seen as adjacency matrices of directed graphs (set −1

to 0). There are 7 directed graphs for 3 gestures, 42

for 4, 582 for 5 and so on (Harary, 1957). We proved

all graphs for 5 gestures to ﬁnd variants with only one

MSE non-equal distribution. There are 4 such vari-

ants (Fig.1). The ﬁrst variant is an existing American

variant of Roshambo with additional gestures ﬁre and

water. We merged 3 of these graphs with probabili-

ties

and

and a 4-nodes cyclic graph to a graph of 9

nodes. Every node became a plausible name and hand

gesture. The resulting game framework is shown in

Fig.2. The cyclic 4-gesture variant Paper-Mosquito-

Hunter-Monkey has as MSE solution all distributions

on the line between points (0,

, 0,

) and (

, 0,

, 0).

587

Tagiew R. (2009).

TOWARDS A FRAMEWORK FOR MANAGEMENT OF STRATEGIC INTERACTION.

In Proceedings of the International Conference on Agents and Artiﬁcial Intelligence, pages 587-590

DOI: 10.5220/0001536505870590

 SciTePress



















]]:



]]:



]]:

Figure 1: All nontrivial variants for 5 gestures.

Figure 2: Game framework.

3 RELATED WORKS

ACE is a general concept of software architecture for

investigation of economic processes(Tesfatsion and

Judd, 2006). ACE is a design for a multiagent sys-

tem consisting of four kinds of agents: worlds, mar-

kets, ﬁrms and consumers. Worlds and markets man-

age environmental data like ownership and current

prices. GDL is a logic-based language for deﬁnition

of a server for managing games between multiple pro-

grams(Genesereth et al., 2005). Finite games of per-

fect information are deﬁnable in GDL. Then, there is

a state-of-art games solving program GAMBIT.

4 FRAMASI

JADE (Bellifemine et al., 2001) is used for our frame-

work for management of SI (FRAMASI). Three types

of agents are deﬁned: periphery agents, worlds and

participants (similar to ACE). A world represents the

SI itself. Participants search for worlds and then con-

tract with a world, about how and at which place

to participate. After closing contracts worlds supply

the associated participants with (partial) information

about the state of the interaction. Participants send

their actions to the world. Worlds die after fullﬁlled

interaction. A couple of periphery agents add some

extended abilities to our system like gesture recogni-

tion.

5 RUNNING PETRI NET

The idea for a deﬁnition language for practical rep-

resentation of SI in ﬁnite and discrete domains is us-

ing petri nets (PN). PN are usefull for representing

distributed processes (Priese and Wimmel, 2008). A

PN is a labelled directed graph, formally represented

through tuple:

PN = (P, Q, F, W, M) (1)

It consists of places P (circles), which can be ﬁlled

with a positive amount of tokens (dots), and transi-

tions Q (rectangles), which can be ﬁred with an effect

on places. Places and transitions are disjunct P∩ Q =

⊘. Places can not be connected to places and transi-

tion not to transitions. Arcs F ⊆ (P×Q)∪(Q×P) are

weighted with positive natural numbers W : F → N

A transition can not be ﬁred, when any place of in-

coming arcs have less tokens than its arc weight. Fir-

ing abolishes tokens of every incoming arc and pro-

duces tokens for every outgoing arc according to their

weights. The ﬁll levels of the places represent the

state of the petri net. M ∈ N

|P|

is the current state

of PN. To model SI, PN is extended with tuple:

SI = (I, C, N, D, A, O, H, B) (2)

I is a set of agents, empty element ε stands for nature

or world accordingly. C ⊆ Q

∗

is a subset of sequences

of transitions, called choice sets. Every transition is

a member of only one element of C. N : C → N is a

numbering function and not injective. D : N → (R

)

is a function for ﬁring probability distribution in

a choice set, where

∑

(D( )) = 1. n is number of

elements of the related choice sets. O : N → I ∪ ε

denotes ownership. A : Q → R

|I|

is the payoff vector

of a transition, if it ﬁres. H : P → {I} provides for

every place a subset of agents for which it is hidden.

Agents can alter D for own numbers and see all

unhidden places. B : I → R is the current account

balance of agents.

Fig.3 demonstrates a small application of this

formalism for a repeated game ”matching pennies”. It

is representative for our scenario. Dashed diamonds

are taken for the gathered payoff of players. Dashed

boxes are choice sets. Every transition, which is not

in dashed box, is in a single element choice set and

has a ﬁring probability of 1. Altering of D must be

done at speciﬁc time, because it has a default state

and the transitions ﬁring algorithm does not wait

for agents decisions. The places Ready and Time

shows to the participants, if they have or have not to

react. The weight ’latency’ between Time and Timer

represents the time period for simultaneous decisions.

ICAART 2009 - International Conference on Agents and Artificial Intelligence

588

Timer

Payoff4

Payoff1

Payoff2

Payoff3

Alice

Bob

Go1

Ready

Rounds

Go2

latency

tick

time

Alice

Bob

buffer

latency

Figure 3: Matrix Game using PNSI.

6 CONDUCTED STUDY

The feature of gesture recognition is added to our sce-

nario in sense of making it more situated and sim-

ilar to real-world physical activities like sport com-

petitions. We used a cyber-glove and a classiﬁer for

recognition of gesture (Heumer et al., 2008). The

study was conducted on a thread of 200 one-shot

games of 7 kinds (41 to 61 tab.1) using our game

framework (ﬁg.2). A player had a delay for consid-

eration of 6 sec for every shot. If he did not react,

the last or default gesture was chosen. A thread lasted

200 ∗ 6 sec = 20 min and had following structure -

30 times #31, 30x #41, 30x #51, 30x #52, 30x #53,

30x #54 and ﬁnally 20x #61. 10 test persons (com-

puter science undergrates in average 22, 7 years age

and 0.7 male) were recruited, had to play this thread

twice against another test person. In this way we gath-

ered 2000 one-shot games or 4000 single human de-

cisions. Every person got e 0.02 for a winned one-

shot game and e 0.01 for a draw. A winned thread

given additionally e 1 and a drawn e 0.5. Summa-

rized, a person could gain between e 0 and e 10 or in

average e 5. The persons, who played against each

other, sat in two separated rooms. One of the players

used the cyber-glove

and the another a mouse as in-

put for gestures. The proxy agent for a human player

has following features on his gui - own last and actual

choice, opponents last choice, graph of actual game,

timer and already gained money. According to state-

ments of the persons, they had no problems to under-

stand the game rules and to choose a gesture timely.

All winners and 80% of losers attested, that they had

fun to play the game.

except of one participant, who had to play both threads

with mouse, because of mismatch of cyber-glove size to her

hand

Table 1: MSE and observed distributions for all games.

31 Rock Paper Scissors

MSE .3333 .3333 .3333

All .36 .335 .305

Glove .3222 .3407 .3370

Mouse .3909 .3303 .2788

41 Hunter Monkey Paper Mosquito

MSE from 0 .5 0 .5

MSE to .5 0 .5 0

All .2783 .2533 .23 .2383

Glove .2704 .2407 .2444 .2444

Mouse .2848 .2636 .2182 .2333

51 Rock Paper Scissors Fire Water

MSE .1111 .1111 .1111 .3333 .3333

All .1867 .1533 .13 .3267 .2033

Glove .2222 .1555 .1259 .3 .1963

Mouse .1576 .1515 .1333 .3485 .2091

52 Rock Paper Monkey Fire Water

MSE .1429 .1429 .1429 .2857 .2857

All .1783 .155 .1767 .2983 .1917

Glove .1593 .1481 .2 .3 .1926

Mouse .1939 .1606 .1576 .297 .1909

53 Rock Banana Monkey Paper Hunter

MSE .1429 .1429 .1429 .2857 .2857

All .2133 .1417 .2433 .1883 .2133

Glove .1704 .1481 .2555 .2074 .2185

Mouse .2485 .1364 .2333 .1727 .2091

54 Hunter Monkey Paper Mosq. Banana

MSE f. 0 .5 0 .5 0

MSE t. .25 .25 .25 .25 0

All .1833 .3217 .1583 .1833 .1533

Glove .1704 .2963 .1481 .2148 .1704

Mouse .1939 .3424 .1667 .1576 .1394

61 Rock Paper Sciss. Monk. Fire Water

f 0.1111 .1111 .1111 0 .3333 .3333

t .1429 .1429 0 .1429 .2857 .2857

A .185 .0975 .0925 .12 .265 .24

G .2222 .0667 .0944 .1056 .2778 .2333

M .1545 .1227 .0909 .1318 .2545 .2455

7 RESULTS OF THE STUDY

Tab.1 presents the tables of theoretical and played dis-

tributions for every investigated game. For a game,

there are 10 ∗ 2 ∗ 30 = 600 and 10 ∗ 2∗ 20 = 400 for

61 recorded human decisions. In games 51 till 53,

at least one probability deviates with signiﬁcance far

below 1h from MSE towards equal distribution. The

high probability of ﬁre in 51 can be explained as best

response on equal distribution of the opponent. In

game 41, people prefer the middle one of extreme so-

lutions. But it can not be cleanly demonstrated on

games 54 and 61, because of consequences from the-

sis 1. Distribution during playing with glove deviates

from which with mouse. This thesis can be demon-

TOWARDS A FRAMEWORK FOR MANAGEMENT OF STRATEGIC INTERACTION

589

Table 2: Patterns probabilities for 31 and 41.

31 41 Bridges

Pattern The. Obs. The. Obs. 0 1 2

ABCD 0 0 .0938 .2611 0 0 0

ABCA .0741 .15 .0938 .1519 0 0 1

ABCB .0741 .1315 .0938 .0926 0 1 0

ABAC .0741 .1278 .0938 .0926 0 1 0

ABCC .0741 .0667 .0938 .087 1 0 0

ABBC .0741 .0796 .0938 .063 1 0 0

AABC .0741 .0685 .0938 .0722 1 0 0

ABBA .0741 .0611 .0469 .037 1 0 1

ABAB .0741 .0593 .0469 .0333 0 2 0

AABA .0741 .063 .0469 .0222 1 1 1

ABAA .0741 .0574 .0469 .0204 1 1 1

AABB .0741 .0463 .0469 .0222 2 0 0

ABBB .0741 .0352 .0469 .0204 2 1 0

AAAB .0741 .0352 .0469 .0185 2 1 0

AAAA .037 .0185 .0156 .0056 3 2 1

strated only in 5 of 7 games, where at least one of

probabilities for playing with glove signiﬁcantly de-

viates from case, if it is played with mouse. But, there

is no plausible explanation for this deviation at time.

We counted tuples, which consists of own actual deci-

sion plus own three last decisions plus opponents last

three decisions. In game 31 for example, we have 4

tuples, each of them is observed 4 times. As exam-

ple, the tuple {rock, {rock, paper, scissors}, {paper,

rock, paper}} was observed in threads of four defer-

ent persons. The probability for one tuple in 31 to be

observed more than 3 times is about 0.126h. Persons

play in a rhythm which is far from iid. Further they

transform in consequence of this deviation a repeated

matrix game to an extensive game, in which one must

consider his turns in dependence of own and oppo-

nents last turns or in common speech ﬁll the rhythm

of the game.

The latter brought us to calculate probabilistic

grammatic of human behavior in games 31 and 41.

We used methods of (Budescu and Rapoport, 1994).

Variables A, B, C, D, . . . ∈ gestures have all different

values. Patterns are abstract types of sequences and

constructed of these variables. For instance, the se-

quences {paper,rock} and {ﬁre,water} are both of

pattern AB. We used patterns with length 4, because

of limited size of our data. We calculated theoreti-

cal probabilities of these patterns according to thesis,

that humans used equal distributions in games 31 and

41. Per game we can count (30 − 3) ∗ 2 ∗ 10 = 540

sequences and assign them to patterns. Additionally

we counted bridges (variable repetition) over 0, 1, 2

elements in the patterns for possible explanation of

deviation. Tab.2 shows the results.

8 CONCLUSIONS

Our paper showed a practical way for construction

of a highly scaleable multi-agent system for deﬁni-

tion, providing and recording of SI between real-

world agents, human and artiﬁcial respectively. Espe-

cially, we presented a multi-agent system design and

a PN based language. It produced data for a concrete

scenario - using MSE in zero sum by humans. The

recorded data was analyzed and showed results simi-

lar to already alluded in literature. Core elements of

our system can be downloaded from our home page.

REFERENCES

Bellifemine, F., Poggi, A., and Rimassa, G. (2001). Jade: a

ﬁpa2000 compliant agent development environment.

In AGENTS ’01. ACM.

Budescu, D. V. and Rapoport, A. (1994). Subjective ran-

domization in one- and two-person games. Journal of

Behavioral Decision Making, 7.

F.Camerer, C. (2003). Behavioral Game Theory. Princeton

University Press, New Jersey.

Genesereth, M. R., Love, N., and Pell, B. (2005). General

game playing: Overview of the aaai competition. AI

Magazine, 26(2):62–72.

Harary, F. (1957). The number of oriented graphs. Michigan

Math. J., 4.

Heumer, G., Amor, H. B., and Jung, B. (2008). Grasp recog-

nition for uncalibrated data gloves: A machine learn-

ing approach. Presence: Teleoper. Virtual Environ.,

17:121–142.

Priese, L. and Wimmel, H. (2008). Petri-Netze. Springer.

Tagiew, R. (2008). Simplest scenario for mutual nested

modeling in human-machine-interaction. In Advances

in Artiﬁcial Intelligence. Heidelberg, Springer.

Tesfatsion, L. and Judd, K. (2006). Handbook of computa-

tional economics, volume 2. Elsevier/North-Holland.

ICAART 2009 - International Conference on Agents and Artificial Intelligence

590