Market Impact in Trader-agents: Adding Multi-level Order-flow
Imbalance-sensitivity to Automated Trading Systems
Zhen Zhang (a) and Dave Cliff (b)
Department of Computer Science, University of Bristol, Bristol BS8 1UB, U.K.
Keywords:
Market Impact, Adaptive Trader Agents, Financial Markets, Multi-Agent Systems.
Abstract:
Financial markets populated by human traders often exhibit so-called “market impact”, where the prices
quoted by traders move in the direction of anticipated change, before any transaction has taken place, as
an immediate reaction to the arrival of a large (i.e., “block”) buy or sell order in the market: traders in the
market know that a block buy order is likely to push the price up, and that a block sell order is likely to push
the price down, and so they immediately adjust their quote-prices accordingly. In most major financial markets
nowadays very many of the participants are “robot traders”, autonomous adaptive software agents, rather than
humans. This paper addresses the question of how to give such trader-agents a reliable anticipatory sensitivity
to block orders, such that markets populated entirely by robot traders also show market-impact effects. This
is desirable because impact-sensitive trader-agents will get a better price for their transactions when block
orders arrive, and because such traders can also be used for more accurate simulation models of real-world
financial markets. In a 2019 publication Church & Cliff presented initial results from a simple deterministic
robot trader, called ISHV, which was the first such trader-agent to exhibit this market impact effect. ISHV
does this via monitoring a metric of imbalance between supply and demand in the market. The novel contri-
butions of our paper are: (a) we critique the methods used by Church & Cliff, revealing them to be weak, and
argue that a more robust measure of imbalance is required; (b) we argue for the use of multi-level order-flow
imbalance (MLOFI: Xu et al., 2019) as a better basis for imbalance-sensitive robot trader-agents; and (c) we
demonstrate the use of the more robust MLOFI measure in extending ISHV, and also the well-known AA
and ZIP trading-agent algorithms (which have both been previously shown to consistently outperform human
traders). Our results demonstrate that the new imbalance-sensitive trader-agents introduced in this paper do
exhibit market impact effects, and hence are better-suited to operating in markets where impact is a factor of
concern or interest, but do not suffer the weaknesses of the methods used by Church & Cliff. We have made
the source-code for our work reported here freely available on GitHub.
1 INTRODUCTION
Financial markets populated by human traders of-
ten exhibit so-called market impact, where the prices
quoted by traders shift in the direction of antici-
pated change, as a reaction to the arrival of a large
(i.e., “block”) buy or sell order for a particular as-
set: that is, mere knowledge of the presence of the
block order is enough to trigger a change in the
traders’ quote-prices, before any transaction has ac-
tually taken place, because the traders know that a
block buy order is likely to push the price of the asset
up, and a block sell order is likely to push the price
down, and so they adjust their quote-prices accordingly,
in anticipation of the shift in price that they
foresee coming as a consequence of the block-trade
completing. This is bad news for the trader trying to
buy or sell a block order: the moment she reveals her
intention to buy a block, the market-price goes up;
the moment she reveals her intention to sell, the price
goes down. From the perspective of a block-trader,
the market price moves against her, whether she is
buying or selling, and this happens not because of the
price she is quoting, but because of the quantity that
she is attempting to transact.
(a) https://orcid.org/0000-0002-4618-6934
(b) https://orcid.org/0000-0003-3822-9364
Zhang, Z. and Cliff, D.
Market Impact in Trader-agents: Adding Multi-level Order-flow Imbalance-sensitivity to Automated Trading Systems.
DOI: 10.5220/0010391004260436
In Proceedings of the 13th International Conference on Agents and Artificial Intelligence (ICAART 2021) - Volume 2, pages 426-436
ISBN: 978-989-758-484-8
Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
Block-traders’ collective desire to avoid market
impact has long driven the introduction of automated
trading techniques such as “VWAP engines” (which
break block orders into a sequence of smaller sub-orders
that are released into the market over a set period
of time, with the intention of achieving a specific
volume-weighted average price, hence VWAP), and
has also driven the design of major new electronic ex-
changes such as London Stock Exchange’s (LSE’s)
Turquoise Plato trading venue (London Stock Ex-
change Group, 2019), in which block-traders are al-
lowed to obscure the size of their blocks in a so-called
dark pool market, with LSE’s automated matching
engine identifying one or more willing counterparties
and only making full details of the block-trade known
to all market participants after it has completed: see
(Petrescu and Wedow, 2017) for further discussion.
Many of the world’s major financial markets now
have very high levels of automated trading: in such
markets most of the participants, the traders, are
“robots” rather than humans: i.e., software systems
for automated trading, empowered with the same le-
gal sense of agency as a human trader, and hence
“software agents” in the most literal sense of that
phrase. Given that these software agents typically
replace more than one human trader, and given that
those human traders were widely regarded to have re-
quired a high degree of intelligence (and remunera-
tion) to work well in a financial market, it is clear that
the issue of designing well-performing robot traders
presents challenges for research in agents and artifi-
cial intelligence, and hence is a research topic that is
central to the themes of the ICAART conference.
This paper addresses the question of how best to
give robot traders an appropriate anticipatory sensitivity
to large orders, such that markets populated entirely
by robot traders also show market-impact effects.
This is desirable because the impact-sensitive
robot traders will get a better price for their transactions
when block orders do arrive, and also because
simulated markets populated by impact-sensitive automated
traders can be studied to explore the pros
and cons of various impact-mitigation or avoidance
techniques. We show here that well-known and long-
established trader-agent strategies can be extended by
giving them appropriately robust sensitivity to the im-
balance between buy and sell orders issued by traders
on the exchange, orders that are aggregated on the
market’s limit order-book (LOB), the data-structure
at the heart of most electronic exchanges.
To the best of our knowledge, the first paper to
report on the use of an imbalance metric to give auto-
mated traders impact-sensitivity was the recent 2019
publication by Church & Cliff, in which they demon-
strated how a minimal nonadaptive trader-agent called
Shaver (which, following the convention of practice
in this field, is routinely referred to in abbreviated
form via a pseudo-ticker-symbol: “SHVR”) could be
extended to show impact effects by addition of an im-
balance metric, and Church & Cliff gave the name
ISHV to their Imbalance-SHVR trader-agent (Church
and Cliff, 2019). SHVR is a trader-agent strategy
built-in to the popular open-source financial exchange
simulator called BSE (BSE, 2012), which Church &
Cliff used as the platform for their research. Al-
though Church & Cliff deserve some credit for the
proof-of-concept that ISHV provides, we argue here
that the imbalance-metric they employed is too frag-
ile for practical purposes because very minor changes
in the supply and demand can cause their metric to
swing wildly between the extremes of its range. One
of the major contributions of our paper here is the
demonstration that a much better, more robust, metric
known as multi-level order-flow imbalance (MLOFI)
can be used instead of the comparatively very frag-
ile metric proposed by Church & Cliff. Another ma-
jor contribution of this paper is our demonstration
of the addition of MLOFI-based impact-sensitivity to
the very well-known and widely cited public-domain
adaptive trader-agent strategies ZIP (Cliff, 1997) and
AA (Vytelingum, 2006; Vytelingum et al., 2008).
Although our primary aim was to add impact sen-
sitivity to these two machine-learning-based trader-
agent strategies, we also demonstrate in this paper that
ISHV can be altered/extended to use MLOFI, and our
improvement of Church & Cliff's work in that regard
is an additional contribution of this paper.
The extended versions of the AA, ZIP, and ISHV
trader-agent strategies that we introduce here are
named ZZIAA, ZZIZIP, and ZZISHV respectively.
In this paper, after our criticism of Church & Cliff's
methods, we describe more mathematically sophisticated
approaches to measuring imbalance, which are
more robust, and which we incorporate into our agent
extensions. We then present results from testing our
extended trader-agents on BSE, the same platform
that was used in Church & Cliff's work. Full details
of the work reported here are available in (Zhang,
2020b), and all the relevant source-code has been
made freely available on GitHub (Zhang, 2020a).
Section 2 of this paper presents an overview of
relevant background material: readers already famil-
iar with automated trading systems and contemporary
electronic financial exchanges can safely skip ahead
straight to Section 3. In Section 3 we give a brief summary
of Church & Cliff's 2019 work and then provide
our detailed critique of their core method, which we
demonstrate to be significantly lacking in robustness,
and we then describe our MLOFI approach in detail.
Then Section 4 is where we describe the steps taken
to add MLOFI-based impact-sensitivity to ZIP, AA,
and ISHV; and the results from those extended trad-
ing algorithms are presented in Section 5.
2 BACKGROUND
Since the mid-1990s researchers in universities and
in the research labs of major corporations such as
IBM and Hewlett-Packard have published details of
various strategies for autonomous trader-agents, of-
ten incorporating AI and/or machine learning (ML)
methods so that the automated trader can adapt its
behaviors to prevailing market conditions. Notable
trading strategies in this body of literature include:
Kaplan’s “Sniper” strategy (Rust et al., 1992); Gode
& Sunder’s ZIC (Gode and Sunder, 1993); the ZIP
strategy developed at Hewlett-Packard (Cliff, 1997);
the GD strategy reported by Gjerstad & Dickhaut
(Gjerstad and Dickhaut, 1997); the MGD and GDX
automated traders developed by IBM researchers
(Tesauro and Das, 2001; Tesauro and Bredin, 2002);
Gjerstad’s HBL (Gjerstad, 2003); Vytelingum’s AA
(Vytelingum, 2006; Vytelingum et al., 2008); and
the Roth-Erev approach (see e.g. (Pentapalli, 2008)).
However, for reasons discussed at length in a recent
review of key papers in the field (Snashall and Cliff,
2019) this sequence of publications concentrated on
the issue of developing trading strategies for orders
that all had the same fixed-size quantity, and that
quantity was always one. That is, none of the key pa-
pers listed here deal with trading strategies for outsize
block orders, and none of them directly explore the is-
sue of how an automated trader can best deal with, or
avoid, market impact.
Trader-agent strategies such as Sniper, ZIC, ZIP,
GD and MGD were all developed to operate in elec-
tronic markets that were based on old-school open-
outcry trading pits, as were common on major fi-
nancial exchanges until face-to-face human-to-human
bargaining was replaced by negotiation of trades via
electronic communication media; but more recent
work has concentrated on developing trading agents
that issue bids and asks (i.e. quotations for orders
to buy or to sell) to a centralised electronic exchange
(such as a major stock-market like NYSE or NAS-
DAQ or LSE) where the exchange’s matching engine
then either matches the trader’s quote with a willing
counterparty (in which case a transaction is recorded
between the two counterparties, the buyer and the
seller) or the quote is added to a data-structure called
the Limit Order Book (LOB) that is maintained by
the exchange and published to all traders whenever
it changes. The LOB aggregates and anonymises all
outstanding orders: it has two sides or halves: the bid-
side and the ask-side. Each side of the book shows
a summary of all outstanding orders, arranged from
best to worst: this means that the bid-side is arranged
in descending price-order, and the ask side is arranged
in ascending price-order, such that at the “top of the
book” on the two sides the best bid and ask are visi-
ble. For all orders currently sat on the LOB, if there
are multiple orders at the same price then the quanti-
ties of those orders are aggregated together, and often
multiple orders at the same price will be later matched
with a counterparty in a sequence given by the orders’
arrival times, in a first-in-first-out fashion. The public
LOB shows only, for each side of the book, the prices
at which orders have been lodged with the exchange,
and the total quantity available at each of those prices:
if no orders are resting at the exchange for a particular
price, then that price is usually omitted from the LOB
rather than being shown with a corresponding quan-
tity of zero. Illustrations of LOBs appear later in this
paper, commencing with Figure 2.
The difference between the price of the best bid
on the LOB at time t and the price of the best ask at t
is known as the spread. The mid-point of the spread
(i.e. the arithmetic mean of the best bid and the best
ask) is known as the mid-price, which is denoted here
by $P_{\mathrm{mid}}$. The mid-price is very commonly used as a
single-valued statistic to summarise the current state
of the market, and as an estimate of what the next
transaction price would be. However, the mid-price
pays no attention to the quantities that are bid and of-
fered. If the current best bid is for a quantity of one at
a price of $10 and the current best ask is for a quantity
of 200 at a price of $20 then the mid-price is $15 but
that fails to capture that there is a much larger quantity
being offered than being bid: basic microeconomics,
the theory of supply and demand, would tell even
the most casual observer that, with such heavy selling
pressure, actual transaction prices are likely
to trend down, in which case the mid-price of $15
is likely to be an overestimate of the next transaction
price. Similarly, if the bid and ask prices remain the
same but the imbalance between supply and demand
is instead reversed, then the fact that there is a re-
vealed desire for 200 units to be purchased but only
one unit on sale at the current best ask would surely
be a reasonable indication that transaction prices are
likely to be pushed up by buying pressure, in which
case the mid-price of $15 will turn out to be an un-
derestimate. This lack of quantity-sensitivity in the
mid-price calculation leads many market practitioners
to instead monitor the micro-price, denoted here
by $P_{\mathrm{micro}}$, which is a quantity-weighted average of the
best bid and best ask prices, and which does move in
the direction indicated by imbalances between supply
and demand at the top of the LOB: see, e.g., (Cartea
et al., 2015).
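As a minimal sketch of the difference between the two statistics, the Python below computes the mid-price and a micro-price for the two examples just described. The cross-weighted form of the micro-price used here (each price weighted by the quantity resting on the opposite side of the book) is an assumption: the text above only says that the micro-price is a quantity-weighted average of the best bid and best ask. The function names are illustrative and are not taken from BSE.

```python
def mid_price(best_bid, best_ask):
    """Arithmetic mean of the best bid and best ask prices."""
    return (best_bid + best_ask) / 2.0


def micro_price(best_bid, bid_qty, best_ask, ask_qty):
    """Quantity-weighted average of the best bid and best ask.

    Assumed form: each price is weighted by the quantity on the
    opposite side, so heavy selling pulls the result towards the bid.
    """
    total = bid_qty + ask_qty
    return (bid_qty * best_ask + ask_qty * best_bid) / total


# Heavy selling pressure: bid 1 @ $10, ask 200 @ $20.
print(mid_price(10, 20))                 # 15.0
print(micro_price(10, 1, 20, 200) < 15)  # True: pulled below the mid-price
# Heavy buying pressure: bid 200 @ $10, ask 1 @ $20.
print(micro_price(10, 200, 20, 1) > 15)  # True: pulled above the mid-price
```

Under this assumed weighting the micro-price moves towards the side with less resting quantity, which is the direction in which the imbalance is expected to push transaction prices.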
To the best of our knowledge the first impact-
sensitive trading algorithm was ISHV (Church and
Cliff, 2019). ISHV is based on the SHVR trader
built into the popular BSE public-domain financial-
market simulator (BSE, 2012; Cliff, 2018). A SHVR
trader simply posts the buy/sell order with its price set
one penny higher/lower than the current best bid/ask.
This single instruction gives it a parasitic nature, in
the sense that it can mimic the price-convergence be-
haviour of other strategies being used by other traders
in the market.
Instead of shaving the best bid or offer by one
penny, Church & Cliff's ISHV trader chooses
to shave by an amount $\Delta s$ which varies with $\Delta m$,
defined in Equation 1:
$$\Delta m = P_{\mathrm{micro}} - P_{\mathrm{mid}} \tag{1}$$
The difference between the micro-price and the mid-price
can identify the degree of supply/demand imbalance
to a useful extent. If $\Delta m \approx 0$, there is no obvious
imbalance in the market. If $\Delta m < 0$, then the quantities
of the best bid and the best offer on the LOB indicate
that supply exceeds demand and the subsequent transaction
prices are likely to decrease; whereas $\Delta m > 0$
indicates that demand outweighs supply and subsequent
transaction prices will have an upward tendency.
The pseudocode for ISHV is shown in Figure 1.
It implements a function that maps from $\Delta m$ to $\Delta s$ to
determine how much it will shave off its price. For a
buyer, if $\Delta m < 0$, it knows the price will shift in its
favour and shaves its price as little as possible (by the
exchange's minimum tick-size $\Delta p$, often one penny
or one cent). However, if $\Delta m > 0$, ISHV
“believes” the later prices will be worse and attempts
to shave a large amount off ($C\Delta p + M\Delta m\Delta p$). $C$ and
$M$ are two constants that determine the SHVR's response
to the imbalance (they are the y-intercept and
gradient for a linear response function; nonlinear response
functions could be used instead). Church &
Cliff showed that ISHV can identify and respond appropriately
to the presence of a block order signal at
the top of the LOB.
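The buyer-side response described above can be sketched as follows. This is a reading of the prose only, not a transcription of the pseudocode in Figure 1: the default values of `c` and `m` are hypothetical, the names `ishv_shave` and `ishv_bid` are invented for illustration, and treating an exactly zero imbalance as the minimum-shave case is an assumption.

```python
def ishv_shave(delta_m, tick=1, c=2, m=1):
    """Amount a buying ISHV adds to the best bid.

    tick is the minimum tick-size (delta-p); c and m are the intercept
    and gradient of the linear response. Defaults are hypothetical.
    """
    if delta_m > 0:
        # Demand outweighs supply: prices are expected to rise,
        # so shave by the larger amount C*dp + M*dm*dp.
        return c * tick + m * delta_m * tick
    # No imbalance (assumed), or supply exceeds demand:
    # shave by the minimum tick only.
    return tick


def ishv_bid(best_bid, delta_m, **kwargs):
    """Buyer's quote: improve on the best bid by the shave amount."""
    return best_bid + ishv_shave(delta_m, **kwargs)


print(ishv_bid(100, -3.0))  # 101: minimal one-tick shave
print(ishv_bid(100, 4.0))   # 106.0: 2*1 + 1*4.0*1 above the best bid
```

A seller would mirror this logic, undercutting the best ask instead of improving on the best bid.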
Church & Cliff were careful to flag their ISHV
trader as only a proof-of-concept (PoC): ISHV was
developed to enable the study of coupled lit/dark trading
pools such as the LSE Turquoise Plato system in commercial
operation in London, as mentioned in the Introduction
to this paper. Without impact-sensitive
trader-agents, it is not possible to build agent-based
models of contemporary real-world trading venues
such as LSE Turquoise Plato. Having experimented
further with Church & Cliff's PoC system, we came
to realise that there are severe limitations in ISHV as
described in Figure 1: these limitations stem from the
fact that Equation 1, which is at the heart of ISHV,
uses values only found at the top of the LOB. That
is, Equation 1 involves only the price and quantity of
the best bid and the best ask. As we will demonstrate
in the next section, this makes the method introduced
by Church & Cliff so fragile that it is unlikely to be
usable in anything but the simplest of simulation studies:
for real-world markets it is necessary to look deeper
into the LOB, to go beyond the top of the LOB.
Figure 1: Pseudocode for the bidding behavior of ISHV,
sourced from (Church and Cliff, 2019).
3 FRAGILITY OF THE LOB-TOP
For brevity, we will limit ourselves here to presenting
a qualitative illustrative example which demonstrates
how wildly fragile the Church & Cliff method is. For
a longer and more detailed discussion, see Chapter 3
of (Zhang, 2020b).
Consider a situation in which the top of the LOB
has a best bid price of $10 and a best ask price of $20,
as before, and where the quantity at the best bid is
200 and at the best ask is 1. As we explained in the
previous section, this huge imbalance between supply
and demand at the top of the book indicates that the
excess demand is likely to push transaction prices up
in the immediate future. Church & Cliff's ISHV does
the right thing in this situation.
Now consider what happens if the next order to
arrive at the exchange is a bid for $11 at a quantity of
1. Because this fresh bid is at a higher price than the
current best bid, it is inserted at the top of the bid-side
of the LOB. The previous best-bid, for 200 at $10,
gets shuffled down to the second layer of the LOB.
At that point, the best bid and the best ask each show
a quantity of one, and so ISHV acts as if there is no
imbalance in the market, despite the fact that, when viewing
the whole LOB, it is clear that the quantity bid is
now 201 (i.e. 1 at $11 and 200 at $10) while the ask
quantity is still only 1: if anything, the imbalance has
increased, but ISHV reacts as if it had disappeared because
ISHV looks only at the top of the LOB.
There is more that could be said, but this should
be enough to convince the reader that any impact-
sensitive trader-agent algorithm that looks only at the
data at the top of the LOB is surely going to get it
wrong very often, because it is ignoring the supply
and demand information, the quantities and the prices,
which lie deeper in the LOB. What we introduce in
the rest of this paper addresses this problem.
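The example in this section can be checked numerically. The sketch below evaluates the top-of-book imbalance of Equation 1 (micro-price minus mid-price) before and after the single-unit bid at $11 arrives, alongside the total quantity bid across the whole book. The cross-weighted micro-price formula is an assumed standard form, and the helper name `top_of_book_imbalance` is illustrative.

```python
def top_of_book_imbalance(best_bid, bid_qty, best_ask, ask_qty):
    """Micro-price minus mid-price, using only the top of the book."""
    mid = (best_bid + best_ask) / 2.0
    # Assumed micro-price: prices weighted by opposite-side quantity.
    micro = (bid_qty * best_ask + ask_qty * best_bid) / (bid_qty + ask_qty)
    return micro - mid


# Bid side as (price, quantity) pairs, best first; one ask of 1 @ $20.
bids_before = [(10, 200)]
bids_after = [(11, 1), (10, 200)]  # a single-unit bid at $11 arrives
ask_price, ask_qty = 20, 1

for bids in (bids_before, bids_after):
    top_price, top_qty = bids[0]
    total_bid_qty = sum(qty for _, qty in bids)
    signal = top_of_book_imbalance(top_price, top_qty, ask_price, ask_qty)
    print(round(signal, 2), total_bid_qty)
# 4.95 200
# 0.0 201
```

The top-of-book signal collapses from 4.95 to zero even though the total quantity bid has risen from 200 to 201.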
4 MEASURING IMBALANCE
A reliable metric is needed to capture the quantity
imbalance between the supply side and the demand
side, at multiple levels in the LOB (i.e., not just the
top) and which can quantitatively indicate how much
the imbalance will affect the market. We first discuss
the Order-Flow Imbalance (OFI) metric introduced by
(Cont et al., 2014) and then describe the extension
of this into a reliable Multi-Level OFI (MLOFI) met-
ric very recently reported by (Xu et al., 2019). After
that, we show how MLOFI can be used to give robust
impact-sensitivity to ISHV (Church and Cliff, 2019),
AA (Vytelingum, 2006; Vytelingum et al., 2008), and
ZIP (Cliff, 1997). AA and ZIP are of particular inter-
est because in previous papers published at IJCAI and
at ICAART it was demonstrated that these two trader-
agent strategies can each reliably outperform human
traders (Das et al., 2001; De Luca and Cliff, 2011b;
De Luca and Cliff, 2011a; De Luca et al., 2011).
4.1 Order Flow Imbalance (OFI)
Cont et al. argued that previous studies modelling impact
are extremely complex, and that instead a single
factor, the order flow imbalance (OFI), can adequately
explain the impact ($R^2 = 67\%$ in their research) (Cont
et al., 2014). They indicated that OFI has a positive
linear relation with mid-price changes, and that the
market depth $D$ is inversely proportional to the slope
of the relationship. OFI means the net order flow at
the bid-side and the ask-side, and the market depth,
$D$, represents the size at each bid/ask quote price.
To calculate the OFI they focused on the “Level 1
order book”, i.e. the best bid and ask at the top of the
LOB. Between any two consecutive events (event$_{n-1}$ and event$_n$),
only one change happens in the LOB (check the conditions
from top to bottom, and from left to right; in
other words, we should compare the change of price
first and, if the price does not change, then compare
the change of quantity). Using $D\uparrow$ and $D\downarrow$ to respectively
denote an increase and a reduction in demand,
and $S\uparrow$ and $S\downarrow$ to denote an increase and a decrease in supply,
Cont et al. had:
$$p^b_n > p^b_{n-1} \ \lor\ q^b_n > q^b_{n-1} \ \Rightarrow\ D\uparrow$$
$$p^b_n < p^b_{n-1} \ \lor\ q^b_n < q^b_{n-1} \ \Rightarrow\ D\downarrow$$
$$p^a_n < p^a_{n-1} \ \lor\ q^a_n > q^a_{n-1} \ \Rightarrow\ S\uparrow$$
$$p^a_n > p^a_{n-1} \ \lor\ q^a_n < q^a_{n-1} \ \Rightarrow\ S\downarrow$$
where $p^b$ is the best bid price; $q^b$ the size at the best
bid price; $p^a$ the best ask price; and $q^a$ the size at
the best ask price. The variable $e_n$ is defined to measure
this tick change between two events (event$_{n-1}$ and
event$_n$), as shown in Equation 2, where $I$ can be regarded
as a Boolean variable.
$$e_n = I_{\{p^b_n \geq p^b_{n-1}\}}\, q^b_n - I_{\{p^b_n \leq p^b_{n-1}\}}\, q^b_{n-1} - I_{\{p^a_n \leq p^a_{n-1}\}}\, q^a_n + I_{\{p^a_n \geq p^a_{n-1}\}}\, q^a_{n-1} \tag{2}$$
The rules for $I$ are as follows, and only one of them
will happen between any two consecutive events:
1. if $p^b$ increases, $e_n = q^b_n$
2. if $p^b$ decreases, $e_n = -q^b_{n-1}$
3. if $p^a$ increases, $e_n = q^a_{n-1}$
4. if $p^a$ decreases, $e_n = -q^a_n$
5. if $p^b$ remains the same and $q^b_n \neq q^b_{n-1}$, $e_n = q^b_n - q^b_{n-1}$
6. if $p^a$ remains the same and $q^a_n \neq q^a_{n-1}$, $e_n = q^a_{n-1} - q^a_n$
If $N(t_k)$ is the number of events during $[0, t_k]$,
then $\mathrm{OFI}_k$ refers to the cumulative effect of $e_n$ that
has occurred over the time interval $[t_{k-1}, t_k]$, as
shown in Equation 3.
$$\mathrm{OFI}_k = \sum_{n=N(t_{k-1})+1}^{N(t_k)} e_n \tag{3}$$
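A minimal sketch of Equations 2 and 3 is given below. It assumes that the Level 1 book is available as a sequence of snapshots, one per event, each stored as a tuple `(bid_price, bid_qty, ask_price, ask_qty)`; that representation, the function names, and the example numbers are all illustrative choices rather than anything prescribed by Cont et al.

```python
def ofi_event(prev, curr):
    """Contribution e_n of one event, from two top-of-book snapshots.

    Each snapshot is (bid_price, bid_qty, ask_price, ask_qty).
    """
    pb0, qb0, pa0, qa0 = prev
    pb1, qb1, pa1, qa1 = curr
    e = 0
    if pb1 >= pb0:   # bid price rose or is unchanged
        e += qb1
    if pb1 <= pb0:   # bid price fell or is unchanged
        e -= qb0
    if pa1 <= pa0:   # ask price fell or is unchanged
        e -= qa1
    if pa1 >= pa0:   # ask price rose or is unchanged
        e += qa0
    return e


def ofi(snapshots):
    """Sum of e_n over consecutive snapshots within one interval."""
    return sum(ofi_event(a, b) for a, b in zip(snapshots, snapshots[1:]))


# Hypothetical sequence: the best bid grows from 5 to 8 units, then
# the best ask (3 @ 95) is consumed and 98 becomes the best ask.
book = [(90, 5, 95, 3), (90, 8, 95, 3), (90, 8, 98, 5)]
print(ofi(book))  # 6: +3 from the bid growing, +3 from the ask leaving
```

When a price is unchanged both indicator terms on that side are active, so the contribution reduces to the change in quantity, matching rules 5 and 6.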
After this, a linear regression equation can be built,
per Equation 4, where $\Delta P_k = (P_k - P_{k-1})/\delta$ and $\delta$ is
the tick size (1 cent in Cont et al.'s experiments), $\beta$ is
the price impact coefficient, and $\varepsilon_k$ is the noise term
mainly caused by contributions from lower levels of
the LOB:
$$\Delta P_k = \beta\,\mathrm{OFI}_k + \varepsilon_k \tag{4}$$
Moreover, Cont et al. stated that the market depth, $D$,
is an important contributing factor to the fluctuations,
and is inversely proportional to mid-price changes.
They defined the average market depth, $AD_k$, in the
“Level 1 order book” as shown in Equation 5; and $\beta$
can be measured by $AD_k$, shown in Equation 6, where
$\lambda$ and $c$ are constants and $v_k$ is a noise term.
$$AD_k = \frac{1}{2\,(N(T_k) - N(T_{k-1}) - 1)} \sum_{n=N(T_{k-1})+1}^{N(T_k)} \left(q^B_n + q^A_n\right) \tag{5}$$
$$\beta_k = \frac{c}{AD_k^{\lambda}} + v_k \tag{6}$$
Given Equations 4 and 6, the relationship between
$\Delta P$, OFI and AD is constructed as seen in
Equation 7. Cont et al. fitted this linear regression
to 21 trading days of data
from 50 randomly chosen US stocks, and obtained an average
$R^2$ of 67%. They demonstrated that OFI is positively
related to the change of mid-price. If OFI > 0,
meaning a net inflow on the bid side or a net outflow
on the ask side, the mid-price has a significantly
increasing momentum, and the higher OFI is, the
more the mid-price will increase. Conversely, if
OFI < 0, meaning a net outflow on the bid side
or a net inflow on the ask side, the mid-price has a
significantly decreasing momentum, and the lower
OFI is, the more the mid-price will decrease.
$$\Delta P_k = \frac{c}{AD_k^{\lambda}}\,\mathrm{OFI}_k + \varepsilon_k \tag{7}$$
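To illustrate how depth scales the effect of a given imbalance, the sketch below evaluates Equations 6 and 7 with the noise terms omitted. The constants `c = 0.5` and `lam = 1.0` and the OFI and depth values are hypothetical, chosen only to make the arithmetic easy to follow; they are not estimates from Cont et al.

```python
def impact_coefficient(avg_depth, c, lam):
    """beta_k from Equation 6, ignoring the noise term v_k."""
    return c / (avg_depth ** lam)


def expected_price_change(ofi_k, avg_depth, c, lam):
    """Delta-P_k (in ticks) from Equation 7, ignoring the noise term."""
    return impact_coefficient(avg_depth, c, lam) * ofi_k


# The same order-flow imbalance of 16 units in a thin and a deep book.
print(expected_price_change(16, 8, c=0.5, lam=1.0))   # 1.0
print(expected_price_change(16, 32, c=0.5, lam=1.0))  # 0.25
```

With these illustrative constants, quadrupling the average depth cuts the expected mid-price move to a quarter for the same imbalance.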
OFI is clearly a useful metric, but it operates only on
values found at the top of the LOB, i.e. the best bid
and ask. In that sense, it is as sensitive to changes
at the top of the book as is the Church & Cliff $\Delta m$
metric. Next we describe how OFI can be extended
to be sensitive to values at multiple levels in the LOB,
which gives us Multi-Level OFI, or MLOFI.
4.2 Multi-level Order Flow Imbalance
Fortunately, (Xu et al., 2019) demonstrated how to
measure multi-level order flow imbalance (MLOFI).
A quantity vector, $\mathbf{v}$, is used to record the OFI at
each discrete level in the LOB: see Equation 8,
where $m$ denotes the depth of the price level in the LOB.
The level-$m$ bid-price refers to the $m$-th highest price
among bids in the LOB, and the level-$m$ ask-price
refers to the $m$-th lowest price among asks in the LOB.
$$\mathbf{v} = \begin{pmatrix} \mathrm{MLOFI}^1 \\ \mathrm{MLOFI}^2 \\ \vdots \\ \mathrm{MLOFI}^m \end{pmatrix} \tag{8}$$
The time when the $n$-th event occurs is denoted by $\tau_n$;
$p^m_b(\tau_n)$ signifies the level-$m$ bid-price; $p^m_a(\tau_n)$ denotes
the level-$m$ ask-price; $q^m_b(\tau_n)$ refers to the total quantity
at the level-$m$ bid-price, and $q^m_a(\tau_n)$ refers to the
total quantity at the level-$m$ ask-price.
Similar to the OFI defined in Section 4.1, the
level-$m$ OFI between two consecutive events occurring
at times $\tau_s$ and $\tau_n$ ($s = n - 1$) can be calculated
as follows:
$$W^m(\tau_n) = \begin{cases} q^m_b(\tau_n), & \text{if } p^m_b(\tau_n) > p^m_b(\tau_s) \\ q^m_b(\tau_n) - q^m_b(\tau_s), & \text{if } p^m_b(\tau_n) = p^m_b(\tau_s) \\ -q^m_b(\tau_s), & \text{if } p^m_b(\tau_n) < p^m_b(\tau_s) \end{cases} \tag{9}$$
and
$$V^m(\tau_n) = \begin{cases} -q^m_a(\tau_s), & \text{if } p^m_a(\tau_n) > p^m_a(\tau_s) \\ q^m_a(\tau_n) - q^m_a(\tau_s), & \text{if } p^m_a(\tau_n) = p^m_a(\tau_s) \\ q^m_a(\tau_n), & \text{if } p^m_a(\tau_n) < p^m_a(\tau_s) \end{cases} \tag{10}$$
where $W^m(\tau_n)$ measures the order flow imbalance of
the bid side at level $m$ and $V^m(\tau_n)$ measures the
order flow imbalance of the ask side at level $m$.
From Equations 9 and 10, we can get the MLOFI
at level $m$ over the time interval $[t_{k-1}, t_k]$:
$$\mathrm{MLOFI}^m_k = \sum_{\{n \mid t_{k-1} < \tau_n < t_k\}} e^m(\tau_n) \tag{11}$$
where
$$e^m(\tau_n) = W^m(\tau_n) - V^m(\tau_n) \tag{12}$$
We now give four illustrative examples of the
MLOFI mechanism in action. Figure 2 shows the state
of the LOB at time $t_{k-1}$. In each example only one
event occurs during the time interval $[t_{k-1}, t_k]$, and
we consider only the 3-level OFI.
Figure 2: The LOB at time $t_{k-1}$.
4.2.1 Case 1: New Order at Level-1 of the LOB
A new buy order comes into the LOB and occupies
the best-bid position, as shown in Figure 3.
Figure 3: The LOB at time $t_k$: a new buy order comes.
Level-1: since $p^1_b(t_k) > p^1_b(t_{k-1})$ (i.e. 93 > 90),
$\mathrm{MLOFI}^1_k = q^1_b(t_k) = 5$;
Level-2: since $p^2_b(t_k) > p^2_b(t_{k-1})$ (i.e. 90 > 87),
$\mathrm{MLOFI}^2_k = q^2_b(t_k) = 7$;
Level-3: since $p^3_b(t_k) > p^3_b(t_{k-1})$ (i.e. 87 > 82),
$\mathrm{MLOFI}^3_k = q^3_b(t_k) = 2$.
So, the quantity vector $\mathbf{v}_k$ is:
$$\mathbf{v}_k = \begin{pmatrix} 5 \\ 7 \\ 2 \end{pmatrix} \tag{13}$$
All three numbers in $\mathbf{v}_k$ are positive, which indicates
an upward trend in the price.
4.2.2 Case 2: Partial Fulfillment or Cancellation
A new sell limit order crosses the spread, or a buy
limit order at the best-bid position is cancelled. Figure 4
shows the resultant LOB.
Figure 4: The LOB at time $t_k$: crossing the spread or a buy
order cancellation.
For level-1, as $p^1_b(t_k) = p^1_b(t_{k-1})$ (i.e. 90 = 90),
$\mathrm{MLOFI}^1_k = q^1_b(t_k) - q^1_b(t_{k-1}) = 2 - 5 = -3$;
For level-2, as $p^2_b(t_k) = p^2_b(t_{k-1})$ (i.e. 87 = 87),
$\mathrm{MLOFI}^2_k = q^2_b(t_k) - q^2_b(t_{k-1}) = 2 - 2 = 0$;
For level-3, as $p^3_b(t_k) = p^3_b(t_{k-1})$ (i.e. 82 = 82),
$\mathrm{MLOFI}^3_k = q^3_b(t_k) - q^3_b(t_{k-1}) = 0$.
So, the quantity vector $\mathbf{v}_k$ is:
$$\mathbf{v}_k = \begin{pmatrix} -3 \\ 0 \\ 0 \end{pmatrix} \tag{14}$$
where the $-3$ at Level 1 indicates a potential downward
trend for the price, because the total demand on the
bid side decreases.
4.2.3 Case 3: Full Fulfillment or Cancellation
This is similar to Case 2, but (as illustrated in Figure 5)
assumes that all orders at Level 1 in the ask
book ($A_1$) are transacted by an incoming buy order, or
that the order in $A_1$ is cancelled. In this case, we need
to consider the change on the ask side:
Figure 5: The LOB at time $t_k$: crossing the spread or a sell
order cancellation.
$A_1$: $p^1_a(t_k) > p^1_a(t_{k-1})$ (i.e. 98 > 95)
$\Rightarrow V^1(t_k) = -q^1_a(t_{k-1}) = -3$; $\mathrm{MLOFI}^1_k = -V^1(t_k) = 3$;
$A_2$: $p^2_a(t_k) > p^2_a(t_{k-1})$ (i.e. 100 > 98)
$\Rightarrow V^2(t_k) = -q^2_a(t_{k-1}) = -5$; $\mathrm{MLOFI}^2_k = -V^2(t_k) = 5$;
$A_3$: $p^3_a(t_k) > p^3_a(t_{k-1})$ (i.e. 105 > 100)
$\Rightarrow V^3(t_k) = -q^3_a(t_{k-1}) = -1$; $\mathrm{MLOFI}^3_k = -V^3(t_k) = 1$.
So, the quantity vector $\mathbf{v}_k$ shown in Equation 15
demonstrates that if the supply reduces or a buyer has
sufficient interest to transact, the price tends to go up.
$$\mathbf{v}_k = \begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix} \tag{15}$$
4.2.4 Case 4: New Order at Level-m of the LOB
Figure 6: The LOB at time $t_k$: crossing the spread or a sell
order cancellation.
Assuming now that a new large-sized order comes to
the level-2 bid, if we only consider order flow imbalance
in the top level of the LOB, we cannot detect this
new block order. This is the reason why we choose to
use MLOFI.
As there is no change in the level-1 bid,
$\mathrm{MLOFI}^1_k = 0$. Because a new order comes to the
second-level bid, $p^2_b(t_k) > p^2_b(t_{k-1})$ (i.e. 89 > 87) and
$\mathrm{MLOFI}^2_k = q^2_b(t_k) = 100$. Based on the same rule,
$\mathrm{MLOFI}^3_k = q^3_b(t_k) = 2$. So, the quantity vector $\mathbf{v}_k$ is:
$$\mathbf{v}_k = \begin{pmatrix} 0 \\ 100 \\ 2 \end{pmatrix} \tag{16}$$
If we only care about first-level order flow imbalance,
we get OFI = 0. However, if we consider the second and
third levels, we get $\mathrm{MLOFI}^2_k = 100$ and $\mathrm{MLOFI}^3_k = 2$,
which indicate a huge surplus on the demand side. If
a trader can obtain this information and take action
accordingly, it may result in larger profits or smaller
losses.
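The per-level rules of Equations 9, 10 and 12 can be sketched in Python and checked against Cases 2 to 4. The book representation (a dict of best-first `(price, quantity)` lists) and the function names are illustrative assumptions. The starting book is reconstructed from the prices and quantities quoted in the cases; the quantities resting at 82 and at 105 are not stated in the text and are assumed values here, which do not affect the results.

```python
def bid_flow(prev, curr):
    """W^m (Equation 9) for one level; prev and curr are (price, qty)."""
    (p0, q0), (p1, q1) = prev, curr
    if p1 > p0:
        return q1
    if p1 == p0:
        return q1 - q0
    return -q0


def ask_flow(prev, curr):
    """V^m (Equation 10) for one level; prev and curr are (price, qty)."""
    (p0, q0), (p1, q1) = prev, curr
    if p1 > p0:
        return -q0
    if p1 == p0:
        return q1 - q0
    return q1


def mlofi_event(prev_book, curr_book, levels=3):
    """Vector of e^m = W^m - V^m (Equation 12) for a single event."""
    return [
        bid_flow(prev_book["bids"][m], curr_book["bids"][m])
        - ask_flow(prev_book["asks"][m], curr_book["asks"][m])
        for m in range(levels)
    ]


# LOB at t_{k-1}; the quantities at 82 and 105 are assumed values.
start = {
    "bids": [(90, 5), (87, 2), (82, 4)],
    "asks": [(95, 3), (98, 5), (100, 1), (105, 6)],
}
case2 = dict(start, bids=[(90, 2), (87, 2), (82, 4)])
case3 = dict(start, asks=[(98, 5), (100, 1), (105, 6)])
case4 = dict(start, bids=[(90, 5), (89, 100), (87, 2), (82, 4)])

for book in (case2, case3, case4):
    print(mlofi_event(start, book))
# [-3, 0, 0]
# [3, 5, 1]
# [0, 100, 2]
```

This sketch assumes every requested level is populated before and after the event; a fuller implementation would need a convention for missing levels.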
5 ZZIAA: AA TRADERS WITH IMPACT
In this section we describe how ZZIAA is created,
by the addition of MLOFI-style imbalance-sensitivity
to the original AA trader strategy. Our intention for
ZZIAA was to develop an “impact-sensitive” module
that is not deeply embedded into the original AA so
that, if successful, this relatively independent mod-
ule could also easily be applied to other trading al-
gorithms. For this reason we chose the Widrow-Hoff
delta rule to update the quote of the ZZIAA towards
an impact-sensitive quote, as shown in Equation 17.
Here $p_{AA}(t+1)$ is derived from the long-term and short-term factors using the information at time $t$ (see Vytelingum et al., 2008), and $\tau(t)$ is the target price computed with consideration of MLOFI:
$$p_{IAA}(t+1) = p_{AA}(t+1) + \Delta(t) \tag{17}$$
where
$$\Delta(t) = \beta \left( \tau(t) - p_{AA}(t+1) \right) \tag{18}$$
and
$$\tau(t) = p_{\text{benchmark}}(t) + o_{\text{offset}}(t) \tag{19}$$
The core of the IAA derivation is how to find $\tau(t)$, which consists of two parts: the benchmark price $p_{\text{benchmark}}(t)$ and the offset $o_{\text{offset}}(t)$. The value of $p_{\text{benchmark}}(t)$ depends on whether the mid-price exists. As Equation 20 shows, if the mid-price is available, we set $p_{\text{benchmark}}(t)$ to the mid-price; if it is not, we set $p_{\text{benchmark}}(t)$ to $p_{AA}(t+1)$, which can be obtained at time $t$.
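$$p_{\text{benchmark}}(t) = \begin{cases} p_{\text{mid}}(t), & \text{if } \exists\, p_{\text{mid}} \\ p_{AA}(t+1), & \text{if } \nexists\, p_{\text{mid}} \end{cases} \tag{20}$$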
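Equations 17 to 20 amount to a single Widrow-Hoff step from the AA quote towards the impact-sensitive target. The following Python sketch illustrates this under stated assumptions: the function and argument names are hypothetical, the mid-price is taken to be the average of the best bid and best ask, and the default value of the learning rate `beta` is a placeholder chosen only for illustration.

```python
from typing import Optional


def benchmark_price(p_aa_next: float,
                    best_bid: Optional[float],
                    best_ask: Optional[float]) -> float:
    """Equation 20: use the mid-price if both sides of the LOB are quoted, else p_AA(t+1)."""
    if best_bid is not None and best_ask is not None:
        return (best_bid + best_ask) / 2.0
    return p_aa_next


def impact_sensitive_quote(p_aa_next: float,
                           offset: float,
                           best_bid: Optional[float] = None,
                           best_ask: Optional[float] = None,
                           beta: float = 0.5) -> float:
    """Equations 17-19: move the AA quote a fraction beta of the way towards tau(t)."""
    tau = benchmark_price(p_aa_next, best_bid, best_ask) + offset  # Equation 19
    delta = beta * (tau - p_aa_next)                               # Equation 18
    return p_aa_next + delta                                       # Equation 17


# AA proposes 100; the LOB shows 98 bid / 104 ask, and the MLOFI module supplies +5.
quote_with_mid = impact_sensitive_quote(100.0, 5.0, best_bid=98.0, best_ask=104.0)

# With an empty ask side there is no mid-price, so the benchmark falls back to the AA quote.
quote_without_mid = impact_sensitive_quote(100.0, 5.0, best_bid=98.0, best_ask=None)
```

Because the function only needs the host strategy's proposed quote as input, the same step can in principle wrap any other quote-generating strategy, which is the non-intrusive design intended above.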
The offset $o_{\text{offset}}(t)$ is derived from the MLOFI and the average depth. Assume that we consider $M$ levels of MLOFI in the LOB, as shown in Equation 21, and that the MLOFI at each level is accumulated over the last $N$ events, as shown in Equation 22.
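$$a(t) = \begin{pmatrix} \mathrm{MLOFI}_1(t) \\ \mathrm{MLOFI}_2(t) \\ \vdots \\ \mathrm{MLOFI}_M(t) \end{pmatrix} \tag{21}$$
where, for each level $m = 1, \dots, M$,
$$\mathrm{MLOFI}_m(t) = \sum_{n=1}^{N} e_n^m \tag{22}$$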
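We can define the average market depth for the $M$ levels in a similar way:
$$d(t) = \begin{pmatrix} \mathrm{AD}_1(t) \\ \mathrm{AD}_2(t) \\ \vdots \\ \mathrm{AD}_M(t) \end{pmatrix} \tag{23}$$
where, for each level $m = 1, \dots, M$:
$$\mathrm{AD}_m(t) = \frac{1}{N} \sum_{n=1}^{N} \frac{q_{a,n}^{m} + q_{b,n}^{m}}{2} \tag{24}$$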
Knowing the quantity vector $a(t)$, we need a mechanism to convert this vector to a scalar. Similar to Equation 7, we define the offset as in Equation 25.
$$o_{\text{offset}}(t) = \sum_{i=0}^{M-1} \alpha^{i} \, \frac{c \, \mathrm{MLOFI}_{i+1}(t)}{\mathrm{AD}_{i+1}(t)} \tag{25}$$
where $\alpha$ is the decay factor (initialized as 0.8) and $c$ is a constant (we use $c = 5$). Note: if $\mathrm{AD}_m(t) = 0$ for some level $m$, the term $\alpha^{m-1} \, c \, \mathrm{MLOFI}_m(t) / \mathrm{AD}_m(t)$ will not be counted.
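Equations 22, 24 and 25 can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not our experimental code: the function names are hypothetical, each level's event contributions and depth samples are assumed to have been collected already for the last N events, and the input numbers are invented.

```python
from typing import Sequence


def mlofi_level(event_contributions: Sequence[float]) -> float:
    """Equation 22: sum the contributions of the last N events at one level."""
    return sum(event_contributions)


def average_depth(ask_qtys: Sequence[float], bid_qtys: Sequence[float]) -> float:
    """Equation 24: mean over the last N events of (ask qty + bid qty) / 2 at one level."""
    n = len(ask_qtys)
    if n == 0:
        return 0.0
    return sum((qa + qb) / 2.0 for qa, qb in zip(ask_qtys, bid_qtys)) / n


def impact_offset(mlofi: Sequence[float], avg_depth: Sequence[float],
                  alpha: float = 0.8, c: float = 5.0) -> float:
    """Equation 25: decay-weighted sum of depth-normalised MLOFI across levels."""
    total = 0.0
    for i, (flow, depth) in enumerate(zip(mlofi, avg_depth)):
        if depth == 0:
            continue  # levels with zero average depth are not counted
        total += (alpha ** i) * c * flow / depth
    return total


# Hypothetical inputs: a large demand-side imbalance at level 2, empty depth at level 3.
offset = impact_offset(mlofi=[0, 100, 2], avg_depth=[10, 50, 0])
```

With these inputs only the level-2 term contributes (0.8 × 5 × 100 / 50), so the offset is positive and, via Equation 19, raises the target price $\tau(t)$; a supply-side imbalance would give a negative offset and lower it.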
To summarise, our work extends AA by the novel introduction of prior contributions to the econometrics of LOB imbalance from Cont et al. and from Xu et al., and differs from that prior work in the following ways:
• Cont et al. and Xu et al. run linear regressions to build their models and use statistical methods to test the significance of factors, so their constants such as $c$ come from modelling real-world data. In contrast, our version does not run a linear regression, and the constants $c$ and $\alpha$ are set on the basis of those previous studies (Cont et al., 2014; Xu et al., 2019). We can check the model’s performance by exploring different values of the constants.
• In the prior work, $\mathrm{MLOFI}_m(t)$ and $\mathrm{AD}_m(t)$ are influenced by the events within a specified time interval. In contrast, in our work, $\mathrm{MLOFI}_m(t)$ and $\mathrm{AD}_m(t)$ are calculated from the last $N$ events that occurred in the LOB, regardless of the length of the time interval between successive events.
6 RESULTS
Because our MLOFI-based “impact-sensitive” module added to AA was deliberately developed in a non-intrusive way, it can easily be added to other algorithms. In this section we first show results from ZZIAA and then we follow those with results from adding the MLOFI module to ISHV (giving ZZISHV), and to ZIP (giving ZZIZIP). Because of space limitations, the performance comparisons shown here focus on situations where the imbalance would cause a problem for the non-imbalance-sensitive versions of the trader agents, and we demonstrate that our extended trader agents are indeed superior in those situations. Extensive sets of further results are presented in (Zhang, 2020b), which demonstrate that the extended trader-agents perform the same as the unextended versions in situations where there is no imbalance to be concerned about in the LOB.
For each A:B comparison we ran 100 trials in
BSE (BSE, 2012), the same open-source simulator
of a financial exchange that was used by Church &
Cliff. Each trial involved creating a market where
there were N traders of type A (e.g., ZIP) and N
traders of type B (e.g., ZZIZIP) who were allocated
the role of buyers, and similarly N of type A and
N of type B who were allocated the role of sellers.
Thus one market trial involved a total of 4N trader-
agents: for the results presented here we used N = 10.
As is entirely commonplace in all such experimental
work, buyers were issued with assignments of cash,
and sellers with assignments of items to sell, and each
trader was given a private limit price: the price below
which a seller could not sell and above which a buyer
could not buy. The distribution of limit prices in the market determines that market’s supply and demand curves, and the intersection of those two curves indicates the competitive equilibrium price that microeconomic theory tells us to expect transaction prices to converge to.
Although very many of the previous trader-agent
papers that we have cited here have monitored the ef-
ficiency of the traders’ activity in the market, we in-
stead monitored profitability (which only differs from
efficiency by some constant coefficient). Each indi-
vidual market trial would allow the traders to interact
via the LOB-based exchange in BSE for a fixed pe-
riod of time, and at the end of the session the average
profit of the Type A traders would be recorded, along
with the average profit of the Type B traders. In the
results presented here we conducted 100 independent
and identically distributed market trials for each A:B
comparison, giving us 100 pairs of profitability figures. To summarise those results we plot as box-and-whisker charts the distribution of profitability values for traders of Type A, the distribution of profitability values for traders of Type B, and the distribution of profitability-difference values (i.e., for each trial i of the 100 trials, the difference between the profitability of Type A traders and the profitability of Type B traders in trial i). To determine whether the differences we observed were statistically significant, we used the Wilcoxon-Mann-Whitney U Test.
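The summary statistics described above can be sketched in plain Python as follows. This is an illustrative simplification and not the analysis code used for our results: the function names are hypothetical, the p-value uses the large-sample normal approximation without tie or continuity corrections, and the profit figures in the usage lines are invented placeholders, not experimental data. In practice a statistics library routine such as `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math
from typing import List, Sequence, Tuple


def profit_differences(profits_a: Sequence[float],
                       profits_b: Sequence[float]) -> List[float]:
    """Per-trial difference: Type A average profit minus Type B average profit."""
    return [a - b for a, b in zip(profits_a, profits_b)]


def mann_whitney_u(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for xs, two-sided p-value from the normal approximation)."""
    n1, n2 = len(xs), len(ys)
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0   # x beats y
            elif x == y:
                u += 0.5   # ties count as half
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction (simplification)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_two_sided


# Placeholder per-trial average profits for two trader types (not real results).
type_a = [79.0, 81.5, 78.2, 80.1]
type_b = [84.3, 86.0, 83.1, 85.7]
diffs = profit_differences(type_a, type_b)
u_stat, p_value = mann_whitney_u(type_a, type_b)
```

The three box-and-whisker plots correspond to `type_a`, `type_b` and `diffs`, each of which would hold 100 values in our experiments.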
6.1 ZZIAA
Figure 7 summarises the comparison between AA and ZZIAA. The U test comparing ZZIAA with AA gave p = 0.007, which means that the profit difference between ZZIAA and AA was statistically significant.
6.2 Comparison of ZZISHV and ISHV
We can see from Figure 8 that the profit generated by ZZISHV was much greater than that generated by ISHV. However, this only means that ZZISHV is better than ISHV under this particular market condition, and this might not be the case under other market conditions. In the test, the outperformance of ZZISHV can easily be explained: as a seller, when ISHV met favourable imbalances, it worked like SHVR and posted a price one penny lower than the current best ask; in contrast, under the same condition, ZZISHV chose to set its price p higher than the current best ask and to seek transaction opportunities some time later. For example, assume that the current best ask is 70: ISHV will post an order priced at 69. If ZZISHV gets an offset value of 20 from the “impact-sensitive” module, its quoted price will be 90.
Figure 7: Profit distributions from original AA tested against ZZIAA.
Figure 8: Performance of ZZISHV and ISHV when facing large-sized orders from the bid side.
The aim of both ISHV and ZZISHV is the same: to be sensitive to imbalances in the market. The former uses a function that maps from $\Delta m$ to $\Delta s$ to achieve this objective, where $\Delta m$ is generated from the mid- and micro-prices in the market. In contrast, the latter uses MLOFI to achieve the goal. The biggest difference between ISHV and ZZISHV is that ISHV can only be sensitive to imbalances at the top of the LOB, whereas the MLOFI mechanism helps ZZISHV to be sensitive to m-level imbalances on the LOB and thus detect them earlier than ISHV in some cases. The drawback comes in the determination of appropriate parameter values for both ISHV and ZZISHV, where trial-and-error is the best current option. In the map function of ISHV ($\Delta s = C\,\Delta p \pm M\,\Delta m\,\Delta p$ if the imbalance is significant), the parameters C and M were somewhat arbitrarily set by (Church and Cliff, 2019) to C=2 and M=1. For ZZISHV, when quantifying MLOFI, we use Equation 25, and the key parameter c and decay factor α are set by hand. We set m = 5 (consistent with the result from (Cont et al., 2014)) and α = 0.8. The optimal values of these parameters are not known; poor choices of these constants may cause agents to perform badly.
6.3 Comparison of ZZIZIP and ZIP
ZZIZIP is ZIP with the addition of the MLOFI module. In the example we present here, sellers face an excess imbalance from the demand side. The box plots in Figure 9 illustrate the results: ZZIZIP profitability had less variance than that of ZIP, and its median was slightly higher; in the second figure, we can see that although there were some outliers at both the top and bottom, and the bottom whisker extended below zero, the whole box lay above zero. Employing the U Test, we got p = 0.002 and can therefore conclude that the profit generated by ZZIZIP was statistically significantly greater than that generated by ZIP. Despite this, it is worth noting that the average difference in profitability is less than half of the difference between AA and ZZIAA, given that other conditions remain unchanged. So, our next question is: what causes the smaller difference in profits between ZZIZIP and ZIP?
Figure 9: Performance of ZIP and ZZIZIP when facing
large-sized orders from the bid side.
To answer this, we need to examine how ZIP works. ZIP uses the Widrow-Hoff delta rule to update its next quote-price towards its current target price. The current target price is based on the last quote price in the market, so the last quote price considerably affects the bidding behaviour of ZIP. In this test, on the ask side, the 10 ZIP sellers were not impact-sensitive and the 10 ZZIZIP sellers were. But, although the ZIP traders were not themselves impact-sensitive, they were affected by the quote prices coming from the ZZIZIP traders active in the same market, and so the ZIPs’ quote prices approached the ZZIZIPs’ to some extent. In other words, the adaptive mechanism within the non-impact-sensitive ZIP gave it a degree of impact-sensitivity, because it was influenced by the activities of the impact-sensitive traders in the market. In the test, if we treat ZZIZIP and ZIP as a group, the average profit generated is 84.82 (95% CI: [82.16, 87.48]). If we replace the 10 ZZIZIPs with 10 ZIPs (giving a total of 20 ZIP sellers), the average profit of ZIP is 79.21 (95% CI: [77.11, 81.31]). In the presence of ZZIZIP, all sellers tend to make more profit.
7 DISCUSSION & CONCLUSION
We know of no paper prior to (Church and Cliff,
2019) in which trader-agents are given a sensitivity
to quantity imbalances between the bid and ask sides
of the LOB. Such imbalances are often (but not al-
ways) caused by the arrival of one or more block
orders on one side of the LOB. In this paper we have provided a constructive critique of Church & Cliff’s method, pointing out the extreme fragility of imbalance-sensitivity metrics like theirs that monitor only the top of the LOB. We then explained the
OFI and MLOFI metrics of (Cont et al., 2014) and
(Xu et al., 2019) respectively, and demonstrated how
MLOFI could be integrated within Vytelingum’s AA
trading-agent strategy to give ZZIAA. We demon-
strated that ZZIAA performs extremely well: it per-
forms the same as AA when there is no imbalance,
and significantly outperforms AA in the presence of
major LOB imbalance. We then showed how the
imbalance-sensitivity mechanisms that we developed
for ZZIAA can readily be incorporated into other
trading-agent algorithms such as ZIP (Cliff, 1997)
and SHVR (Cliff, 2018). Results from ZZIZIP and
ZZISHV are similarly very good and further demon-
strate that the mechanisms developed here have given
robust imbalance-sensitivity to a range of trader-agent
strategies. In future work we intend to explore the
addition of MLOFI-based impact-sensitivity to con-
temporary adaptive trader-agents based on deep learn-
ing neural networks (le Calvez and Cliff, 2018; Wray
et al., 2020). Complete details of the work described
here are given in (Zhang, 2020b), and all of our relevant source-code for the system described here has been made freely available as open-source code on GitHub (Zhang, 2020a), enabling other researchers to examine, replicate, and extend our work.
REFERENCES
BSE (2012). Bristol Stock Exchange open-source fi-
nancial exchange simulator. GitHub repository:
https://github.com/davecliff/BristolStockExchange.
Cartea, A., Jaimungal, S., and Penalva, J. (2015). Algorith-
mic and High-Frequency Trading. Cambridge Univer-
sity Press.
Church, G. and Cliff, D. (2019). A simulator for studying automated block trading on a coupled dark/lit financial exchange with reputation tracking. Proceedings of the European Modelling and Simulation Symposium.
Cliff, D. (1997). Minimal-intelligence agents for bargain-
ing behaviors in market-based environments. Hewlett-
Packard Labs Technical Report HPL-97-91.
Cliff, D. (2018). An open-source limit-order-book ex-
change for teaching and research. In Proceedings of
the IEEE Symposium Series on Computational Intelli-
gence (SSCI-2018), pages 1853–1860.
Cont, R., Kukanov, A., and Stoikov, S. (2014). The price
impact of order book events. Journal of Financial
Econometrics, 12(1):47–88.
Das, R., Hanson, J., Tesauro, G., and Kephart, J. (2001). Agent-human interactions in the continuous double auction. Proceedings IJCAI-2001, pages 1169–1176.
De Luca, M. and Cliff, D. (2011a). Agent-human interac-
tions in the continuous double auction, redux. Pro-
ceedings ICAART-2011.
De Luca, M. and Cliff, D. (2011b). Human-agent auc-
tion interactions: Adaptive-aggressive agents domi-
nate. Proceedings IJCAI-2011.
De Luca, M., Szostek, C., Cartlidge, J., and Cliff, D. (2011).
Studies of interactions between human traders and al-
gorithmic trading systems. Driver Review 13, UK
Government Office for Science, Foresight Project on
the Future of Computer Trading in Financial Markets.
http://bit.ly/RoifIu.
Gjerstad, S. (2003). The impact of pace in double auc-
tion bargaining. Working paper, Department of Eco-
nomics, University of Arizona.
Gjerstad, S. and Dickhaut, J. (1997). Price formation in
continuous double auctions. Games and Economic
Behavior, 22(1):1–29.
Gode, D. K. and Sunder, S. (1993). Allocative efficiency
of markets with zero-intelligence traders. Journal of
Political Economy, 101(1):119–137.
le Calvez, A. and Cliff, D. (2018). Deep learning can
replicate adaptive traders in a limit-order-book finan-
cial market. In Proceedings of the IEEE Symposium
Series on Computational Intelligence (SSCI-2018),
pages 1876–1883.
London Stock Exchange Group (2019). Turquoise trading
service. Description Version 3.35.5.
Pentapalli, M. (2008). A comparative study of Roth-Erev
and Modified Roth-Erev reinforcement learning algo-
rithms for uniform-price double auctions. PhD thesis,
Iowa State University.
Petrescu, M. and Wedow, M. (2017). Dark pools in European equity markets: emergence, competition and implications. ECB Occasional Paper, 193.
Rust, J., Palmer, R., and Miller, J. H. (1992). Behaviour
of trading automata in a computerized double auction
market. In The Double Auction Market: Theories and
Evidence, pages 155–198. Addison Wesley.
Snashall, D. and Cliff, D. (2019). Adaptive-aggressive
traders don’t dominate. In van den Herik, J., Rocha,
A., and Steels, L., editors, Agents and Artificial Intel-
ligence: Selected Papers from ICAART 2019, pages
246–269. Springer.
Tesauro, G. and Bredin, J. L. (2002). Strategic sequential
bidding in auctions using dynamic programming. In
Proc. First Int. Joint Conf. on Autonomous Agents and
Multiagent Systems: part 2, pages 591–598.
Tesauro, G. and Das, R. (2001). High-performance bidding
agents for the continuous double auction. Proc. 3rd
ACM Conference on E-Commerce, pages 206–209.
Vytelingum, P., Cliff, D., and Jennings, N. (2008). Strategic bidding in continuous double auctions. Artificial Intelligence, 172(14):1700–1729.
Vytelingum, P. (2006). The structure and behaviour of the
Continuous Double Auction. PhD thesis, University
of Southampton.
Wray, A., Meades, M., and Cliff, D. (2020). Automated cre-
ation of a high-performing algorithmic trader via deep
learning on level-2 limit order book data. In Proceed-
ings of the IEEE Symposium Series on Computational
Intelligence (SSCI-2020).
Xu, K., Gould, M., and Howison, S. (2019). Multi-level
order-flow imbalance in a Limit Order Book. SSRN
3479741.
Zhang, Z. (2020a). GitHub repository: https://github.com/
davecliff/BristolStockExchange/tree/master/
ZhenZhang.
Zhang, Z. (2020b). An impact-sensitive adaptive algorithm
for trading on financial exchanges. Master’s thesis,
University of Bristol Dept. of Computer Science.
ICAART 2021 - 13th International Conference on Agents and Artificial Intelligence