OPTIMIZED DELIVERY OF ON-LINE ADVERTISEMENTS
A Linear Programming Approach to the Delivery of On-line Advertisements
Fabrizio Caruso¹ and Giovanni Giuffrida²
¹ Neodata, Catania, Italy
² Department of Social Sciences, University of Catania, Catania, Italy
Keywords:
Web advertisement, Linear programming, Data mining, Machine learning.
Abstract:
We find an optimal strategy for displaying advertisements in given locations at given times under some realistic
dynamic constraints. Our goal is to maximize the total profit produced by the impressions, which depends on
profit-generating events such as the impressions themselves and the ensuing clicks. We must take into account
the possibility that the constraints could change over time in a way that cannot always be foreseen.
1 INTRODUCTION
We want to find the optimal strategy for displaying ad-
vertisements in order to achieve different goals (max-
imum total profit, maximum visibility of the cam-
paign, etc.) under some realistic constraints. We need
to find the optimal number of “impressions” (display
of advertisements at a given time and at a given “lo-
cation”) under some realistic dynamic constraints that
both limit the possibility of certain creatives and limit
the number of impressions in certain locations and/or
moments in time. A location is a place where an ad-
vertisement can be displayed. This model can be gen-
eralized to target users by their categories, as well.
Similar optimization problems have been considered
in the literature (Vee et al., 2010; Alaei et al., 2009;
Abrams et al., 2007; Langheinrich et al., 1999; Abe
and Nakamura, 1999; Nakamura, 2002). Other ap-
proaches have also been considered by the authors of
this article (Caruso et al., 2011), where a Bayesian
model is used.
Our approach improves over the previous ones in
different respects:
• we consider a more realistic model (realistic constraints of different nature);
• we formally investigate the problem of consistency of the constraints;
• we consider an optimization of the problem by approximating it to a problem with fewer unknowns;
• we can apply machine learning techniques for guessing the traffic on locations.
We are given certain “creatives” (advertisements, e.g. banners, videos, etc.), “campaigns” (sets of related creatives), certain “locations” and a period of time (a set of “time frames”). For each creative of a given campaign we have an expected profit at a given time and location.
The profit of the web-page’s owner depends on the
profit-generating events that have been agreed upon
by the advertiser and the web-page’s owner. These
events can be the impression itself, a click on the ad-
vertisement or a registration of any sort (e.g. registra-
tion into the advertised site, purchase of the advertised
item, etc.), or any combination of these events. We
denote the expected profit of a single “impression” as
the “impression profit” (the expected profit of a sin-
gle impression obtained by all the profit-generating
events such as the impression itself, the ensuing click
and registrations of all types). In such a way we can
avoid keeping track of click-through rates and differ-
ent registration rates. This choice is a compromise
between performance and generality, since it makes
our model less precise and slightly less general: we
are not considering campaigns with separate budgets
for different events; we cannot estimate the expected
profit of an impression as precisely as when different
rates for different events are considered.
The number of impressions on a given location at
a given time is limited by the traffic (“supply”) of the
corresponding webpage. It also depends on time in
a way that can be only partially predicted. Moreover
the maximum profit for a given campaign (“demand”)
may be limited by a predefined budget.
Caruso F. and Giuffrida G.
OPTIMIZED DELIVERY OF ON-LINE ADVERTISEMENTS - A Linear Programming Approach to the Delivery of On-line Advertisements.
DOI: 10.5220/0003805504000405
In Proceedings of the 4th International Conference on Agents and Artificial Intelligence (ICAART-2012), pages 400-405
ISBN: 978-989-8425-95-9
Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)
Our goal is to maximize our expected revenue, which is given by the expected total price paid.
Therefore we wish to maximize a weighted sum
of all expected profits obtained in all locations in the
period of time under consideration.
Taking into account only supply and demand con-
straints makes our model a special instance of a
“transportation problem” for which very efficient so-
lutions exist (see (Dantzig, 1963)).
The complexity of the model brings up the addi-
tional problem of deciding between simplifying the
model and considering smaller problems.
In order to apply our optimization we need to
make a projection of the future supply and a projec-
tion of the impression profits onto our period. Im-
pressions are only possible on the locations and times
allowed by the scheduling. The projection of the im-
pression profits should also try to “guess” how the
profit of an impression changes in time. The projec-
tion algorithms should take into account different pe-
riodicities (e.g., daily, weekly). The projections can
be improved by applying machine learning techniques
to compute the weights of the periodicities.
Moreover we cannot assume the immutability of
the constraints of the problem in the period of time
under consideration. For this reason we have to con-
tinuously readjust to new conditions.
2 NOTATION
We denote by $C_i$ the $i$-th campaign (set of creatives) and by $B_{i,j}$ its $j$-th creative, by $L_l$ the $l$-th location, by $T_k$ the $k$-th time frame.
We denote the “impression count” by $x_{i,j,k,l}$, i.e., the number of impressions of $B_{i,j}$ at time frame $T_k$ and at location $L_l$. For example $x_{2,3,1,5} = 10$ means that banner $B_{2,3}$ is displayed ten times at time $T_1$ at location $L_5$.
We denote by $p_{i,j,k,l}$ the “impression profit”, the profit of $B_{i,j}$ at location $L_l$ and at time $T_k$.
2.1 Configurations
We can consider our problem as the problem of finding the optimal impression counts for the entries in a tridimensional matrix, i.e. the points in a discrete finite space given by a grid defined by couples (campaign, creative), time and location.
We refer to a single point $(i, j, k, l)$ in this tridimensional discrete finite space as an “impression-event” (or simply an “impression” when this is clear from the context). We call any choice of the values $x_{i,j,k,l}$ for all the impression-events a “configuration”. An impression is in fact characterized by a couple (campaign, creative), a location and a time. Our goal is to choose the optimal delivery of each possible impression, i.e. an optimal configuration. We will simply refer to the number of impressions of an impression-event as the “impression count”.
We refer to the points that are allowed by the schedule as “admissible points”. Each admissible point describes a dimension of our optimization problem. The worst case occurs when all points inside the cube whose sides are given by the number of couples (campaign, creative), the number of time frames and the number of locations are admissible. Hence in the worst case the number of dimensions is the product of the number of couples (campaign, creative), the number of locations and the number of time frames considered.
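As a minimal sketch of how a schedule determines the unknowns, the following Python fragment enumerates the admissible points of a toy instance and compares their number with the worst case. The function name `admissible_points`, the predicate `toy_schedule` and the toy data are hypothetical and only serve as an illustration.

```python
from itertools import product

def admissible_points(creatives, time_frames, locations, allowed):
    """Return the sorted list of admissible points (i, j, k, l).

    creatives   -- iterable of (i, j) pairs (campaign index, creative index)
    time_frames -- iterable of time-frame indices k
    locations   -- iterable of location indices l
    allowed     -- predicate allowed(i, j, k, l) encoding the schedule (assumed)
    """
    return sorted(
        (i, j, k, l)
        for (i, j), k, l in product(creatives, time_frames, locations)
        if allowed(i, j, k, l)
    )

creatives = [(1, 1), (1, 2), (2, 1)]
time_frames = [1, 2]
locations = [1, 2]

# Toy schedule: campaign 1 may run anywhere at any time,
# campaign 2 only at location 2 during time frame 1.
def toy_schedule(i, j, k, l):
    return i == 1 or (k == 1 and l == 2)

points = admissible_points(creatives, time_frames, locations, toy_schedule)
worst_case = len(creatives) * len(time_frames) * len(locations)
print(len(points), "unknowns out of a worst case of", worst_case)
```

In this toy instance the schedule removes three of the twelve points of the full grid, so the optimization problem has nine unknowns.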
Remark 1. In practice we need to translate the number of impressions $x_{i,j,k,l}$ into a probability of delivery. We transform a configuration into a map
\[ (i, j, k, l) \mapsto \mathrm{Prob}_{k,l}(i, j) \]
where $\mathrm{Prob}_{k,l}(i, j)$ is the computed probability of delivery of $B_{i,j}$ at location $L_l$ and time $T_k$, obtained as the ratio $x_{i,j,k,l} / S_{l,k}$ between $x_{i,j,k,l}$ and the expected supply $S_{l,k}$ at $L_l$ and time $T_k$.
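The translation in Remark 1 can be sketched in a few lines of Python. The dictionary layout and the function name `delivery_probabilities` are assumptions made for the example; the treatment of slots with zero expected supply is also an assumption, since the remark does not cover that case.

```python
def delivery_probabilities(x, supply):
    """Map each point (i, j, k, l) to x[i, j, k, l] / S[l, k].

    x      -- dict {(i, j, k, l): impression count}
    supply -- dict {(l, k): expected supply S_{l,k}}
    Points whose slot has zero expected supply get probability 0.0 (assumption).
    """
    probs = {}
    for (i, j, k, l), count in x.items():
        s = supply[(l, k)]
        probs[(i, j, k, l)] = count / s if s > 0 else 0.0
    return probs

# Toy configuration: two creatives share location 1 during time frame 1.
x = {(1, 1, 1, 1): 30.0, (2, 1, 1, 1): 50.0}
supply = {(1, 1): 100.0}
probs = delivery_probabilities(x, supply)
print(probs)
```

The probabilities of one slot sum to at most 1 whenever the supply constraint holds; the remainder is the share of traffic left unfilled.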
Pictorially, we could see a single configuration $C = (x_{i,j,k,l})_{i,j,k,l}$ as a tridimensional matrix:
[Figure: tridimensional grid with axes (campaign, creative), time frame and location; the entry at the point with coordinates $(i, j)$, $k$, $l$ is $x_{i,j,k,l}$.]
3 REALISTIC MODEL
We want to consider a realistic model in which several
constraints of different nature are taken into account.
3.1 The Constraints
We distinguish between the primary (physical) con-
straints of the problem, the secondary ones (commer-
cial and optional) and the learning constraints (re-
quired by the learning phase if it is included in the
mathematical model).
3.2 Primary Constraints
The primary constraints are given by the schedule of
the campaigns, by the limited supply of impressions
and by a (possibly) limited demand (budget):
1. The scheduling of the campaigns limits the admis-
sible points: certain creatives B
i, j
are only pos-
sible at certain times and locations. Typically a
campaign begins and ends at certain times and its
creatives are limited to certain locations, hours of
the day, days of the week, etc.
2. Any location at a given time receives a limited
supply of impressions, which solely depends on
the traffic of its page;
3. For any given campaign a given total profit may
not be exceeded (“demand”) because only a finite
campaign budget can be available.
Remark 2. Campaigns can have an unbounded bud-
get, e.g. one that only pays for an actual purchase.
3.3 Secondary Constraints
The secondary constraints may be of a commercial
nature. They could be enforced in real time while
monitoring the delivery, although having them as con-
straints improves the accuracy of the model. They are
necessary to increase the visibility of a certain cam-
paign/creative:
1. Any given creative/campaign should not last less
than a given period, e.g. the period in which the
campaign is scheduled. We enforce this by set-
ting a minimum for the number of impressions for
each possible time frame.
2. We would like to avoid having only one creative at
a given location and time frame when more than
one choice is available.
3.4 Learning Constraints
We can embed some learning constraints into the
mathematical model. One way to achieve this can
be a constraint of the form: for each new couple
(creative, location) we must have a minimum num-
ber of impressions in all (or some initial) possible
time frames. This is not strictly necessary because
the same goal can be achieved by using a portion of the
traffic for learning. Having these constraints inside
the model produces a more accurate model.
4 LINEAR PROGRAMMING
Under the simplifying hypothesis that the impression profit $p_{i,j,k,l}$ is constant with respect to its count $x_{i,j,k,l}$ we can assume that our constraints are linear. This assumption is not true in general because there is no linear dependence between the total profit generated by an impression-event and its impression count, i.e. displaying the same advertisement $x$ times on the same location, possibly more than once to the same user, does not necessarily produce $x$ times the profit produced by one single display.
Since we are ultimately interested in the probability of delivery and since integer linear programming is computationally infeasible in general (NP-hard), a possible approach to this problem is real linear programming: we approximate our discrete problem with a continuous one and we do not mind considering a real number of impressions.
4.1 Formalized Constraints
The points that do not contradict the first primary con-
straint will be the unknowns of our model.
4.1.1 Primary Constraints
We do not include the first primary constraints for the
reasons given above and assume that in our expres-
sions all indices run over points that do not contradict
the first primary constraints.
Supply and demand are formalized as follows:
Second primary constraint:
\[ \forall l, k: \quad \sum_{i,j} x_{i,j,k,l} \le S_{l,k} \quad \text{(supply)} \tag{1} \]
where $S_{l,k}$ is the supply at location $L_l$ and at time $T_k$.
Third primary constraint:
\[ \forall i: \quad \sum_{j,l,k} p_{i,j,k,l} \, x_{i,j,k,l} \le D_i \quad \text{(demand)} \tag{2} \]
where $D_i$ is the budget of the $i$-th campaign.
Remark 3. If only the primary constraints are taken into account, we have a “Hitchcock-style transportation problem” (Hitchcock, 1941). For such problems very efficient algorithms are known, such as the “stepping stone algorithm” (Dantzig, 1963).
4.1.2 Secondary Constraints
The secondary constraints are formalized as follows:
First secondary constraint:
\[ \forall i, k: \quad \sum_{j,l} x_{i,j,k,l} \ge \mu_{i,k} \quad \text{(duration)} \tag{3} \]
where $\mu_{i,k}$ is the desired minimum delivery of impressions of the $i$-th campaign at time $T_k$.
Second secondary constraint:
\[ \forall (l, k) \in D, \ \forall i, j: \quad x_{i,j,k,l} \le P_{l,k} \cdot S_{l,k} \quad \text{(no overflow)} \tag{4} \]
where $P_{l,k} \in [0, 1]$ (usually close to 1) defines how much a single creative can occupy a location at a given time frame and where $D$ is the set of indices corresponding to locations and time frames where at least 2 different creatives are possible.
Remark 4. The second secondary constraints (4) should be imposed only in those cases in which, at a given location and time, more than one pair of campaign and creative is possible, because otherwise the constraint would prevent the location from being filled with impressions even when this could be possible.
4.1.3 Learning Constraints
If the learning phase is included in the model some constraints should force a minimum delivery for the new creatives and new locations:
\[ \forall \text{ new } j, l, \ \forall i, k: \quad x_{i,j,k,l} \ge \lambda_{i,j,k,l}. \tag{5} \]
We are also implicitly assuming that the unknowns are non-negative, i.e.
\[ \forall i, j, k, l: \quad x_{i,j,k,l} \ge 0. \tag{6} \]
4.2 The Objective Function
We want to maximize our expected revenue, which is given by the sum of all expected profits received in a given configuration $C$:
\[ F(C) = \sum_{i,j,k,l} p_{i,j,k,l} \, x_{i,j,k,l}, \tag{7} \]
where $p_{i,j,k,l} \, x_{i,j,k,l}$ is the expected profit generated by $B_{i,j}$ at location $L_l$ and at time $T_k$.
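To make the objective (7) and the primary constraints (1), (2) and (6) concrete, the following Python sketch evaluates a given configuration on toy data. It only checks a configuration and does not solve the linear program. The names `objective` and `violated_constraints`, the dictionary layout and the numerical tolerance are assumptions made for the example.

```python
from collections import defaultdict

def objective(x, p):
    """F(C): sum of p[i, j, k, l] * x[i, j, k, l] over all points, as in (7)."""
    return sum(p[pt] * count for pt, count in x.items())

def violated_constraints(x, p, supply, budget, tol=1e-9):
    """Return a list of human-readable violations of (1), (2) and (6)."""
    violations = []
    per_slot = defaultdict(float)      # left-hand side of (1), keyed by (l, k)
    per_campaign = defaultdict(float)  # left-hand side of (2), keyed by i
    for (i, j, k, l), count in x.items():
        if count < -tol:
            violations.append(f"(6) negative count at {(i, j, k, l)}")
        per_slot[(l, k)] += count
        per_campaign[i] += p[(i, j, k, l)] * count
    for slot, used in per_slot.items():
        if used > supply[slot] + tol:
            violations.append(f"(1) supply exceeded at {slot}")
    for i, spent in per_campaign.items():
        # A campaign missing from `budget` is treated as having an
        # unbounded budget, in the spirit of Remark 2.
        if i in budget and spent > budget[i] + tol:
            violations.append(f"(2) budget exceeded for campaign {i}")
    return violations

# Toy data (hypothetical): two campaigns competing for one slot.
p = {(1, 1, 1, 1): 0.02, (2, 1, 1, 1): 0.01}
supply = {(1, 1): 100.0}   # keyed by (l, k)
budget = {1: 1.0}          # campaign 2 has no entry: unbounded budget
good = {(1, 1, 1, 1): 40.0, (2, 1, 1, 1): 60.0}
bad = {(1, 1, 1, 1): 80.0, (2, 1, 1, 1): 60.0}
print(objective(good, p), violated_constraints(good, p, supply, budget))
print(violated_constraints(bad, p, supply, budget))
```

The same coefficients (profits as objective weights, one row per slot for (1) and one row per campaign for (2)) are what would be handed to a linear programming solver.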
5 EXISTENCE OF A SOLUTION
We see that there is no guarantee of consistency once the secondary and learning constraints are introduced, even if we exclude the first primary constraints. We need to solve the system of inequalities in order to know if there is a solution. Nevertheless we can use some heuristic method to avoid some inconsistent problems. In particular we can find some conditions on $\mu_{i,k}$ in (3) and $\lambda_{i,j,k,l}$ in (5) under which the problem cannot have a solution.
Given a set $T$ of $t$-tuples, we introduce the following notation for sets of $(t-1)$-tuples:
\[ T[i \to \alpha] = \{ (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_t) \mid (x_1, \dots, x_{i-1}, \alpha, x_{i+1}, \dots, x_t) \in T \}; \]
\[ T[i \to \ast] = \{ (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_t) \mid \exists v \mid (x_1, \dots, x_{i-1}, v, x_{i+1}, \dots, x_t) \in T \}, \]
i.e. we are considering respectively
• $(t-1)$-tuples obtained from $t$-tuples in $T$ whose $i$-th component is $\alpha$, with that component removed;
• $(t-1)$-tuples obtained from $t$-tuples in $T$ where the $i$-th component has been removed independently of its value.
In the same way, if more components are removed in parallel, we introduce the notation $T[i_1 \to \alpha_1, \dots, i_n \to \alpha_n]$ for $(t-n)$-tuples, where $\alpha_j \in \mathbb{N} \cup \{\ast\}$ and $i_j \in \mathbb{N}$ for $j \in \{1, \dots, n\}$. Here we will consider the set $T$ of all indices $(i, j, k, l)$ that are allowed by the first primary constraints.
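Examples.
\[ T[1 \to i, 2 \to \ast, 4 \to \ast] = \{ k \mid \exists j, l \mid (i, j, k, l) \in T \} \]
\[ T[1 \to i] = \{ (j, k, l) \mid (i, j, k, l) \in T \} \]
\[ T[3 \to k, 4 \to l] = \{ (i, j) \mid (i, j, k, l) \in T \} \]
\[ T[2 \to \ast, 3 \to \ast, 4 \to \ast] = \{ i \mid \exists j, k, l \mid (i, j, k, l) \in T \} \]
\[ T[1 \to \ast, 2 \to \ast] = \{ (k, l) \mid \exists i, j \mid (i, j, k, l) \in T \} \]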
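The projection notation above can be mirrored by a small Python helper, sketched here under the assumption that $T$ is stored as a set of tuples. The name `project` and the use of the string `"*"` for the wildcard are choices made for this example only.

```python
STAR = "*"  # stands for the wildcard in T[i -> *]

def project(T, spec):
    """Compute T[i_1 -> a_1, ..., i_n -> a_n].

    T    -- set of equal-length tuples
    spec -- dict {1-based component index: required value, or STAR for any value}
    Components listed in spec are removed from the result; a component mapped
    to a value must equal that value for the tuple to be kept.
    A result of length one is returned as a bare value, as in the examples.
    """
    result = set()
    for tup in T:
        if all(a == STAR or tup[i - 1] == a for i, a in spec.items()):
            rest = tuple(v for pos, v in enumerate(tup, start=1) if pos not in spec)
            result.add(rest[0] if len(rest) == 1 else rest)
    return result

# Toy set of admissible points (i, j, k, l).
T = {(1, 1, 1, 1), (1, 1, 2, 1), (1, 2, 1, 2), (2, 1, 1, 2)}

frames_of_campaign_1 = project(T, {1: 1, 2: STAR, 4: STAR})  # T[1->1, 2->*, 4->*]
campaigns = project(T, {2: STAR, 3: STAR, 4: STAR})          # T[2->*, 3->*, 4->*]
creatives_in_slot = project(T, {3: 1, 4: 2})                 # T[3->1, 4->2]
slots = project(T, {1: STAR, 2: STAR})                       # T[1->*, 2->*]
print(frames_of_campaign_1, campaigns, creatives_in_slot, slots)
```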
Fact 1. Let us assume $T[1 \to i, 2 \to \ast, 4 \to \ast] \neq \emptyset$ for all $i \in T[2 \to \ast, 3 \to \ast, 4 \to \ast]$ (i.e. for all possible campaigns $C_i$).
If we choose
\[ \mu_{i,k} > \frac{D_i}{|T[1 \to i, 2 \to \ast, 4 \to \ast]| \, m} \]
where $m = \min p_{i,j,k,l}$, then the semi-algebraic set defined by the inequalities (2), (3) is empty.
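Proof. By the first secondary constraint (3) we have
\[ \sum_{j,l} x_{i,j,k,l} > \frac{D_i}{|T[1 \to i, 2 \to \ast, 4 \to \ast]| \, m}. \]
Therefore for any campaign $C_i$ we have
\[ \sum_{j,k,l} p_{i,j,k,l} \, x_{i,j,k,l} \ge m \sum_{j,k,l} x_{i,j,k,l} \ge m \sum_{k \in T[1 \to i, 2 \to \ast, 4 \to \ast]} \sum_{j,l} x_{i,j,k,l} > \sum_{k \in T[1 \to i, 2 \to \ast, 4 \to \ast]} \frac{D_i}{|T[1 \to i, 2 \to \ast, 4 \to \ast]|} = D_i, \]
which contradicts the demand constraint (2).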
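The bound of Fact 1 can be used as a quick pre-check before running a solver. The Python sketch below computes the threshold for one campaign and tests whether every $\mu_{i,k}$ of that campaign exceeds it. It assumes that $m$ is the minimum profit over all admissible points and that $m > 0$; the function names and the toy data are hypothetical.

```python
def duration_threshold(T, p, budget, i):
    """Return D_i / (|T[1->i, 2->*, 4->*]| * m) from Fact 1.

    T      -- set of admissible points (i, j, k, l)
    p      -- dict of impression profits keyed by admissible point
    budget -- dict {i: D_i}
    m is taken as the minimum profit over all admissible points and is
    assumed to be strictly positive.
    """
    frames = {k for (ci, _, k, _) in T if ci == i}  # T[1->i, 2->*, 4->*]
    m = min(p[pt] for pt in T)
    if not frames or m <= 0:
        raise ValueError("Fact 1 needs a non-empty set of time frames and m > 0")
    return budget[i] / (len(frames) * m)

def surely_inconsistent(T, p, budget, mu, i):
    """True if every mu[i, k] exceeds the threshold, so (2) and (3) clash."""
    bound = duration_threshold(T, p, budget, i)
    frames = {k for (ci, _, k, _) in T if ci == i}
    return all(mu[(i, k)] > bound for k in frames)

# Toy instance: one campaign, one creative, one location, two time frames.
T = {(1, 1, 1, 1), (1, 1, 2, 1)}
p = {(1, 1, 1, 1): 0.01, (1, 1, 2, 1): 0.02}
budget = {1: 1.0}
too_high = {(1, 1): 60.0, (1, 2): 60.0}
mixed = {(1, 1): 60.0, (1, 2): 40.0}
print(duration_threshold(T, p, budget, 1))
print(surely_inconsistent(T, p, budget, too_high, 1))
print(surely_inconsistent(T, p, budget, mixed, 1))
```

A `False` answer does not certify consistency: it only means that this particular sufficient condition for inconsistency is not met.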
Fact 2. Let us assume $T[1 \to i], T[3 \to k, 4 \to l] \neq \emptyset$ for all $i \in T[2 \to \ast, 3 \to \ast, 4 \to \ast]$ (i.e. all possible campaigns $C_i$) and $(k, l) \in T[1 \to \ast, 2 \to \ast]$.
If all banners and locations under consideration are new and we choose
\[ \lambda_{i,j,k,l} > \min \left\{ \frac{D_i}{|T[1 \to i]| \, m}, \ \frac{S_{l,k}}{|T[3 \to k, 4 \to l]|} \right\} \]
where $m = \min p_{i,j,k,l}$, then the semi-algebraic set defined by the inequalities (1), (2), (5) is empty.
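Proof. If $\lambda_{i,j,k,l} > \frac{D_i}{|T[1 \to i]| \, m}$ then by (5) we have
\[ x_{i,j,k,l} > \frac{D_i}{|T[1 \to i]| \, m} \]
from which it follows that for any campaign $C_i$:
\[ \sum_{j,k,l} p_{i,j,k,l} \, x_{i,j,k,l} \ge m \sum_{j,k,l} x_{i,j,k,l} > m \sum_{j,k,l} \frac{D_i}{|T[1 \to i]| \, m} = D_i, \]
which contradicts the demand constraint (2).
If $\lambda_{i,j,k,l} > \frac{S_{l,k}}{|T[3 \to k, 4 \to l]|}$ then by (5) we have
\[ x_{i,j,k,l} > \frac{S_{l,k}}{|T[3 \to k, 4 \to l]|} \]
from which it follows that for any couple $(k, l) \in T[1 \to \ast, 2 \to \ast]$ we have
\[ \sum_{i,j} x_{i,j,k,l} > \sum_{i,j} \frac{S_{l,k}}{|T[3 \to k, 4 \to l]|} = S_{l,k}, \]
which contradicts the supply constraint (1).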
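The two cases of the proof of Fact 2 can be checked separately, as in the following Python sketch. This is an illustration under assumptions rather than a faithful transcription of the statement: it reports an inconsistency only when all the $\lambda_{i,j,k,l}$ of one campaign exceed the budget bound, or all the $\lambda_{i,j,k,l}$ of one slot exceed the supply bound. The function name and the toy data are hypothetical, and $m$ is assumed to be strictly positive.

```python
def learning_minimums_infeasible(T, p, budget, supply, lam):
    """Check the two sufficient conditions used in the proof of Fact 2.

    T      -- set of admissible points (i, j, k, l), all assumed to be new
    p      -- dict of impression profits keyed by point (minimum assumed > 0)
    budget -- dict {i: D_i}
    supply -- dict {(l, k): S_{l,k}}
    lam    -- dict of learning minimums keyed by point
    """
    m = min(p[pt] for pt in T)
    # Budget case: every point of campaign i demands more than D_i / (|T[1->i]| m).
    for i in {pt[0] for pt in T}:
        pts = [pt for pt in T if pt[0] == i]
        if all(lam[pt] > budget[i] / (len(pts) * m) for pt in pts):
            return True
    # Supply case: every point of slot (k, l) demands more than S / |T[3->k, 4->l]|.
    for (k, l) in {(pt[2], pt[3]) for pt in T}:
        pts = [pt for pt in T if (pt[2], pt[3]) == (k, l)]
        if all(lam[pt] > supply[(l, k)] / len(pts) for pt in pts):
            return True
    return False

# Toy instance: two campaigns with one creative each, sharing a single slot.
T = {(1, 1, 1, 1), (2, 1, 1, 1)}
p = {pt: 0.01 for pt in T}
budget = {1: 10.0, 2: 10.0}
supply = {(1, 1): 100.0}
greedy_lam = {(1, 1, 1, 1): 60.0, (2, 1, 1, 1): 60.0}    # 60 + 60 > 100
modest_lam = {(1, 1, 1, 1): 60.0, (2, 1, 1, 1): 30.0}
print(learning_minimums_infeasible(T, p, budget, supply, greedy_lam))
print(learning_minimums_infeasible(T, p, budget, supply, modest_lam))
```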
6 FORECASTING DATA
In order to apply our optimization algorithms we need
to have at least a projection of the supply and a pro-
jection of the expected profit of all impressions al-
lowed by the first primary constraint. The supply
and impression profits can be estimated by taking a
proper weighted average from the historical data. The
projection should take into account different factors:
episodic factors and possibly different periodicities.
6.1 Projecting the Profit
The profit of an impression-event may depend on the
periodicity of its campaign and of its location. Since
the “impression profit” changes slowly in time, it can
be predicted better than the supply. If we want to de-
termine the expected profit for an impression-event
we can take some average profit from historical data
on “similar” events. Our strategy is to use the most
accurate and recent available information.
6.2 Projecting the Traffic
A model for the projection of the supply should take into consideration the periodicity of the location, i.e. some sites are more often visited in particular periods of the year, day, hours, etc. More periodicities may concur, e.g. a site may be visited more often on a specific day of the week and at a specific hour of the day. Regression analysis through machine-learning techniques such as support vector machines can be a viable approach for the problem of properly choosing the weights of the average of the different “features”.
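As a rough illustration of a weighted average over two periodicities, the sketch below forecasts the supply of a future hour from the mean of the same hour on previous days and the mean of the same hour on the same weekday of previous weeks. The function `project_supply`, the hourly data layout and the hand-picked weights are assumptions made for the example; in the approach described above the weights would be learned, e.g. by regression.

```python
def project_supply(history, t, weights=(0.5, 0.5)):
    """Forecast supply at hour index t as a weighted average of two periodic means.

    history -- dict {hour index: observed supply}, hour 0 being the first logged hour
    t       -- hour index to forecast (later than the observations used)
    weights -- (daily weight, weekly weight); fixed by hand here (assumption)
    """
    w_day, w_week = weights
    same_hour = [v for h, v in history.items() if h < t and (t - h) % 24 == 0]
    same_slot = [v for h, v in history.items() if h < t and (t - h) % 168 == 0]
    if not same_hour:
        raise ValueError("no past observation at the same hour of the day")
    daily = sum(same_hour) / len(same_hour)
    # Fall back to the daily mean when less than one week of history is available.
    weekly = sum(same_slot) / len(same_slot) if same_slot else daily
    return (w_day * daily + w_week * weekly) / (w_day + w_week)

# Toy history: 14 days of traffic at 9:00, lower on days 5 and 6 of each week.
history = {24 * d + 9: (40.0 if d % 7 in (5, 6) else 100.0) for d in range(14)}
weekday_forecast = project_supply(history, 24 * 14 + 9)  # day 14: a "weekday"
weekend_forecast = project_supply(history, 24 * 19 + 9)  # day 19: a "weekend" day
print(weekday_forecast, weekend_forecast)
```

The weekly term pulls the forecast towards the level observed on the same weekday, which the daily mean alone would smooth away.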
7 FURTHER IMPROVEMENTS
This approach can be improved in its accuracy by tar-
geting the users, and in its speed by reducing the di-
mensions and constraints in the model.
7.1 Targeting Users
The approach we have presented optimizes the deliv-
ery of advertisements in both space (locations) and
time (time frames). The very same algorithms and
code can be used to take users’ profiles into ac-
count by encapsulating the profile information into
the information on the location by storing a pair
(location, profile) into a single “extended location”.
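A minimal Python sketch of the “extended location” idea follows. Pairing locations with profiles comes directly from the text; splitting the expected supply of a location across profiles by fixed traffic shares is an additional assumption made only to obtain a complete example, and all names are hypothetical.

```python
from itertools import product

def extend_locations(locations, profiles):
    """Pair every location with every user profile to form extended locations."""
    return list(product(locations, profiles))

def split_supply(supply, profile_share):
    """Split S_{l,k} across profiles (assumed fixed shares summing to 1).

    supply        -- dict {(l, k): expected supply}
    profile_share -- dict {profile: fraction of the traffic}
    Returns a dict {((l, profile), k): share of the supply}.
    """
    return {
        ((l, prof), k): s * share
        for (l, k), s in supply.items()
        for prof, share in profile_share.items()
    }

extended = extend_locations(["home"], ["sports", "news"])
extended_supply = split_supply({("home", 1): 100.0}, {"sports": 0.25, "news": 0.75})
print(extended)
print(extended_supply)
```

Since an extended location is used exactly like a location, the constraints and the objective function keep the same form, with the index $l$ ranging over the pairs.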
7.2 Simplifying Things
The large number of unknowns and constraints in this general approach can pose a serious problem to its computational feasibility. We can reduce the dimensions and constraints by clustering similar attributes (Abe and Nakamura, 1999) or by simplifying our model:
• We restrict our problem to periods of time in which the time constraints do not change. This greatly reduces the number of unknowns but could produce suboptimal solutions.
• We avoid secondary and learning constraints and enforce them during the delivery.
• We use a time horizon, beyond which all the time frames are considered as a single time frame.
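The time-horizon simplification in the last item can be sketched as follows. The function name is hypothetical, and adding up the supplies of the merged time frames is an assumption about how the merged frame would be treated.

```python
def collapse_beyond_horizon(supply, horizon):
    """Merge all time frames k > horizon into the single frame horizon + 1.

    supply -- dict {(l, k): expected supply}, with integer time-frame indices k
    The supplies of the merged frames are added up (assumption).
    """
    merged = {}
    for (l, k), s in supply.items():
        key = (l, min(k, horizon + 1))
        merged[key] = merged.get(key, 0.0) + s
    return merged

supply = {(1, 1): 10.0, (1, 2): 20.0, (1, 3): 30.0, (1, 4): 40.0}
collapsed = collapse_beyond_horizon(supply, horizon=2)
print(collapsed)
```

With a horizon of 2 the four time frames of the toy example shrink to three, and the number of unknowns per creative and location shrinks accordingly.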
8 RESULTS ON REAL DATA
We have implemented an ad-server optimizer in Java. For solving the linear programming model we use glpk¹. Optionally we use the freely available support vector machines library² to project future supply.
¹ http://www.gnu.org/s/glpk/
Our implementation requires as input: historical
data necessary for projecting the impression prof-
its and the future supply, campaign data (budgets),
scheduling data.
Our prototype has been used on real data available at Neodata and has been compared against the results of the currently used optimizer, which works as follows: if a campaign is achieving its target at the current rate, nothing is done; otherwise, it is stopped in its less profitable locations.
We used logs and schedules of two clients of Neodata, which we call A and B. We remark that the traffic managed by Neodata neither accounts for the total traffic nor is a constant percentage of the traffic generated by the sites under consideration. A was optimized equally well by the current optimizer and our prototype, whereas B was optimized better by our code by a large margin (20% - 50%). In the following table we show the result of one of our experiments on the data of April 30th 2010 for company B:
hour          real profit   opt. profit     gain   % gain
8:00 a.m.            2.50          4.41     1.91      76%
9:00 a.m.            3.96          8.07     4.11     104%
10:00 a.m.           6.69         12.97     6.28      94%
11:00 a.m.          14.17         23.32     9.15      65%
12:00 p.m.          14.98         24.66     9.68      65%
1:00 p.m.           15.00         14.01    -0.99      -7%
2:00 p.m.           19.43         31.81    12.38      64%
3:00 p.m.           26.07         41.14    15.07      58%
4:00 p.m.           23.38         24.37     0.99       4%
5:00 p.m.           13.98         14.40     0.42       3%
6:00 p.m.           12.64         28.74    16.10     127%
7:00 p.m.           15.90         28.38    12.48      78%
8:00 p.m.           10.55         10.89     0.34       3%
total              179.25        267.17    87.92      49%
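The gain columns and the totals of the table can be recomputed from the two profit columns, as in the following Python sketch. Amounts are stored as integers in hundredths of the (unstated) currency unit to avoid floating-point rounding; this representation and the helper name `gain_row` are choices made for the example.

```python
# (real profit, optimized profit) per hour, in table order, in hundredths.
rows = [
    (250, 441), (396, 807), (669, 1297), (1417, 2332), (1498, 2466),
    (1500, 1401), (1943, 3181), (2607, 4114), (2338, 2437), (1398, 1440),
    (1264, 2874), (1590, 2838), (1055, 1089),
]

def gain_row(real, opt):
    """Return (absolute gain, percentage gain rounded to the nearest integer)."""
    gain = opt - real
    return gain, round(100 * gain / real)

percentages = [gain_row(real, opt)[1] for real, opt in rows]
total_real = sum(real for real, _ in rows)
total_opt = sum(opt for _, opt in rows)
total_gain, total_pct = gain_row(total_real, total_opt)
print(percentages)
print(total_real / 100, total_opt / 100, total_gain / 100, total_pct)
```

The overall gain of 49% is the ratio of the total gain to the total real profit, not the average of the hourly percentages.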
Possible reasons why the results on A are not improved by our prototype may be: there is no room for further improvement; the data on the supply cannot be used for the projection because it does not correspond to a constant percentage of the real traffic.
The data was used as follows: the initial portion of the data (e.g. the first 20 days) was used for training the system, i.e. projecting the supply (traffic) and the profits. The remaining part of the month was used as a schedule and was optimized.
9 CONCLUSIONS
We have shown how a linear programming approach can be used to optimize the delivery of on-line banners. Our approach takes many different constraints into account (schedule, supply, demand, visibility, learning). We have given conditions under which the corresponding system of inequalities is inconsistent.
² http://www.csie.ntu.edu.tw/~cjlin/libsvm
This approach can be used to target users by sim-
ply extending the concept of location. We have also
tackled the problem of dimensionality (by a time hori-
zon, by simplifying the constraints, etc.).
Our prototype has been tested on real data. We
have shown that it optimizes the delivery of on-line
advertisements better than a greedy algorithm.
There are still some open issues: how to project
the traffic when the conditions of the problem change
quickly and the data does not correspond to a constant
percentage of the traffic.
REFERENCES
Abe, N. and Nakamura, A. (1999). Learning to Optimally
Schedule Internet Banner Advertisements. In ICML
’99: Proceedings of the Sixteenth International Con-
ference on Machine Learning, pages 12–21, San Fran-
cisco, CA, USA. Morgan Kaufmann Publishers Inc.
Abrams, Z., Mendelevitch, O., and Tomlin, J. (2007). Opti-
mal delivery of sponsored search advertisements sub-
ject to budget constraints. In Proc. of the 8th ACM
conf. on Electronic commerce, EC ’07, pages 272–
278, New York, USA. ACM.
Alaei, S., Arcaute, E., Khuller, S., Ma, W., Malekian, A.,
and Tomlin, J. (2009). Online allocation of display
advertisements subject to advanced sales contracts. In
Proc. of the 3rd International Workshop on Data Min-
ing and Audience Intelligence for Advertising, AD-
KDD ’09, pages 69–77, New York, USA. ACM.
Caruso, F., Giuffrida, G., and Zarba, C. (2011). Real-time
Behavioral Targeting of Banner Advertising. In Book
of Abstracts of the CLADAG 2011.
Dantzig, G. (1963). Linear Programming and Extensions.
Princeton University.
Hitchcock, F. (1941). The distribution of a product from
several sources to numerous localities. Journal of
Mathematics and Physics, 20:224–230.
Langheinrich, M., Nakamura, A., Abe, N., Kamba, T.,
and Koseki, Y. (1999). Unintrusive customization
techniques for web advertising. Computer Networks,
31(11-16):1259–1272.
Nakamura, A. (2002). Improvements in practical aspects of
optimally scheduling web advertising. In WWW ’02:
Proc. of the 11th international conf. on World Wide
Web, pages 536–541, New York, USA. ACM.
Vee, E., Vassilvitskii, S., and Shanmugasundaram, J.
(2010). Optimal online assignment with forecasts. In
Proc. of the 11th ACM conf. on Electronic commerce,
EC ’10, pages 109–118, New York, USA. ACM.