OPTIMIZED DELIVERY OF ON-LINE ADVERTISEMENTS

A Linear Programming Approach to the Delivery of On-line Advertisements

Fabrizio Caruso

and Giovanni Giuffrida

Neodata, Catania, Italy

Department of Social Sciences, University of Catania, Catania, Italy

Keywords:

Web advertisement, Linear programming, Data mining, Machine learning.

Abstract:

We ﬁnd an optimal strategy for displaying advertisements in given locations at given times under some realistic

dynamic constraints. Our goal is to maximize the total proﬁt produced by the impressions, which depends on

proﬁt-generating events such as the impressions themselves and the ensuing clicks. We must take into account

the possibility that the constraints could change over time in a way that cannot always be foreseen.

1 INTRODUCTION

We want to ﬁnd the optimal strategy for displaying ad-

vertisements in order to achieve different goals (max-

imum total proﬁt, maximum visibility of the cam-

paign, etc.) under some realistic constraints. We need

to ﬁnd the optimal number of “impressions” (display

of advertisements at a given time and at a given “lo-

cation”) under some realistic dynamic constraints that

both limit the possibility of certain creatives and limit

the number of impressions in certain locations and/or

moments in time. A location is a place where an ad-

vertisement can be displayed. This model can be gen-

eralized to target users by their categories, as well.

Similar optimization problems have been considered

in the literature (Vee et al., 2010; Alaei et al., 2009;

Abrams et al., 2007; Langheinrich et al., 1999; Abe

and Nakamura, 1999; Nakamura, 2002). Other ap-

proaches have also been considered by the authors of

this article (Caruso et al., 2011), where a Bayesian

model is used.

Our approach improves over the previous ones in

different respects:

• we consider a more realistic model (realistic con-

straints of different nature);

• we formally investigate the problem of consis-

tency of the constraints;

• we consider an optimization of the problem by ap-

proximating it to a problem with fewer unknowns;

• we can apply machine learning techniques for

guessing the trafﬁc on locations.

We are given certain “creatives” (advertisements,

e.g. banners, videos, etc.), “campaigns” (sets of re-

lated creatives), certain “locations” and a period of

time (set of “time frames”). At a given moment in

time we have an expected proﬁt for each creative of a

given campaign in a given time and location.

The proﬁt of the web-page’s owner depends on the

proﬁt-generating events that have been agreed upon

by the advertiser and the web-page’s owner. These

events can be the impression itself, a click on the ad-

vertisement or a registration of any sort (e.g. registra-

tion into the advertised site, purchase of the advertised

item, etc.), or any combination of these events. We call “impression profit” the expected profit of a single impression, obtained from all the profit-generating events it may trigger: the impression itself, the ensuing click and registrations of all types. In this way we avoid keeping track of click-through rates and of the different registration rates separately. This choice is a compromise between performance and generality, since it makes our model less precise and slightly less general: we cannot handle campaigns with separate budgets for different events, and we cannot estimate the expected profit of an impression as precisely as when different rates for different events are considered.
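As an illustration of how several profit-generating events can be folded into a single impression profit, the following minimal sketch adds up the expected contribution of each event. The function name, the parameters and the numeric prices are hypothetical, and the rates are assumed to be per-impression probabilities estimated from historical logs; the paper does not prescribe this formula.

```python
def impression_profit(price_per_impression, click_rate, price_per_click,
                      registration_rate, price_per_registration):
    """Expected profit of one impression, folding all paid events together.

    click_rate and registration_rate are assumed to be per-impression
    probabilities (hypothetical inputs, not defined in the paper).
    """
    return (price_per_impression
            + click_rate * price_per_click
            + registration_rate * price_per_registration)


# Hypothetical contract: 0.002 per impression, 0.50 per click,
# 5.00 per registration.
p = impression_profit(0.002, click_rate=0.01, price_per_click=0.50,
                      registration_rate=0.001, price_per_registration=5.00)
```

With these made-up figures each of the click and registration terms contributes 0.005, so only the aggregated value has to be stored per impression-event.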

The number of impressions on a given location at

a given time is limited by the trafﬁc (“supply”) of the

corresponding webpage. It also depends on time in

a way that can be only partially predicted. Moreover

the maximum proﬁt for a given campaign (“demand”)

may be limited by a predeﬁned budget.

Caruso F. and Giuffrida G.

OPTIMIZED DELIVERY OF ON-LINE ADVERTISEMENTS - A Linear Programming Approach to the Delivery of On-line Advertisements.

DOI: 10.5220/0003805504000405

In Proceedings of the 4th International Conference on Agents and Artificial Intelligence (ICAART-2012), pages 400-405

ISBN: 978-989-8425-95-9

© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

Our goal is to maximize our expected revenue, which is given by the expected total price paid.

Therefore we wish to maximize a weighted sum

of all expected proﬁts obtained in all locations in the

period of time under consideration.

Taking into account only supply and demand con-

straints makes our model a special instance of a

“transportation problem” for which very efﬁcient so-

lutions exist (see (Dantzig, 1963)).

The complexity of the model brings up the addi-

tional problem of deciding between simplifying the

model and considering smaller problems.

In order to apply our optimization we need to

make a projection of the future supply and a projec-

tion of the impression proﬁts onto our period. Im-

pressions are only possible on the locations and times

allowed by the scheduling. The projection of the im-

pression proﬁts should also try to “guess” how the

proﬁt of an impression changes in time. The projec-

tion algorithms should take into account different pe-

riodicities (e.g., daily, weekly). The projections can

be improved by applying machine learning techniques

to compute the weights of the periodicities.

Moreover we cannot assume the immutability of

the constraints of the problem in the period of time

under consideration. For this reason we have to con-

tinuously readjust to new conditions.

2 NOTATION

We denote by $C_i$ the $i$-th campaign (set of creatives) and by $B_{i,j}$ its $j$-th creative, by $L_l$ the $l$-th location, and by $T_k$ the $k$-th time frame.

We denote the “impression count” by $x_{i,j,k,l}$, i.e., the number of impressions of $B_{i,j}$ at time frame $T_k$ and at location $L_l$. For example, $x_{2,3,1,5} = 10$ means that banner $B_{2,3}$ is displayed ten times at time $T_1$ and location $L_5$.

We denote by $p_{i,j,k,l}$ the “impression profit”, i.e. the expected profit of a single impression of $B_{i,j}$ at location $L_l$ and at time $T_k$.

2.1 Conﬁgurations

We can consider our problem as the problem of ﬁnd-

ing the optimal impression counts for the entries in

a tridimensional matrix, i.e. the points in a discrete

ﬁnite space given by a grid deﬁned by couples (cam-

paign, creative), time and location.

We refer to a single point $(i, j, k, l)$ in this tridimensional discrete finite space as an “impression-event” (or simply an “impression” when this is clear from the context). An impression-event is in fact characterized by a couple (campaign, creative), a location and a time. We call any choice of the values $x_{i,j,k,l}$ for all the impression-events a “configuration”. Our goal is to choose the optimal delivery of each possible impression, i.e. an optimal configuration. We will simply refer to the number of impressions of an impression-event as its “impression count”.

We refer to the points that are allowed by the schedule as “admissible points”. Each admissible point corresponds to one dimension (one unknown) of our optimization problem. The worst case occurs when all points of the grid are admissible: the number of dimensions is then the product of the number of couples (campaign, creative), the number of time frames and the number of locations considered.

Remark 1. In practice we need to translate the number of impressions $x_{i,j,k,l}$ into a probability of delivery. We transform a configuration into a map

\[ (i, j, k, l) \mapsto \mathrm{Prob}_{k,l}(i, j) = \frac{x_{i,j,k,l}}{S_{l,k}}, \]

where $\mathrm{Prob}_{k,l}(i, j)$ is the computed probability of delivery of $B_{i,j}$ at location $L_l$ and time $T_k$, i.e. the ratio between $x_{i,j,k,l}$ and the expected supply $S_{l,k}$ at $L_l$ and time $T_k$.
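The conversion described in Remark 1 can be sketched in a few lines. The dictionary layout (keys `(i, j, k, l)` for counts and `(l, k)` for supply) and the handling of slots with zero supply are assumptions made for this illustration only.

```python
def delivery_probabilities(x, supply):
    """Map each impression-event (i, j, k, l) to x[i,j,k,l] / S[l,k].

    x:      dict {(i, j, k, l): planned impression count}
    supply: dict {(l, k): expected supply S_lk}
    Slots with zero or missing expected supply get probability 0
    (an assumption, made here only to avoid dividing by zero).
    """
    probs = {}
    for (i, j, k, l), count in x.items():
        s = supply.get((l, k), 0.0)
        probs[(i, j, k, l)] = count / s if s > 0 else 0.0
    return probs


# Two creatives competing for the same location and time frame.
x = {(1, 1, 1, 1): 30.0, (2, 1, 1, 1): 50.0}
supply = {(1, 1): 100.0}
probs = delivery_probabilities(x, supply)
```

In this toy instance the two probabilities sum to 0.8, meaning that one fifth of the expected traffic of the slot is left unassigned by the configuration.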

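Pictorially, we can see a single configuration $C = (x_{i,j,k,l})_{i,j,k,l}$ as a tridimensional matrix:

[Figure: tridimensional matrix with axes location, time frame and (campaign, creative); the entry for the couple $(i, j)$ at time frame $k$ and location $l$ is $x_{i,j,k,l}$.]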

3 REALISTIC MODEL

We want to consider a realistic model in which several

constraints of different nature are taken into account.

3.1 The Constraints

We distinguish between the primary (physical) constraints of the problem, the secondary ones (commercial and optional) and the learning constraints (required by the learning phase if it is included in the mathematical model).

3.2 Primary Constraints

The primary constraints are given by the schedule of

the campaigns, by the limited supply of impressions

and by a (possibly) limited demand (budget):

1. The scheduling of the campaigns limits the admis-

sible points: certain creatives B

i, j

are only pos-

sible at certain times and locations. Typically a

campaign begins and ends at certain times and its

creatives are limited to certain locations, hours of

the day, days of the week, etc.

2. Any location at a given time receives a limited

supply of impressions, which solely depends on

the trafﬁc of its page;

3. For any given campaign a given total proﬁt may

not be exceeded (“demand”) because only a ﬁnite

campaign budget can be available.

Remark 2. Campaigns can have an unbounded bud-

get, e.g. one that only pays for an actual purchase.

3.3 Secondary Constraints

The secondary constraints may be of a commercial

nature. They could be enforced in real time while

monitoring the delivery, although having them as con-

straints improves the accuracy of the model. They are

necessary to increase the visibility of a certain cam-

paign/creative:

1. Any given creative/campaign should not last less

than a given period, e.g. the period in which the

campaign is scheduled. We enforce this by set-

ting a minimum for the number of impressions for

each possible time frame.

2. We would like to avoid having only one creative at

a given location and time frame when more than

one choice is available.

3.4 Learning Constraints

We can embed some learning constraints into the mathematical model. One way to achieve this is a constraint of the form: for each new couple (creative, location) we must have a minimum number of impressions in all (or some initial) possible time frames. This is not strictly necessary, because the same goal can be achieved by reserving a portion of the traffic for learning. Having these constraints inside the model, however, makes the model more accurate.

4 LINEAR PROGRAMMING

Under the mild hypothesis that the impression profit $p_{i,j,k,l}$ is constant with respect to its count $x_{i,j,k,l}$, we can assume that our constraints are linear. This assumption is not true in general because there is no linear dependence between the total profit generated by an impression-event and its impression count, i.e. displaying the same advertisement $x$ times on the same location, possibly more than once to the same user, does not necessarily produce $x$ times the profit produced by one single display.

Since we are ultimately interested in the probability of delivery, and since integer linear programming is computationally infeasible in general (it is NP-hard), a possible approach to this problem is real linear programming: we approximate our discrete problem with a continuous one and accept a real-valued number of impressions.

4.1 Formalized Constraints

The points that do not contradict the ﬁrst primary con-

straint will be the unknowns of our model.

4.1.1 Primary Constraints

We do not include the ﬁrst primary constraints for the

reasons given above and assume that in our expres-

sions all indices run over points that do not contradict

the ﬁrst primary constraints.

Supply and demand are formalized as follows.

Second primary constraint:

\[ \forall\, l, k: \quad \sum_{i,j} x_{i,j,k,l} \le S_{l,k} \qquad \text{(supply)} \tag{1} \]

where $S_{l,k}$ is the supply at location $L_l$ and at time $T_k$.

Third primary constraint:

\[ \forall\, i: \quad \sum_{j,l,k} p_{i,j,k,l}\, x_{i,j,k,l} \le D_i \qquad \text{(demand)} \tag{2} \]

where $D_i$ is the budget of the $i$-th campaign.

Remark 3. If only the primary constraints are taken

into account, we have a “Hitchcock’s style trans-

portation problem” (Hitchcock, 1941). For such

problems very efﬁcient algorithms are known such as

the “stepping stone algorithm” (Dantzig, 1963).
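To make the supply constraint (1) and the demand constraint (2) concrete, the sketch below checks a candidate configuration against both. The function name, the dictionary layout and the tolerance are assumptions of this example; campaigns that do not appear in `budget` are treated as having an unbounded budget, in the spirit of Remark 2.

```python
from collections import defaultdict


def check_primary(x, p, supply, budget, tol=1e-9):
    """Return (violated slots (l, k), violated campaigns i).

    x, p:   dicts keyed by (i, j, k, l): impression counts and profits
    supply: dict {(l, k): S_lk}
    budget: dict {i: D_i}; a missing campaign means "no budget limit"
    """
    used = defaultdict(float)    # left-hand side of (1) per (l, k)
    spent = defaultdict(float)   # left-hand side of (2) per campaign i
    for (i, j, k, l), count in x.items():
        used[(l, k)] += count
        spent[i] += p[(i, j, k, l)] * count
    bad_slots = sorted(s for s, u in used.items() if u > supply[s] + tol)
    bad_campaigns = sorted(i for i, v in spent.items()
                           if i in budget and v > budget[i] + tol)
    return bad_slots, bad_campaigns


x = {(1, 1, 1, 1): 60.0, (2, 1, 1, 1): 50.0}
p = {(1, 1, 1, 1): 0.01, (2, 1, 1, 1): 0.02}
result = check_primary(x, p, supply={(1, 1): 100.0}, budget={1: 1.0})
```

Here the slot `(1, 1)` receives 110 planned impressions against a supply of 100, so it is reported, while campaign 1 stays within its budget and campaign 2 has none.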

4.1.2 Secondary Constraints

The secondary constraints are formalized as follows.

First secondary constraint:

\[ \forall\, i, k: \quad \sum_{j,l} x_{i,j,k,l} \ge \mu_{i,k} \qquad \text{(duration)} \tag{3} \]

where $\mu_{i,k}$ is the desired minimum delivery of impressions of the $i$-th campaign at time $T_k$.

Second secondary constraint:

\[ \forall\, (l, k) \in \mathcal{D},\ \forall\, i, j: \quad x_{i,j,k,l} \le P_{l,k} \cdot S_{l,k} \qquad \text{(no overflow)} \tag{4} \]

where $P_{l,k} \in [0, 1]$ (usually close to 1) defines how large a share of a location a single creative can occupy at a given time frame, and where $\mathcal{D}$ is the set of indices corresponding to locations and time frames where at least 2 different creatives are possible.

Remark 4. The second secondary constraint (4) should be applied only in those cases in which more than one pair of campaign and creative is possible at a given location and time, because otherwise the constraint would prevent the location from being filled with impressions even when this is possible.

4.1.3 Learning Constraints

If the learning phase is included in the model, some constraints should force a minimum delivery for the new creatives and new locations:

\[ \forall\ \text{new}\ j, l,\ \forall\, i, k: \quad x_{i,j,k,l} \ge \lambda_{i,j,k,l}. \tag{5} \]

We are also implicitly assuming that the unknowns are non-negative, i.e.

\[ \forall\, i, j, k, l: \quad x_{i,j,k,l} \ge 0. \tag{6} \]

4.2 The Objective Function

We want to maximize our expected revenue, which is given by the sum of all expected profits received in a given configuration $C$:

\[ F(C) = \sum_{i,j,k,l} p_{i,j,k,l}\, x_{i,j,k,l}, \tag{7} \]

where $p_{i,j,k,l}$ is the expected profit generated by a single impression of $B_{i,j}$ at location $L_l$ and at time $T_k$.
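One possible way to hand the model to a linear programming solver is to flatten the admissible points into columns and write constraints (1) to (5) as rows of a system $A x \le b$ with objective vector $c$ from (7). The sketch below only assembles these plain lists; the function name and the input dictionaries are hypothetical, and the actual call to a solver (the authors use glpk) is deliberately left out.

```python
def build_lp(T, p, supply, budget, mu=None, cap=None, lam=None):
    """Assemble "maximize c.x subject to A x <= b, x >= 0" as plain lists.

    T:      admissible points (i, j, k, l); one unknown per point
    p:      {(i, j, k, l): impression profit}   objective (7)
    supply: {(l, k): S_lk}                      rows for (1)
    budget: {i: D_i}                            rows for (2)
    mu:     {(i, k): minimum}     rows for (3), negated to fit "<="
    cap:    {(l, k): P_lk}        rows for (4), only where creatives compete
    lam:    {(i, j, k, l): min}   rows for (5), negated to fit "<="
    """
    points = sorted(T)
    col = {pt: n for n, pt in enumerate(points)}
    c = [p[pt] for pt in points]
    A, b = [], []

    def add_row(coeffs, rhs):
        row = [0.0] * len(points)
        for pt, v in coeffs.items():
            row[col[pt]] = v
        A.append(row)
        b.append(rhs)

    for (l, k), s in sorted(supply.items()):                     # (1)
        add_row({pt: 1.0 for pt in points if (pt[3], pt[2]) == (l, k)}, s)
    for i, d in sorted(budget.items()):                          # (2)
        add_row({pt: p[pt] for pt in points if pt[0] == i}, d)
    for (i, k), m in sorted((mu or {}).items()):                 # (3)
        add_row({pt: -1.0 for pt in points
                 if (pt[0], pt[2]) == (i, k)}, -m)
    for (l, k), frac in sorted((cap or {}).items()):             # (4)
        rivals = [pt for pt in points if (pt[3], pt[2]) == (l, k)]
        if len(rivals) >= 2:   # the set D: at least two creatives possible
            for pt in rivals:
                add_row({pt: 1.0}, frac * supply[(l, k)])
    for pt, m in sorted((lam or {}).items()):                    # (5)
        add_row({pt: -1.0}, -m)
    return points, c, A, b


T = {(1, 1, 1, 1), (2, 1, 1, 1)}
p = {(1, 1, 1, 1): 0.01, (2, 1, 1, 1): 0.02}
points, c, A, b = build_lp(T, p, supply={(1, 1): 100.0},
                           budget={1: 0.5}, cap={(1, 1): 0.9})
```

In this toy instance the system has four rows: one supply row, one budget row for campaign 1, and two "no overflow" rows, since both creatives compete for the same slot. The dense row representation is chosen for readability; a sparse format would be preferable for realistic sizes.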

5 EXISTENCE OF A SOLUTION

We see that there is no guarantee of consistency once

the secondary and learning constraints are introduced,

even if we exclude the ﬁrst primary constraints. We

need to solve the system of inequalities in order to

know if there is a solution. Nevertheless we can

use some heuristic method to avoid some inconsistent

problems. In particular we can find some conditions on $\mu_{i,k}$ in (3) and $\lambda_{i,j,k,l}$ in (5) under which the problem cannot have a solution.

Given a set $T$ of $t$-tuples, we introduce the following notation for sets of $(t-1)$-tuples:

\[ T[i \to \alpha] = \{ (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_t) \mid (x_1, \dots, x_{i-1}, \alpha, x_{i+1}, \dots, x_t) \in T \}; \]

\[ T[i \to *] = \{ (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_t) \mid \exists v : (x_1, \dots, x_{i-1}, v, x_{i+1}, \dots, x_t) \in T \}, \]

i.e. we are considering respectively

• $(t-1)$-tuples obtained from the $t$-tuples in $T$ whose $i$-th component is $\alpha$, by removing that component;

• $(t-1)$-tuples obtained from the $t$-tuples in $T$ by removing the $i$-th component independently of its value.

In the same way, if more components are removed in parallel, we introduce the notation $T[i_1 \to \alpha_1, \dots, i_n \to \alpha_n]$ for sets of $(t-n)$-tuples, where $\alpha_j \in \mathbb{N} \cup \{*\}$ and $i_j \in \mathbb{N}$ for $j \in \{1, \dots, n\}$. Here we will consider the set $T$ of all indices $(i, j, k, l)$ that are allowed by the first primary constraint.

Examples.

T[1 → i, 2 → ∗, 4 → ∗] = {k | ∃ j, l | (i, j, k, l) ∈ T}

T[1 → i] = {( j, k, l) | (i, j, k, l) ∈ T}

T[3 → k, 4 → l] = {(i, j) | (i, j, k, l) ∈ T}

T[2 → ∗, 3 → ∗, 4 → ∗] = {i | ∃ j, k, l | (i, j, k, l) ∈ T}

T[1 → ∗, 2 → ∗] = {(k, l) | ∃i, j | (i, j, k, l) ∈ T}
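The bracket notation is a combined selection and projection on a set of tuples, which the following sketch tries to mirror. The helper name `project`, the `STAR` marker and the sample set are inventions of this example; positions are 1-based as in the text.

```python
STAR = "*"


def project(T, fixed):
    """Compute T[i1 -> a1, ..., in -> an] for a set T of tuples.

    fixed maps 1-based component positions either to a required value
    or to STAR, meaning "drop this component whatever its value".
    The remaining components keep their order; when a single component
    remains it is returned unwrapped, as in the examples of the text.
    """
    out = set()
    for tup in T:
        if all(a == STAR or tup[pos - 1] == a for pos, a in fixed.items()):
            rest = tuple(v for n, v in enumerate(tup, start=1)
                         if n not in fixed)
            out.add(rest[0] if len(rest) == 1 else rest)
    return out


# Admissible points (i, j, k, l) of a toy schedule.
T = {(1, 1, 1, 1), (1, 2, 2, 1), (2, 1, 1, 2)}
times_of_campaign_1 = project(T, {1: 1, 2: STAR, 4: STAR})  # T[1->1, 2->*, 4->*]
campaigns = project(T, {2: STAR, 3: STAR, 4: STAR})         # T[2->*, 3->*, 4->*]
slots = project(T, {1: STAR, 2: STAR})                      # T[1->*, 2->*]
```

For this toy schedule campaign 1 can run in time frames 1 and 2, the campaigns are 1 and 2, and `slots` contains the pairs `(k, l)` at which at least one creative is admissible.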

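Fact 1. Let us assume $T[1 \to i, 2 \to *, 4 \to *] \neq \emptyset$ for all $i \in T[2 \to *, 3 \to *, 4 \to *]$ (i.e. for all possible campaigns $C_i$).

If we choose

\[ \mu_{i,k} > \frac{D_i}{|T[1 \to i, 2 \to *, 4 \to *]|\, m} \]

where $m = \min p_{i,j,k,l}$, then the semi-algebraic set defined by the inequalities (2), (3) is empty.

Proof. By the first secondary constraint (3) we have

\[ \sum_{j,l} x_{i,j,k,l} \ge \mu_{i,k} > \frac{D_i}{|T[1 \to i, 2 \to *, 4 \to *]|\, m}. \]

Therefore for any campaign $C_i$ we have

\[ \sum_{j,k,l} p_{i,j,k,l}\, x_{i,j,k,l} \ge m \sum_{j,k,l} x_{i,j,k,l} \ge m \sum_{k \in T[1 \to i, 2 \to *, 4 \to *]} \sum_{j,l} x_{i,j,k,l} > m \sum_{k \in T[1 \to i, 2 \to *, 4 \to *]} \frac{D_i}{|T[1 \to i, 2 \to *, 4 \to *]|\, m} = D_i, \]

which contradicts the demand constraint (2).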
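Fact 1 suggests a cheap test that can be run before calling the solver. The sketch below flags the campaigns for which every admissible time frame has a minimum delivery above the bound $D_i / (|T[1 \to i, 2 \to *, 4 \to *]|\, m)$; reading the condition as holding for every frame $k$ is an interpretation based on the sum over $k$ in the proof. The function name and data layout are hypothetical, and a campaign absent from `budget` is treated as unbounded.

```python
def campaigns_ruled_out_by_fact1(T, mu, budget, p):
    """Return the campaigns i whose duration minima alone exceed D_i.

    T:      admissible points (i, j, k, l)
    mu:     {(i, k): minimum impressions of campaign i in frame k}
    budget: {i: D_i}; a missing campaign has no budget limit
    p:      {(i, j, k, l): impression profit}
    """
    m = min(p[pt] for pt in T)            # smallest impression profit
    frames = {}                           # i -> T[1 -> i, 2 -> *, 4 -> *]
    for (i, j, k, l) in T:
        frames.setdefault(i, set()).add(k)
    ruled_out = []
    for i, ks in sorted(frames.items()):
        if i not in budget or m <= 0:
            continue  # no budget or zero minimum profit: no bound applies
        bound = budget[i] / (len(ks) * m)
        if all(mu.get((i, k), 0.0) > bound for k in ks):
            ruled_out.append(i)
    return ruled_out


T = {(1, 1, 1, 1), (1, 1, 2, 1)}
p = {pt: 0.01 for pt in T}
too_demanding = campaigns_ruled_out_by_fact1(
    T, mu={(1, 1): 60.0, (1, 2): 60.0}, budget={1: 1.0}, p=p)
acceptable = campaigns_ruled_out_by_fact1(
    T, mu={(1, 1): 60.0, (1, 2): 40.0}, budget={1: 1.0}, p=p)
```

With a budget of 1.0, two frames and a minimum profit of 0.01, the bound is 50 impressions per frame: the first choice of minima is flagged, the second is not. An empty result does not prove feasibility, since the test is only a sufficient condition for inconsistency.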


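Fact 2. Let us assume $T[1 \to i],\ T[3 \to k, 4 \to l] \neq \emptyset$ for all $i \in T[2 \to *, 3 \to *, 4 \to *]$ (i.e. all possible campaigns $C_i$) and $(k, l) \in T[1 \to *, 2 \to *]$.

If all banners and locations under consideration are new and we choose

\[ \lambda_{i,j,k,l} > \min \left\{ \frac{D_i}{|T[1 \to i]|\, m},\ \frac{S_{l,k}}{|T[3 \to k, 4 \to l]|} \right\} \]

where $m = \min p_{i,j,k,l}$, then the semi-algebraic set defined by the inequalities (1), (2), (5) is empty.

Proof. If $\lambda_{i,j,k,l} > \frac{D_i}{|T[1 \to i]|\, m}$ then by (5) we have

\[ x_{i,j,k,l} > \frac{D_i}{|T[1 \to i]|\, m}, \]

from which it follows that for any campaign $C_i$

\[ \sum_{j,k,l} p_{i,j,k,l}\, x_{i,j,k,l} \ge m \sum_{j,k,l} x_{i,j,k,l} > m \sum_{j,k,l} \frac{D_i}{|T[1 \to i]|\, m} = D_i, \]

which contradicts the demand constraint (2).

If $\lambda_{i,j,k,l} > \frac{S_{l,k}}{|T[3 \to k, 4 \to l]|}$ then by (5) we have

\[ x_{i,j,k,l} > \frac{S_{l,k}}{|T[3 \to k, 4 \to l]|}, \]

from which it follows that for any couple $(k, l) \in T[1 \to *, 2 \to *]$ we have

\[ \sum_{i,j} x_{i,j,k,l} > \sum_{i,j} \frac{S_{l,k}}{|T[3 \to k, 4 \to l]|} = S_{l,k}, \]

which contradicts the supply constraint (1).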
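The two cases in the proof of Fact 2 can be turned into a pre-check of the learning minima. The sketch below treats them separately: a campaign is flagged when all of its admissible points exceed the budget-based bound, and a slot is flagged when all creatives admissible there exceed the supply-based bound. Names and data layout are hypothetical, and points missing from `lam` are assumed to have no learning minimum.

```python
def ruled_out_by_fact2(T, lam, budget, supply, p):
    """Return (campaigns, slots) whose learning minima break (2) or (1).

    T:      admissible points (i, j, k, l)
    lam:    {(i, j, k, l): minimum learning impressions}
    budget: {i: D_i}; a missing campaign has no budget limit
    supply: {(l, k): S_lk}
    p:      {(i, j, k, l): impression profit}
    """
    m = min(p[pt] for pt in T)
    by_campaign, by_slot = {}, {}
    for pt in T:
        i, j, k, l = pt
        by_campaign.setdefault(i, []).append(pt)    # T[1 -> i]
        by_slot.setdefault((l, k), []).append(pt)   # T[3 -> k, 4 -> l]
    campaigns = [
        i for i, pts in sorted(by_campaign.items())
        if i in budget and m > 0
        and all(lam.get(pt, 0.0) > budget[i] / (len(pts) * m) for pt in pts)
    ]
    slots = [
        s for s, pts in sorted(by_slot.items())
        if all(lam.get(pt, 0.0) > supply[s] / len(pts) for pt in pts)
    ]
    return campaigns, slots


T = {(1, 1, 1, 1), (2, 1, 1, 1)}
p = {pt: 0.01 for pt in T}
lam = {pt: 60.0 for pt in T}
flagged = ruled_out_by_fact2(T, lam, budget={1: 10.0, 2: 10.0},
                             supply={(1, 1): 100.0}, p=p)
```

In this toy case each campaign could afford 1000 impressions, so no campaign is flagged, but two creatives asking for 60 impressions each cannot fit into a slot with a supply of 100, so the slot `(1, 1)` is reported.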

6 FORECASTING DATA

In order to apply our optimization algorithms we need

to have at least a projection of the supply and a pro-

jection of the expected proﬁt of all impressions al-

lowed by the ﬁrst primary constraint. The supply

and impression proﬁts can be estimated by taking a

proper weighted average from the historical data. The

projection should take into account different factors:

episodic factors and possibly different periodicities.

6.1 Projecting the Proﬁt

The proﬁt of an impression-event may depend on the

periodicity of its campaign and of its location. Since

the “impression proﬁt” changes slowly in time, it can

be predicted better than the supply. If we want to de-

termine the expected proﬁt for an impression-event

we can take some average proﬁt from historical data

on “similar” events. Our strategy is to use the most

accurate and recent available information.

6.2 Projecting the Trafﬁc

A model for the projection of the supply should take

into consideration the periodicity of the location, i.e.

some sites are more often visited in particular peri-

ods of the year, day, hours, etc. More periodicities

may concur, e.g. a site may be visited more often in a

speciﬁc day of the week and at a speciﬁc hour of the

day. Regression analysis through machine-learning

techniques such as support vector machines can be a

viable approach for the problem of properly choosing

the weights of the average of the different “features”.
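As a very simple stand-in for the projection of the supply, the sketch below averages past observations taken at different periodic lags. The lags (previous hour, same hour yesterday, same hour last week) and their weights are placeholder values chosen for the example; in the approach described above such weights would be fitted, for instance by regression, rather than fixed by hand.

```python
def project_supply(history, t, weights=None):
    """Forecast the supply of hourly frame t from past frames.

    history: {hour_index: observed supply} for a single location
    weights: {lag_in_hours: weight}; the default values are assumptions
    Lags without an observation are skipped and the remaining weights
    are renormalised. Returns None when no lag is available.
    """
    weights = weights or {1: 0.2, 24: 0.3, 168: 0.5}
    num = den = 0.0
    for lag, w in weights.items():
        if (t - lag) in history:
            num += w * history[t - lag]
            den += w
    return num / den if den > 0 else None


# Observations one hour, one day and one week before frame 200.
history = {199: 100.0, 176: 200.0, 32: 300.0}
forecast = project_supply(history, 200)
```

With these numbers the forecast is 0.2 · 100 + 0.3 · 200 + 0.5 · 300 = 230 impressions.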

7 FURTHER IMPROVEMENTS

This approach can be improved in its accuracy by tar-

geting the users, and in its speed by reducing the di-

mensions and constraints in the model.

7.1 Targeting Users

The approach we have presented optimizes the deliv-

ery of advertisements in both space (locations) and

time (time frames). The very same algorithms and

code can be used to take users’ proﬁles into ac-

count by encapsulating the proﬁle information into

the information on the location by storing a pair

(location, profile) into a single “extended location”.

7.2 Simplifying Things

The large number of unknowns and constraints in this general approach can pose a serious problem to its computational feasibility. We can reduce the dimensions and constraints by clustering similar attributes (Abe and Nakamura, 1999) or by simplifying our model:

• We restrict our problem to periods of time in which the time constraints do not change. This greatly reduces the number of unknowns but could produce suboptimal solutions.

• We avoid secondary and learning constraints and

enforce them during the delivery.

• We use a time horizon, beyond which all the time

frames are considered as a single time.
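The time-horizon simplification in the last item can be sketched as a preprocessing step on the projected supply. The function name and the choice of summing the supplies of the merged frames are assumptions of this example; the same relabelling of $k$ would have to be applied to the admissible points and to the profits.

```python
def collapse_beyond_horizon(supply, horizon):
    """Merge all time frames k > horizon into the single frame horizon + 1.

    supply: {(l, k): S_lk}; supplies of merged frames are summed
    per location.
    """
    merged = {}
    for (l, k), s in supply.items():
        key = (l, min(k, horizon + 1))
        merged[key] = merged.get(key, 0.0) + s
    return merged


supply = {(1, 1): 10.0, (1, 2): 20.0, (1, 3): 30.0, (1, 4): 40.0}
reduced = collapse_beyond_horizon(supply, horizon=2)
```

Frames 3 and 4 end up in one aggregated frame with a supply of 70, so the number of time frames, and with it the number of unknowns, no longer grows with the length of the period.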

8 RESULTS ON REAL DATA

We have implemented an ad-server optimizer in Java. For solving the linear programming model we use glpk (http://www.gnu.org/s/glpk/). Optionally we use the freely available support vector machines library to project future supply.

Our implementation requires as input: historical

data necessary for projecting the impression prof-

its and the future supply, campaign data (budgets),

scheduling data.

Our prototype has been used on real data available at Neodata and has been compared against the results of the currently used optimizer, which works as follows: if a campaign is achieving its target at the current rate, nothing is done; otherwise, it is stopped in its least profitable locations.

We used logs and schedules of two clients of Neodata, which we call A and B. We remark that the traffic managed by Neodata neither accounts for the total traffic nor is a constant percentage of the traffic generated by the sites under consideration. A was optimized equally well by the current optimizer and our prototype, whereas B was optimized better by our code, by a large margin (20% to 50%). The following table shows the result of one of our experiments on the data of April 30th, 2010, for company B:

hour          real profit   opt. profit     gain   % gain

8:00 a.m.            2.50          4.41     1.91      76%

9:00 a.m.            3.96          8.07     4.11     104%

10:00 a.m.           6.69         12.97     6.28      94%

11:00 a.m.          14.17         23.32     9.15      65%

12:00 p.m.          14.98         24.66     9.68      65%

1:00 p.m.           15.00         14.01    -0.99      -7%

2:00 p.m.           19.43         31.81    12.38      64%

3:00 p.m.           26.07         41.14    15.07      58%

4:00 p.m.           23.38         24.37     0.99       4%

5:00 p.m.           13.98         14.40     0.42       3%

6:00 p.m.           12.64         28.74    16.10     127%

7:00 p.m.           15.90         28.38    12.48      78%

8:00 p.m.           10.55         10.89     0.34       3%

total              179.25        267.17    87.92      49%
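The totals row can be recomputed from the hourly figures; the short script below does so, with the pairs copied from the table and the variable names chosen for this example.

```python
# (real profit, optimized profit) per hour, from 8:00 a.m. to 8:00 p.m.,
# copied from the table for company B.
hourly = [
    (2.50, 4.41), (3.96, 8.07), (6.69, 12.97), (14.17, 23.32),
    (14.98, 24.66), (15.00, 14.01), (19.43, 31.81), (26.07, 41.14),
    (23.38, 24.37), (13.98, 14.40), (12.64, 28.74), (15.90, 28.38),
    (10.55, 10.89),
]

real_total = sum(real for real, _ in hourly)
opt_total = sum(opt for _, opt in hourly)
gain = opt_total - real_total
pct_gain = 100.0 * gain / real_total   # gain relative to the real profit

print(f"{real_total:.2f} {opt_total:.2f} {gain:.2f} {pct_gain:.0f}%")
```

The percentage gain is measured against the real profit, so the 49% in the totals row is 87.92 / 179.25.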

Possible reasons why the data on A are not improved as much as those on B may be: there is no room for further improvement; the data on the supply cannot be used for the projection because they do not correspond to a constant percentage of the real traffic.

The data were used as follows: the initial portion of the data (e.g. the first 20 days) was used for training the system, i.e. projecting the supply (traffic) and the profits. The remaining part of the month was used as a schedule and was optimized.

9 CONCLUSIONS

We have shown how a linear programming approach can be used to optimize the delivery of on-line banners. Our approach takes many different constraints into account (schedule, supply, demand, visibility, learning). We give conditions under which the corresponding system of inequalities is inconsistent, so that some unsolvable problems can be discarded in advance.

(The support vector machines library mentioned in Section 8 is available at http://www.csie.ntu.edu.tw/~cjlin/libsvm)

This approach can be used to target users by sim-

ply extending the concept of location. We have also

tackled the problem of dimensionality (by a time hori-

zon, by simplifying the constraints, etc.).

Our prototype has been tested on real data. We

have shown that it optimizes the delivery of on-line

advertisements better than a greedy algorithm.

There are still some open issues: how to project

the trafﬁc when the conditions of the problem change

quickly and the data does not correspond to a constant

percentage of the trafﬁc.

REFERENCES

Abe, N. and Nakamura, A. (1999). Learning to Optimally

Schedule Internet Banner Advertisements. In ICML

’99: Proceedings of the Sixteenth International Con-

ference on Machine Learning, pages 12–21, San Fran-

cisco, CA, USA. Morgan Kaufmann Publishers Inc.

Abrams, Z., Mendelevitch, O., and Tomlin, J. (2007). Opti-

mal delivery of sponsored search advertisements sub-

ject to budget constraints. In Proc. of the 8th ACM

conf. on Electronic commerce, EC ’07, pages 272–

278, New York, USA. ACM.

Alaei, S., Arcaute, E., Khuller, S., Ma, W., Malekian, A.,

and Tomlin, J. (2009). Online allocation of display

advertisements subject to advanced sales contracts. In

Proc. of the 3rd International Workshop on Data Min-

ing and Audience Intelligence for Advertising, AD-

KDD ’09, pages 69–77, New York, USA. ACM.

Caruso, F., Giuffrida, G., and Zarba, C. (2011). Real-time

Behavioral Targeting of Banner Advertising. In Book

of Abstracts of the CLADAG 2011.

Dantzig, G. (1963). Linear Programming and Extensions.

Princeton University.

Hitchcock, F. (1941). The distribution of a product from

several sources to numerous localities. Journal of

Mathematics and Physics, 20:224–230.

Langheinrich, M., Nakamura, A., Abe, N., Kamba, T.,

and Koseki, Y. (1999). Unintrusive customization

techniques for web advertising. Computer Networks,

31(11-16):1259–1272.

Nakamura, A. (2002). Improvements in practical aspects of

optimally scheduling web advertising. In WWW ’02:

Proc. of the 11th international conf. on World Wide

Web, pages 536–541, New York, USA. ACM.

Vee, E., Vassilvitskii, S., and Shanmugasundaram, J.

(2010). Optimal online assignment with forecasts. In

Proc. of the 11th ACM conf. on Electronic commerce,

EC ’10, pages 109–118, New York, USA. ACM.
