Stochastic Dynamic Pricing with Strategic Customers
and Reference Price Effects
Rainer Schlosser
Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
Keywords:
Dynamic Pricing, Strategic Customers, Price Anticipations, Waiting Customers, Reference Prices.
Abstract:
In many markets, customers strategically time their purchase by anticipating future prices and by checking
offer prices multiple times. Typically, customers base their decisions on historical reference prices and their
individual willingness-to-pay, which can change over time. For sellers it is challenging to derive successful
pricing strategies as customers’ reservation prices cannot be observed. In this paper, we present a stochastic
dynamic finite horizon framework to determine optimized price adjustments for selling perishable goods in the
presence of strategic customers, which (i) recur and (ii) anticipate future prices based on reference prices. We
analyze the impact of different strategic behaviors on expected profits and the evolution of sales. Compared to
myopic settings, we find that recurring customers lead to higher average prices, delayed sales, and increased
profits. The presence of forward-looking customers has opposing but less intense effects.
1 INTRODUCTION
In electronic markets, it has become easy to adjust and
to observe offer prices. Customers repeatedly check
prices in order to strategically time their purchase de-
cisions. Occasionally, they even try to anticipate fu-
ture prices based on historical reference prices.
Applications can be found in a variety of contexts,
particularly in the case of perishable products or when
the sales horizon is limited. Prominent examples are,
for instance, the sale of airline tickets, event tickets,
accommodation services, fashion goods, and seasonal
products.
To derive effective dynamic pricing strategies in
the presence of strategic customers is an important
problem in revenue management. Practical relevance
is high, but the problem appears challenging. The
challenge is (i) to account for returning and forward-
looking customers, (ii) to include reference prices,
and (iii) to derive pricing decisions with acceptable
computation times.
In this paper, we study pricing strategies that take
strategic customer behavior into account when updat-
ing offer prices. Existing dynamic pricing techniques
cannot handle such scenarios efficiently and, hence,
force practitioners to limit the scope of their strate-
gies, e.g., by ignoring strategic behavior or by using
simple heuristics. Naturally, this limits the potential
quality of pricing strategies.
The inefficiency of existing techniques to account
for strategic customers stems from several factors that
reflect the challenges behind dynamic pricing: (i) the
solution space of strategies is enormous, (ii) a cus-
tomer’s individual willingness-to-pay is not observ-
able, and (iii) customers track the market to strategi-
cally time their buying decision.
1.1 Related Work
Selling products is a classical application of rev-
enue management theory, cf., e.g., Talluri, van Ryzin
(2004), Phillips (2005), and Yeoman, McMahon-
Beattie (2011). An excellent overview about recent
literature in the field of dynamic pricing is given by
Chen, Chen (2015).
The analysis of the impact of strategic consumer
behavior has been studied since Coase (1972). Sur-
veys about strategic customer behavior in revenue
management are proposed by Su (2007) and Goen-
sch et al. (2013). Further, the recent survey article by
Wei, Zhang (2018) provides a very detailed overview
of publications studying strategic customers in
operations management.
Wei, Zhang (2018) distinguish three streams of
counteracting strategic customer behavior: (i) pric-
ing, (ii) inventory, and (iii) information. To account
for strategic customers via pricing aims to minimize
a customer’s incentive to wait for future price drops.
Schlosser, R.
Stochastic Dynamic Pricing with Strategic Customers and Reference Price Effects.
DOI: 10.5220/0007519401790188
In Proceedings of the 8th International Conference on Operations Research and Enterprise Systems (ICORES 2019), pages 179-188
ISBN: 978-989-758-352-0
Copyright © 2019 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
This includes strategies such as fixed price strategies,
pre-announced increasing prices (price commitment
strategy), or reimbursements for decreasing prices
(price-matching strategy).
The second stream seeks to mitigate strategic
waiting behavior by limiting the product availability
in order to address a customer’s concern of not being
able to purchase the product in the future (cf. run-
out). Similarly, sellers can also counteract strategic
consumer behavior by strategically announcing (or
hiding) partial inventory information to highlight the
product’s scarcity.
As typically observed in practice, in our model,
we allow for free price adjustments. On average, how-
ever, this leads to comparatively stable price paths.
These reference prices can be estimated by strate-
gic customers, cf. Wu et al. (2015). Further mod-
els focusing on reference price effects are studied by
Popescu, Wu (2007), Wu, Wu (2015), or Chenavaz,
Paraschiv (2018).
Finally, the key question is how (i) reference
prices, (ii) consumer’s price anticipations, and (iii) a
firm’s pricing strategy affect each other. Further, the
mutual dependencies will have to be determined by
buyers and sellers based on their partially observable
asymmetric (market) data. In addition, the complex
interplay of their mutual beliefs is further complicated
when multiple sellers compete for the same market, cf.,
e.g., Levin et al. (2009), Liu, Zhang (2013).
1.2 Contribution
In the literature, for simplicity, mostly so-called my-
opic customers are considered. They simply arrive
and decide; they do not return to check prices again
and they do not anticipate future prices.
In our model, we consider the following two
sources of strategic customer behavior. First, we al-
low that customers return with a certain probability
in case they refuse to buy, i.e., if their willingness-
to-pay (WTP) does not exceed the current offer price.
To reflect planning uncertainty over time, we model a
customer’s future WTP as a random variable. Second,
we allow a certain share of customers to anticipate fu-
ture offer prices (based on the current offer price and
predetermined reference prices) as well as their indi-
vidual future WTP in order to check whether there is
an incentive to delay their purchase decision - even if
their current WTP exceeds the offer price. This allows
customers to optimize their consumer surplus.
We study a finite horizon model with limited ini-
tial inventory (i.e., products cannot be reproduced or
reordered). While in the literature demand is often
assumed to be of a special highly stylized functional
form, we allow for fairly general demand definitions.
In our model, demand is characterized by randomized
evolutions of individual WTP, which are not observ-
able for the seller. To this end, demand is allowed to
generally depend on time, the current offer price, and
reference prices.
The main contributions of this paper are the
following: We (i) present a demand model which
is based on individual reservation prices, reference
prices, and expected consumer surpluses, (ii) we com-
pute optimized feedback pricing strategies, (iii) we
study the impact of different strategic behaviors com-
pared to myopic settings, and (iv) we propose a Hid-
den Markov version of our model.
This paper is organized as follows. In Section 2,
we describe our model setup and define strategic cus-
tomer behaviors. In Section 3, we present our solution
approach and illustrate its results using different nu-
merical examples. In Section 4, we study a relaxed
version of our model with partially observable states.
Conclusions are summarized in the final Section 5.
2 MODEL DESCRIPTION
We consider the situation in which a firm wants to sell
a finite number of goods (e.g., airline tickets, event
tickets, accommodation services) over a certain time
frame. We assume a monopoly situation. Further, a
certain ratio of customers acts strategically, i.e., they
(i) repeatedly track prices and wait for acceptable of-
fers and (ii) they are forward-looking, i.e., they com-
pare their current consumer surplus with expected fu-
ture consumer surpluses.
We assume that the time horizon T is finite. We
assume that products cannot be reproduced or reordered.
If a sale takes place, shipping costs $c$ have
to be paid, $c \ge 0$. A sale of one item at price $a$, $a \ge 0$,
leads to a profit of $a - c$. Discounting is also included
in the model. For the length of one period, we use the
discount factor $\delta$, $0 < \delta \le 1$.
2.1 Individual Buying Behavior
We consider a discrete time model with T periods.
We assume that consumers have individual (random)
reservation prices denoted by $R_t$ for periods $(t, t+1)$,
$t = 0, 1, \dots, T-1$. The reservation prices particularly
account for a customer’s planning uncertainty to be
able to benefit from the product (e.g., an airline ticket
or an event ticket in time T ). In this context, the av-
erage planning uncertainty typically decreases over
time as the time horizon T gets closer.
Further, customers may check current (ticket)
prices at multiple points in time. While some cus-
tomers start to check prices early (cf. economy class)
other customers start to check prices shortly before
the end of the sales period (cf. business class). More-
over, the evolutions of reservation prices of each in-
dividual customer are different. Figure 1 illustrates
individual reservation prices $R_t$ and average reservation
prices (denoted by $\bar{R}_t$) over time. We observe
that, on average, the willingness-to-pay is increasing
over time. The individual paths resemble, e.g., ran-
dom (sudden) changes of a customer’s planning un-
certainty. In other use-cases (e.g., the sale of fashion
goods) reservation prices may also tend to decrease.
Figure 1: Examples of individual reservation prices $R_t$ and average reservation prices $\bar{R}_t$ (smooth blue curve) over time (horizontal axis: $t$; vertical axis: $R_t$).
We assume that forward-looking customers com-
pare their current individual consumer surplus with
their expected future surplus of the next period. Given
a current offer price $a_t$ and an individual reservation
price $R_t$ at time $t$, the current consumer surplus is
$CS_t := R_t - a_t$. The expected future surplus $CS_{t+1}$ of
the next period depends on the (expected) offer price
$a_{t+1}$ and the (expected) individual reservation price
$R_{t+1}$ of the consumer at time $t+1$. Finally, a customer
purchases a product at time $t$ only if condition
(i) is satisfied: The current consumer surplus is positive,
i.e., $t = 0, 1, \dots, T-1$,
$$CS_t > 0 \tag{1}$$
Further, for some consumers, a second condition
(ii) also has to be satisfied: The expected future surplus
$E(CS_{t+1})$ does not exceed $CS_t$ plus a certain risk premium
$\varepsilon$ (e.g., mirroring a customer's risk aversion
about a product's future availability), so that there is
no incentive to wait for a higher consumer surplus,
$t = 0, 1, \dots, T-1$, $\varepsilon \ge 0$,
$$R_t - a_t \ge E(R_{t+1} \mid R_t) - E(a_{t+1}) - \varepsilon \tag{2}$$
Consumers are assumed to be able to estimate future
prices $E(a_{t+1})$, e.g., based on average historical
reference prices. In our model, we assume a known
predetermined path of reference prices denoted by,
$t = 0, 1, \dots, T-1$,
$$a^{(ref)}_t \tag{3}$$
Based on a current offer price $a_t$ and the reference
price $a^{(ref)}_{t+1}$, cf. (3), consumers can estimate expected
future prices, e.g., by, $t = 0, 1, \dots, T-2$,
$$E(a_{t+1}) \approx \left(a^{(ref)}_{t+1} + a_t\right)/2$$
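As a worked illustration of conditions (1) and (2), consider purely hypothetical numbers that are not taken from the paper's examples: $R_t = 300$, $a_t = 250$, $a^{(ref)}_{t+1} = 230$, $E(R_{t+1} \mid R_t) = 320$, and $\varepsilon = 10$.

```latex
\begin{align*}
E(a_{t+1}) &\approx (230 + 250)/2 = 240, \\
CS_t &= 300 - 250 = 50 > 0 \quad \text{(condition (1) holds)}, \\
E(R_{t+1} \mid R_t) - E(a_{t+1}) - \varepsilon &= 320 - 240 - 10 = 70 > 50 = CS_t .
\end{align*}
```

Under these assumed values, condition (2) is violated: a forward-looking customer would postpone the purchase, whereas a customer who only checks condition (1) would buy.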
We assume that a certain share of the consumers
are forward-looking and consider condition (2). This
share is denoted by, $t = 0, 1, \dots, T-1$,
$$\gamma_t \in [0,1] \tag{4}$$
Finally, given the distribution of the expected evolution
of a random customer's reservation price $R_t$, from
(1) - (3), we obtain the average purchase probability
of a random customer arriving in period $(t,t+1)$,
$t = 0, 1, \dots, T$, $a \ge 0$, $\varepsilon \ge 0$,
$$p^{(buy)}_t(a) := (1-\gamma_t) \cdot P(R_t > a) + \gamma_t \cdot P\left(R_t > a \ \text{and}\ R_t - a \ge E(R_{t+1} \mid R_t) - \frac{a^{(ref)}_{t+1} + a}{2} - \varepsilon\right) \tag{5}$$
The purchase probabilities (5) are characterized
by (i) the consumers’ mixture of reservation prices,
(ii) the share of forward-looking customers, cf. (4),
and (iii) the reference prices, cf. (3).
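The purchase probability (5) can be approximated by sampling. The following is a minimal sketch, assuming that samples of $R_t$ and matching estimates of $E(R_{t+1} \mid R_t)$ are already available; the function name `purchase_probability` and all argument names are illustrative and do not appear in the paper.

```python
import numpy as np

def purchase_probability(a, r_t, exp_r_next, a_ref_next, gamma, eps):
    """Monte Carlo estimate of p_t^(buy)(a) in the spirit of (5).

    a          : offer price in the current period
    r_t        : sampled reservation prices R_t (1-D array)
    exp_r_next : estimates of E(R_{t+1} | R_t), one per sample (assumed given)
    a_ref_next : reference price of the next period
    gamma      : share of forward-looking customers
    eps        : risk premium (scalar or one value per sample)
    """
    r_t = np.asarray(r_t, dtype=float)
    exp_r_next = np.asarray(exp_r_next, dtype=float)
    affordable = r_t > a                           # condition (1)
    exp_price_next = 0.5 * (a_ref_next + a)        # anticipated next price
    # condition (2): no incentive to wait for a higher expected surplus
    no_wait = (r_t - a) >= (exp_r_next - exp_price_next - eps)
    p_myopic = affordable.mean()
    p_forward = (affordable & no_wait).mean()
    return (1.0 - gamma) * p_myopic + gamma * p_forward
```

With `gamma = 0` the estimate reduces to the myopic probability $P(R_t > a)$; a larger risk premium `eps` makes waiting less attractive and moves the forward-looking term toward the myopic one.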
Note, we do not assume that reservation prices are
observable for the seller. As in practice, we only assume
arriving customers and realized sales to be observable
for sellers. The probabilities (5) can be estimated
by the conversion rate of interested and buying
customers for different offer prices at different times $t$,
cf., e.g., Schlosser, Boissier (2018).
2.2 Waiting Customers
Typically, customers tend to track the market and ob-
serve prices over time. In the literature, however, cus-
tomers are often assumed to be myopic, i.e., they ran-
domly occur, observe offers, and decide whether to
purchase or not. In case of no purchase they do not
further track the offer.
In reality, many customers are not myopic. In
our model, we consider recurring customers. In case
an interested customer does not purchase, we assume
that he/she checks the next period's offer with a certain
probability denoted by, $t = 0, 1, \dots, T-1$,
$$\eta_t \in [0,1] \tag{6}$$
Note, this nontrivially affects the arrival process
of potential customers. In the following, we distin-
guish between initially arriving (new) customers and
waiting/recurring (old) customers.
Arriving new customers are modelled as follows.
We assume arbitrary given probabilities,
$t = 0, 1, \dots, T-1$, $j = 0, 1, \dots$,
$$p^{(new)}_t(j) \tag{7}$$
that in period $(t,t+1)$ exactly $j$ new customers arrive,
where $\sum_{j \ge 0} p^{(new)}_t(j) = 1$ for all $t = 0, 1, \dots, T-1$.
For the time being, we assume that the number of
waiting customers can be effectively determined by
the selling firm as new arriving customers and old re-
curring customers can be observed (cf. cookies, etc.).
The random number of customers that did not purchase
in period $(t-1,t)$ and recur in the next period
$(t,t+1)$ is denoted by $K_t$, $t = 0, 1, \dots, T-1$. A list
of variables and parameters is given in the Appendix,
cf. Table 5.
2.3 Problem Formulation
In our model, we use sales probabilities that depend
on (i) the number of arriving new customers j, (ii) the
number of recurring waiting customers k, and (iii) the
offer price a. The individual purchase decisions, cf.
(5), are based on individual (expectations of future)
reservation prices and predetermined reference prices.
The random inventory level of the seller at time $t$
is denoted by $X_t$, $t = 0, 1, \dots, T$. The end of sale is the
random time $\tau$, when all of the seller's items are sold,
that is $\tau := \min_{t=0,\dots,T} \{t : X_t = 0\} \wedge T$. As long as the
seller has items left to sell, for each period $(t,t+1)$,
a price $a_t$ has to be chosen. By $A$ we denote the set
of admissible prices. For all remaining $t \ge \tau$ the firm
cannot sell further items and we let $a_t := 0$.
We call strategies $(a_t)_t$ admissible if they belong
to the class of Markovian feedback policies; i.e., pricing
decisions $a_t \ge 0$ will depend on (i) time $t$, (ii) the
current inventory level $X_t$, and (iii) the current number
of waiting customers $K_t$.
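Depending on the chosen pricing strategy $(a_t)_t$,
the random accumulated profit from time/period $t$ on
(discounted to time $t$) amounts to, $t = 0, 1, \dots, T$,
$$G_t := \sum_{s=t}^{T-1} \delta^{s-t} \cdot (a_s(X_s,K_s) - c) \cdot (X_s - X_{s+1}) \tag{8}$$
where $X_s - X_{s+1}$ is the number of items sold in period $(s,s+1)$.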
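For a single realized sales path, the accumulated profit in (8) can be evaluated directly, counting the items sold in period $(s,s+1)$ as the drop in inventory between $s$ and $s+1$. The sketch below assumes the path is given as two plain lists; the function name `discounted_profit` and the sample numbers in the usage note are illustrative only.

```python
def discounted_profit(prices, inventory, c=0.0, delta=1.0, t=0):
    """Realized profit G_t for one path, cf. (8).

    prices    : offer prices a_0, ..., a_{T-1}
    inventory : inventory levels X_0, ..., X_T (one more entry than prices)
    c         : shipping cost per sold item
    delta     : discount factor per period
    t         : period from which profits are accumulated
    """
    total = 0.0
    for s in range(t, len(prices)):
        sold = inventory[s] - inventory[s + 1]   # items sold in period (s, s+1)
        total += delta ** (s - t) * (prices[s] - c) * sold
    return total
```

For instance, `discounted_profit([100, 120, 150], [3, 2, 2, 0], c=10)` adds one sale at margin 90 and two sales at margin 140. Averaging this quantity over many simulated paths gives an estimate of the expected profit that the pricing policy is meant to maximize.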
The objective is to determine a (Markovian) feed-
back pricing policy that maximizes the expected total
discounted profits, $t = 0, 1, \dots, T$,
$$E(G_t \mid X_t = n, K_t = k) \tag{9}$$
conditioned on the current state at time t (cf. inven-
tory level n and waiting consumers k). An optimized
policy will balance expected short-term and long-term
profits by accounting for the evolution of the inven-
tory level and the number of waiting customers.
3 COMPUTATION OF OPTIMAL
PRICING STRATEGIES WITH
OBSERVABLE STATES
In this section, we want to derive optimal feedback
pricing strategies that incorporate the strategic cus-
tomer behavior described in Section 2.
3.1 State Transition Probabilities
The state of the system to be controlled over time is
described by time t, the current inventory level n, and
the current number of waiting customers k. The tran-
sition dynamics can be described as follows. Given
an inventory level $n$ at time $t$ and a demand for $i$ items
during the period $(t,t+1)$, we obtain the new inventory
level $X_{t+1} := \max(n - i, 0)$.
Given $k$ waiting customers at time $t$ and $j$ new arriving
customers, cf. (7), we have $k + j$ interested
customers in period $(t,t+1)$. Assuming $i$ buying
customers, $i = 0, \dots, k+j$, cf. (1) - (5), we obtain
$k + j - i$ customers that did not purchase an item. If
$m$ of them plan to recur in $(t+1,t+2)$, the new state,
i.e., the number of waiting customers at time $t+1$, is
$m$, $m = 0, \dots, k+j-i$.
Assuming $k$ waiting customers and $j$ new customers,
the number $i$ of items demanded in period $(t,t+1)$
is binomially distributed, $i = 0, \dots, k+j$,
$t = 0, 1, \dots, T-1$, cf. (5),
$$p^{(demand)}_t(i \mid a,k,j) = \binom{k+j}{i} \cdot p^{(buy)}_t(a)^i \cdot \left(1 - p^{(buy)}_t(a)\right)^{k+j-i} \tag{10}$$
Assuming $k$ waiting customers at time $t$, $j$ new
customers, and $i$ customers that want to buy, the number
$m$ of the remaining $k + j - i$ customers that return
in period $(t+1,t+2)$ is also binomially distributed,
$m = 0, \dots, k+j-i$, $t = 0, 1, \dots, T-1$, cf. (6),
$$p^{(wait)}_t(m \mid k,j,i) = \binom{k+j-i}{m} \cdot \eta_t^m \cdot (1 - \eta_t)^{k+j-i-m} \tag{11}$$
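Both transition probabilities are plain binomial probability mass functions and can be written down in a few lines. The sketch below uses only the Python standard library; the function names mirror the superscripts in (10) and (11), while the argument name `buy_prob` (standing for $p^{(buy)}_t(a)$) is our own.

```python
from math import comb

def p_demand(i, buy_prob, k, j):
    """Probability that exactly i of the k + j interested customers buy, cf. (10)."""
    v = k + j
    if not 0 <= i <= v:
        return 0.0
    return comb(v, i) * buy_prob**i * (1.0 - buy_prob) ** (v - i)

def p_wait(m, eta, k, j, i):
    """Probability that m of the k + j - i non-buyers return, cf. (11)."""
    rest = k + j - i
    if not 0 <= m <= rest:
        return 0.0
    return comb(rest, m) * eta**m * (1.0 - eta) ** (rest - m)
```

Since both functions depend on $k$ and $j$ only through the sum $k + j$, the order of these two arguments does not affect the result.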
3.2 Solution Approach
The problem of finding the best pricing strategy can
be solved using dynamic programming techniques. In
this context, the so-called value function describes the
best expected discounted future profits $E(G_t \mid n,k)$ for
all possible states $n$ and $k$ at time $t$, cf. (9).
If either all items are sold or the time is up, no
future profits can be made, i.e., the natural boundary
conditions for the value function $V$ are given by,
$t = 0, 1, \dots, T-1$, $n = 0, 1, \dots, N$, $k = 0, 1, \dots$,
$$V_t(0,k) = 0 \quad \text{and} \quad V_T(n,k) = 0 \tag{12}$$
For the remaining states, the value function is
determined by the Bellman equation, $n = 0, 1, \dots, N$,
$k = 0, 1, \dots, M$, $t = 0, 1, \dots, T-1$, cf. (7), (10), (11),
$$V_t(n,k) = \max_{a \in A} \left\{ \sum_{\substack{j=0,1,\dots,J \\ i=0,1,\dots,j+k \\ m=0,1,\dots,j+k-i}} p^{(new)}_t(j) \cdot p^{(demand)}_t(i \mid a,k,j) \cdot p^{(wait)}_t(m \mid k,j,i) \cdot \left((a - c) \cdot \min(i,n) + \delta \cdot V_{t+1}(\max(n-i,0), m)\right) \right\} \tag{13}$$
Note, to obtain a bounded number of potential
events, in (13) we use a maximum number of new cus-
tomers J. To guarantee a limited state space, we use
a maximum number M of waiting customers. Both
bounds have to be chosen sufficiently large such that
the optimal solution is not confined.
The nonlinear system of equations (13) can be
solved recursively. The associated optimal pricing
policy, denoted by $a^*_t(n,k)$, $n = 0, 1, \dots, N$,
$k = 0, 1, \dots, M$, $t = 0, 1, \dots, T-1$, is determined by the arg
max of (13), i.e.,
$$a^*_t(n,k) = \arg\max_{a \in A} \left\{ \sum_{\substack{j=0,1,\dots,J \\ i=0,1,\dots,j+k \\ m=0,1,\dots,j+k-i}} p^{(new)}_t(j) \cdot p^{(demand)}_t(i \mid a,k,j) \cdot p^{(wait)}_t(m \mid k,j,i) \cdot \left((a - c) \cdot \min(i,n) + \delta \cdot V_{t+1}(\max(n-i,0), m)\right) \right\} \tag{14}$$
If $a^*_t(n,k)$ is not unique, we choose the smallest one.
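The recursion (12) - (14) can be sketched as a straightforward backward induction. The code below is an illustration under stated assumptions, not the author's implementation: `p_new(t, j)` and `p_buy(t, a)` are caller-supplied callables with hypothetical names, the return probability `eta` is taken as constant over time, the price list is assumed to be sorted in ascending order, and waiting customers beyond `M` are truncated to `M`.

```python
import numpy as np
from math import comb

def binom_pmf(x, size, prob):
    """Binomial probability mass function."""
    return comb(size, x) * prob**x * (1.0 - prob) ** (size - x)

def solve(T, N, M, J, prices, p_new, p_buy, eta, c=0.0, delta=1.0):
    """Backward induction for the value function and the pricing policy."""
    V = np.zeros((T + 1, N + 1, M + 1))      # zeros encode the boundary conditions (12)
    policy = np.zeros((T, N + 1, M + 1))     # price 0 when nothing is left to sell
    for t in range(T - 1, -1, -1):
        for n in range(1, N + 1):
            for k in range(M + 1):
                best_val, best_a = -np.inf, prices[0]
                for a in prices:             # ascending, so ties keep the smallest price
                    q = p_buy(t, a)
                    val = 0.0
                    for j in range(J + 1):
                        for i in range(j + k + 1):
                            w_ji = p_new(t, j) * binom_pmf(i, j + k, q)
                            for m in range(j + k - i + 1):
                                w = w_ji * binom_pmf(m, j + k - i, eta)
                                future = V[t + 1, max(n - i, 0), min(m, M)]
                                val += w * ((a - c) * min(i, n) + delta * future)
                    if val > best_val + 1e-12:
                        best_val, best_a = val, a
                V[t, n, k] = best_val
                policy[t, n, k] = best_a
    return V, policy
```

The nested loops make the cost of one stage proportional to the number of states times the number of prices times the number of $(j, i, m)$ combinations, which is why the bounds $J$ and $M$ matter for computation time.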
3.3 Numerical Example
To illustrate our solution approach, we consider the
following numerical example.
Example 3.1. We let $T = 50$, $N = 20$, $\delta = 1$, $c = 10$,
$J = 5$, $M = 8$, $a \in A := \{10, 20, \dots, 500\}$,
$\varepsilon := U(0,20)$, and $\gamma_t = \eta_t = 0.5$. Further, we use:
(i) Arrival of new customers, cf. (7), $j = 0, 1, \dots$,
$$p^{(new)}_t(j) = \binom{J}{j} \cdot u_t^j \cdot (1 - u_t)^{J-j}$$
where $u_t = 1 - e^{-|t/T - 0.6|}$, $0 \le t < T$.
(ii) Individual reservation prices, cf. (5), $0 \le t < T$,
$$R_t = \begin{cases} L^{(min)}, & t = 0 \\ \text{if } U(0,1) < 0.75 \text{ then } R_{t-1} \text{ else } H_t, & t > 0 \end{cases}$$
where $L^{(min)} := U(0,200)$ (initial reservation price),
$L^{(max)} := U(100,800)$ (upper reservation price), and
$H_t := L^{(min)} + D_t \cdot U(0.6,1.4)$ (random updates) with
$D_t := E\left(L^{(max)} - L^{(min)}\right) \cdot (t/T)^2$ (average increase).
By $U(\cdot,\cdot)$, we denote Uniform distributions.
(iii) Reference prices, cf. (3), $0 \le t < T$,
$$a^{(ref)}_t := 150 + 200 \cdot (t/T)^2$$
(iv) We use 10 000 random realizations of $R_t$, cf.
(ii), to determine the average reservation prices
$\bar{R}_t := E(R_t)$ and the conditional expectations $E(R_{t+1} \mid R_t)$,
which are the basis to derive $p^{(buy)}_t(a)$, $a \in A$, cf. (5).
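The reservation-price process of item (ii) can be simulated as follows. This is a sketch under our reading of the definitions: $D_t$ is taken to use the expected gap $E(L^{(max)} - L^{(min)}) = 450 - 100 = 350$, and a fresh uniform draw is used at every update. The function name and the seed handling are our own choices.

```python
import numpy as np

def simulate_reservation_paths(n_paths=10_000, T=50, seed=0):
    """Simulate reservation price paths R_t, t = 0, ..., T-1, one column per customer."""
    rng = np.random.default_rng(seed)
    l_min = rng.uniform(0.0, 200.0, size=n_paths)    # initial reservation prices L_min
    mean_gap = 450.0 - 100.0                         # assumed E(L_max - L_min)
    R = np.empty((T, n_paths))
    R[0] = l_min
    for t in range(1, T):
        d_t = mean_gap * (t / T) ** 2                # average increase D_t
        h_t = l_min + d_t * rng.uniform(0.6, 1.4, size=n_paths)  # random update H_t
        keep = rng.uniform(0.0, 1.0, size=n_paths) < 0.75        # keep R_{t-1} w.p. 0.75
        R[t] = np.where(keep, R[t - 1], h_t)
    return R
```

The row means `R.mean(axis=1)` then give an empirical counterpart of $\bar{R}_t$. Estimating $E(R_{t+1} \mid R_t)$ from the same paths requires an additional step (for example, grouping paths by their current value), which is not shown here.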
Figure 2 depicts purchase probabilities $p^{(buy)}_t(a)$
in the setting of Example 3.1 for different periods $t$
and prices $a$. Note, the seller cannot infer individual
reservation prices from $p^{(buy)}_t$.
Figure 2: Purchase probabilities $p^{(buy)}_t(a)$, cf. (5), for different prices $a \le 500$ and periods $t = 10, 20, 30, 40, 45$; Example 3.1 (horizontal axis: $a$; vertical axis: $p^{(buy)}_t(a)$).
Table 1 illustrates the value function $V_t$ for different
inventory levels $n$ and points in time $t$ (for the
case that the number of waiting customers is k = 3),
cf. (13). We observe that expected future profits are
(convex) decreasing in time and (concave) increasing
in the number of items left to sell.
Table 1: Expected profits $V_t(n,3)$, Example 3.1.
n\t      0     10     20     30     40     45
 1     350    350    350    350    356    356
 2     654    654    654    655    655    637
 5    1409   1409   1410   1410   1351   1265
10    2413   2413   2414   2383   2037   1475
15    3246   3239   3219   3035   2114   1476
20    3924   3883   3763   3230   2115   1476
Table 2 shows the expected profits for different
inventory levels n and different numbers of waiting
customers k (for time t = 45). We observe that the
expected future profits are increasing in the number
of waiting consumers k. The impact of k is higher the
larger the remaining inventory is.
Table 2: Expected profits $V_{45}(n,k)$, Example 3.1.
n\k      0      1      2      3      4      5
 1     288    318    339    356    369    380
 2     492    552    599    637    667    693
 5     744    953   1131   1265   1349   1408
10     762   1000   1238   1475   1711   1942
15     762   1000   1238   1476   1714   1952
20     762   1000   1238   1476   1714   1952
Table 3 illustrates optimal feedback prices $a^*_t$, cf.
(14), for different inventory levels n and points in time
t (for the case that the number of waiting customers
is k = 3), cf. Table 1. We observe that offer prices
are decreasing in n and increasing in time t. Further,
prices are (slightly) increasing in k. The impact of k
is higher if n is small and t is large.
Note, if n and t are small, optimal prices are cho-
sen such that the probability to sell is basically 0, cf.
Figure 2. This way, it is ensured that items are not
sold too early. Due to increasing demand (cf. Figure
1) items can be sold later at higher prices.
Table 3: Optimal prices $a^*_t(n,3)$, Example 3.1.
n\t      0     10     20     30     40     45
 1     200    220    280    370    410    420
 2     200    220    280    330    370    380
 5     200    220    250    280    280    290
10     200    200    220    230    250    290
15     180    180    190    200    250    290
20     160    160    170    190    250    290
Remark 3.1. (Properties of expected profits)
(i) The expected profits are increasing in the
inventory level.
(ii) If there is no discounting then the expected
profits are increasing in time-to-go.
(iii) The expected profits are increasing in the
number of waiting customers, especially if
time-to-go is small and the inventory is large.
Remark 3.2. (Properties of feedback prices)
(i) The optimal prices are decreasing in the
inventory level.
(ii) If demand is strongly increasing in time, then
the optimal prices are increasing in time.
(iii) The optimal prices are increasing in the
number of waiting customers, especially if
time-to-go is small and the inventory is small.
Figure 3 illustrates evaluated average prices $E(a^*_t)$
(orange curve), which are, on average, increasing over
time. Compared to $E(a^*_t)$, the average reservation
prices $\bar{R}_t$ (blue curve) are smaller in the beginning and
higher at the end of the sales period. Hence, sales are
likely to occur late. The green curve depicts the predetermined
reference prices $a^{(ref)}_t$, which are overall
consistent with the evaluated offer prices $E(a^*_t)$.
Figure 3: Average evaluated price paths $E(a^*_t)$ (orange curve), average reservation prices $\bar{R}_t$ (blue curve), and reference prices $a^{(ref)}_t$ (green curve); Example 3.1 (horizontal axis: $t$).
Remark 3.3. In order to make the model fully consistent,
the resulting average prices obtained (cf. $E(a^*_t)$)
can be used to define the expected reference price
paths $a^{(ref)}_t$, cf. (3). This way, adaptive reconfigurations
and iterated model solutions can be used to
obtain converging equilibrium price paths where the
evaluated optimized solution ($E(a^*_t)$) coincides with
the underlying expectations of customers ($a^{(ref)}_t$).
3.4 Strategic vs. Myopic Customers
In this section, we study the impact of different setups
of strategic customer behaviors, cf. (4) and (6).
Example 3.2. We assume the setting of Example 3.1.
We consider different combinations of (time-consistent)
parameters $\eta_t = \eta$ (return probability) and $\gamma_t = \gamma$
(share of forward-looking customers) characterizing
the customers’ strategic behavior.
Table 4: Expected total profits $E(G_0)$ for different strategic customer behaviors; Example 3.2.
Setting     η      γ     E(G_0)
I           0      0     3 455
II          0      1     3 390
III         1      0     6 241
IV          1      1     4 940
V           0.5    0.5   3 924
Table 4 summarizes the expected profits for five
different customer behaviors, cf. Example 3.2. As
a reference, setting I represents the classical myopic
customer behavior without any kind of strategic ef-
fects. In setting II, all consumers are forward-looking,
cf. (4), but do not recur. In setting III, all consumers
recur, cf. (6), but are not forward-looking. In setting
IV, all consumers are forward-looking and steadily re-
cur. Setting V corresponds to the mixed setup of Ex-
ample 3.1. Comparing the expected profits in Table 4,
we observe that the strategic behavior has a significant
impact on a seller’s sales results.
Figure 4: Expected evolution of optimal prices for different strategic behaviors (settings: I blue, II orange, III green, IV red); Example 3.2 (horizontal axis: $t$; vertical axis: $a_t$).
Figure 4 illustrates the expected price paths associated
with the first four settings I - IV, cf. Table 4.
While the increasing shape of the four paths is overall
similar, the level of prices differs significantly. The
highest (lowest) prices correspond to setting III (setting
II). The moderate setting V, cf. Example 3.1 and
Figure 3, corresponds to a mixture of settings I and IV,
i.e., the blue and the red curve.
For setting I - IV, Figure 5 depicts the associated
inventory levels over time. All four curves are of con-
cave shape. Most of the sales are realized at the end
of the time horizon, which is due to the increasing de-
mand (i.e., reservation prices). The low return proba-
bility of setting I and II (blue and orange curve) leads
to sales that occur comparatively early. Instead, in
setting III and IV (green and red curve) with recurring
customers sales can be realized later and the number
of unsold items can be reduced.
Figure 5: Expected evolution of the inventory level for different strategic behaviors (settings: I blue, II orange, III green, IV red); Example 3.2 (horizontal axis: $t$; vertical axis: $X_t$).
Further numerical experiments led to similar re-
sults. Finally, we summarize our observations in the
following remark.
Remark 3.4. (Impact of strategic behavior)
(i) Compared to myopic settings a higher share (η)
of patient recurring customers leads to higher prof-
its. The (predictable) higher number of interested cus-
tomers results in higher prices and delayed sales.
(ii) A higher share (γ) of forward-looking customers
leads to lower profits. Customers strategically
time their purchase in order to increase their consumer
surplus. Further, due to suspended decision-making,
sales that would have been realized in myopic
settings may even get lost. Hence, a higher
number of anticipating customers forces the seller to
lower prices which, in turn, leads to earlier sales.
(iii) Finally, the two effects, i.e., the quantities η
and γ, are counteractive. When comparing both effects,
we observe that the impact of anticipating customers
is overcompensated by that of recurring
customers.
4 PRICING STRATEGIES WHEN
WAITING CUSTOMERS ARE
NOT OBSERVABLE
In real-life applications, the number of returning cus-
tomers is typically not exactly known. Further, it
might not always be possible to distinguish between
new and recurring customers. In this section, we show
how to adjust the model presented in Section 3 by
using probability distributions of waiting customers.
This makes it possible to still compute effective strate-
gies although less information is available.
4.1 Return Probabilities
While the number of recurring customers is often not
observable, the average return probability can be eas-
ily estimated. Based on average return probabilities it
is possible to derive a probability distribution for the
number of recurring customers.
The key idea is to exploit the model with full in-
formation, cf. Section 3, and to use probabilities for
waiting customers (cf. Hidden Markov Model).
The probability that $k$ customers return in period
$(t,t+1)$ is denoted by $\pi^{(k)}_t := P_t(K_t = k)$. We can
estimate $\pi^{(k)}_t$ based on the observable number of interested
customers $v$, i.e., the sum of old and new customers,
cf. $v := k + j$, see Section 3.1. Assume that in a
period $(t-1,t)$, we observed $v$ interested customers
and $i$ buyers. Hence, we have, cf. (11), $k = 0, 1, \dots, M$,
$t = 0, 1, \dots, T$, $v = 0, 1, \dots, J+M$, $i = 0, 1, \dots, v$,
$$\pi^{(k)}_t := P(K_t = k \mid v,i) = \binom{v-i}{k} \cdot \eta_t^k \cdot (1 - \eta_t)^{v-i-k} \tag{15}$$
For all $k > v - i$, we obtain $\pi^{(k)}_t := 0$. Note, as (15)
is based on the observable number of interested (v)
and buying customers (i), we do not need to be able
to distinguish between new and old customers.
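The belief (15) is again a binomial distribution over the non-buyers of the previous period. The following sketch returns it as a list of length $M + 1$; the function name `waiting_belief` is ours, and lumping any probability mass beyond $M$ onto the last entry is our own assumption, since the text only requires $M$ to be chosen large enough.

```python
from math import comb

def waiting_belief(v, i, eta, M):
    """Distribution pi_t^(k), k = 0, ..., M, of returning customers, cf. (15).

    v   : observed number of interested customers in the previous period
    i   : observed number of buyers in the previous period
    eta : return probability
    M   : maximum number of waiting customers tracked in the state space
    """
    rest = v - i                                   # customers who did not buy
    pi = [0.0] * (M + 1)                           # pi[k] = 0 for k > v - i
    for k in range(min(rest, M) + 1):
        pi[k] = comb(rest, k) * eta**k * (1.0 - eta) ** (rest - k)
    if rest > M:                                   # assumption: truncate the tail onto M
        pi[M] += 1.0 - sum(pi)
    return pi
```

Only the totals `v` and `i` enter the computation, which reflects the point made above that new and old customers need not be told apart.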
4.2 Computation of Prices
Next, we compute optimized strategies. We use given
(state) probabilities $\pi^{(k)}_t$, cf. (15), and the value function
$V_t(n,k)$, cf. (13), of the scenario with full information,
see Schlosser, Richly (2018) for a similar
HMM approach. We define the following heuristic
pricing strategy denoted by $\tilde{a}_t(n)$ for the scenario with
unobservable recurring customers, $n = 0, 1, \dots, N$,
$k = 0, 1, \dots, M$, $t = 0, 1, \dots, T$,
$$\tilde{a}_t(n) = \arg\max_{a \in A} \left\{ \sum_{k=0,\dots,M} \pi^{(k)}_t \cdot \sum_{\substack{j=0,1,\dots,J \\ i=0,1,\dots,j+k \\ m=0,1,\dots,j+k-i}} p^{(new)}_t(j) \cdot p^{(demand)}_t(i \mid a,k,j) \cdot p^{(wait)}_t(m \mid k,j,i) \cdot \left((a - c) \cdot \min(i,n) + \delta \cdot V_{t+1}(\max(n-i,0), m)\right) \right\} \tag{16}$$
Algorithm 4.1. Use (13), (15), and (16) in the following
order to compute $\tilde{a}_t(X_t)$, $t = 0, 1, \dots, T-1$:
(i) Compute the values $V_t(n,k)$ for all $n = 0, 1, \dots, N$,
$k = 0, 1, \dots, M$, and $t = 0, \dots, T$ via (13).
(ii) In $t = 0$ let $\pi^{(k)}_0 := 1_{\{k=0\}}$. Compute the price
$\tilde{a}_0(N)$ using (16) for the initial inventory $X_0 := N$.
(iii) For all $t = 1, \dots, T-1$ observe the number $v$ of
interested customers in period $(t-1,t)$ and the number
$i$ of buying customers. Given $v$ and $i$ compute $\pi^{(k)}_t$,
$k = 0, 1, \dots, M$, via (15). Let $X_t = X_{t-1} - i$. Use $\pi^{(k)}_t$
to compute the price $\tilde{a}_t(X_t)$ for the current inventory
level $X_t$, cf. (16).
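The pricing step of Algorithm 4.1 amounts to weighting, for each candidate price, the full-information expected profits by the current belief and taking the best price. The sketch below assumes these expected profits have already been computed for the current time and inventory level and stored in a matrix `q_values` with one row per price and one column per $k$; this layout, the function name, and the tie rule (smallest price, carried over from the full-information case) are our assumptions.

```python
import numpy as np

def heuristic_price(prices, belief, q_values):
    """Belief-weighted price choice in the spirit of (16).

    prices   : admissible prices in ascending order
    belief   : belief[k] = pi_t^(k), k = 0, ..., M
    q_values : q_values[a_idx, k] = expected profit (inner sum of (16))
               for price prices[a_idx] and k waiting customers
    """
    scores = np.asarray(q_values, dtype=float) @ np.asarray(belief, dtype=float)
    # np.argmax returns the first maximizer, i.e. the smallest price on ties
    return prices[int(np.argmax(scores))]
```

With a degenerate belief that puts all mass on one $k$, the function selects the best price for that column, so the heuristic coincides with the full-information choice whenever the number of waiting customers is known.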
4.3 Numerical Examples
In the following example, we demonstrate the appli-
cability and the quality of our Hidden Markov ap-
proach, cf. Algorithm 4.1.
Example 4.1. We assume the setting of Example 3.1.
We assume that recurring customers cannot be observed.
We consider $\eta_t = 0.5$ and $\gamma_t = 0.5$.
Figure 6: Average evaluated price paths $E(\tilde{a}_t)$ (blue curve), cf. Algorithm 4.1, compared to optimal price paths $E(a^*_t)$ (orange curve) of the model with full information, cf. Fig. 3, and reference prices $a^{(ref)}_t$ (green curve); Example 4.1 (horizontal axis: $t$).
Figure 6 illustrates average evaluated price curves
of our heuristic strategy $\tilde{a}_t$ compared to the optimal
strategy $a^*_t$, which, in contrast to the heuristic,
takes advantage of being able to observe waiting
customers. We observe that both curves are almost
identical which indicates that realized prices of both
strategies are similar.
Figure 7: Average accumulated profits $E(\bar G_t \mid \tilde a_t)$ (blue curve) of strategy $\tilde a_t$, cf. Algorithm 4.1, compared to average accumulated profits $E(\bar G_t \mid a_t^*)$ (orange curve) of the optimal policy $a_t^*$ of the model with full information, cf. (14); Example 4.1.
Figure 7 shows the corresponding evolutions of accumulated profits up to time $t$ (denoted by $\bar G_t$). The curves verify that the performance of the heuristic strategy is close to optimal. For other settings of the customer behavior, we obtain similar results.
5 CONCLUSION
In this paper, we proposed a stochastic dynamic fi-
nite horizon framework for sellers to determine price
adjustments in the presence of strategic customers.
Compared to classical myopic setups, we consider
customers that (i) recur and (ii) compare the current
consumer surplus against an expected future surplus,
which is based on anticipated future prices.
Given an initial inventory level, our pricing strategy maximizes expected profits by accounting for time-dependent demand and the number of returning customers, which constitute a (predictable) additional demand potential.
The strategic behavior of the customers has a significant impact on prices and expected profits. We find that recurring customers lead to higher average prices, delayed sales, and, most importantly, higher profits. The presence of forward-looking customers has opposing effects. However, it turns out that the latter impact is overcompensated by the effect of recurring customers.
Using a Hidden Markov model (HMM), we also considered the case in which returning customers cannot be observed by the seller. By comparing solutions of the extended model and the previous model, which exploits full information, we verified that the performance of the HMM approach is close to optimal.
Our framework is characterized by the average
evolution of customers’ reservation prices and the
predetermined reference prices. In future work, we
will study the case in which the reference prices are
updated by the evaluated average offer prices of the
model. To this end, the current model can be adap-
tively resolved until reference prices and evaluated
average prices are fully consistent. Alternatively, it
might also be possible to endogenize the impact of a
seller’s price adjustments on current reference prices.
REFERENCES
Chen, M., Z.-L. Chen. 2015. Recent Developments in Dy-
namic Pricing Research: Multiple Products, Compe-
tition, and Limited Demand Information. Production
and Operations Management 24 (5), 704–731.
Chenavaz, R., C. Paraschiv. 2018. Dynamic Pricing for In-
ventories with Reference Price Effects. Economics E-
Journal 12 (64), 1–16.
Coase, R. H. 1972. Durability and Monopoly. Journal of
Law and Economics 15 (1), 143–149.
Goensch, J., R. Klein, M. Neugebauer, C. Steinhardt. 2013.
Dynamic Pricing with Strategic Customers. Journal of
Business Economics 83 (5), 505–549.
Levin, Y., J. McGill, M. Nediak. 2009. Dynamic Pricing in
the Presence of Strategic Consumers and Oligopolistic
Competition. Management Science 55, 32–46.
Liu, Q., D. Zhang. 2013. Dynamic Pricing Competition
with Strategic Customers under Vertical Product Dif-
ferentiation. Management Science 59 (1), 84–101.
Phillips, R. L. 2005. Pricing and Revenue Optimization.
Stanford University Press.
Popescu, I., Y. Wu. 2007. Dynamic Pricing Strategies with
Reference Effects. Operations Research 55 (3), 413–
429.
Schlosser, R., K. Richly. 2018. Dynamic Pricing Strategies
in a Finite Horizon Duopoly with Partial Information.
7th International Conference on Operations Research
and Enterprise Systems (ICORES 2018), 21–30.
Schlosser, R., M. Boissier. 2018. Dynamic Pricing Compe-
tition in E-Commerce: A Data-Driven Approach. 24th
ACM SIGKDD International Conference on Knowl-
edge Discovery and Data Mining 2018 (KDD 2018),
705–714.
Su, X. 2007. Intertemporal Pricing with Strategic Customer
Behavior. Management Science 53 (5), 726–741.
Talluri, K. T., G. van Ryzin. 2004. The Theory and Practice
of Revenue Management. Kluwer Academic Publishers.
Wei, M. M., F. Zhang. 2017. Recent Research Develop-
ments of Strategic Consumer Behavior in Operations
Management. Computers and Operations Research
93, 166–176.
Wu, S., Q. Liu, R. Q. Zhang. 2015. The Reference Effects
on a Retailer's Dynamic Pricing and Inventory Strategies
with Strategic Consumers. Operations Research
63 (6), 1320–1335.
Wu, L.-L., D. Wu. 2015. Dynamic Pricing and Risk Analyt-
ics under Competition and Stochastic Reference Price
Effects. IEEE Transactions on Industrial Informatics
12 (3), 1282–1293.
Yeoman, I., U. McMahon-Beattie. 2011. Revenue Man-
agement: A Practical Pricing Perspective. Palgrave
Macmillan.
APPENDIX
Table 5: List of variables and parameters.
Symbol                Description
$T$                   time horizon / number of periods
$t$                   time / period
$N$                   initial inventory level
$X_t$                 random number of items to sell in t
$K_t$                 random number of waiting customers
$G_t$                 random future profits from t on
$c$                   shipping costs
$\delta$              discount factor
$R_t$                 individual reservation price in t
$\bar R_t$            average reservation price in t
$a_t^{(ref)}$         reference price in t
$A$                   set of admissible prices
$n$                   current inventory level
$j$                   number of new customers
$i$                   number of buying customers
$k$                   number of waiting customers
$V_t(n,k)$            value function
$a$                   offer price
$CS_t$                individual consumer surplus in t
$p_t^{(buy)}(a)$      purchase probability for price a
$p_t^{(new)}(j)$      probability for j new customers
$p_t^{(wait)}(k)$     probability for k waiting customers
$p_t^{(demand)}(i)$   probability for i sales in period t
$a_t^*(n,k)$          optimal prices (full information)
$\pi_t^{(k)}$         beliefs for k waiting customers
$\tilde a_t(n)$       heuristic prices (HMM model)
$\eta_t$              share of waiting customers
$\gamma_t$            share of anticipating customers
$\bar G_t$            random accumulated profits up to t