Discrete Optimisation
A dynamic programming framework for optimal delivery time slot pricing
Introduction
The expenditure of US households on online grocery shopping could reach $100 billion in 2022 according to the Food Marketing Institute (2018). Although growth forecasts vary and more conservative estimates lie, for example, at $30 billion for the year 2021 (Pitchbook, 2017), the overall trend is clear: The online grocery sector is likely to grow if some of its main challenges can be overcome.
One of these challenges is managing logistics, which is one of the main cost drivers. In particular, one can seek to exploit the flexibility of customers by offering delivery options at different prices to create delivery schedules that can be executed in a cost-efficient manner. To achieve this, recent proposals include giving customers the choice between narrow delivery time windows at high prices and wide windows at low prices (Campbell & Savelsbergh, 2006), or charging customers different prices based on their area and preferred delivery time (Asdemir, Jacob, & Krishnan, 2009; Yang & Strauss, 2017; Yang, Strauss, Currie, & Eglese, 2016).
In this paper, we focus on the latter. We refer to the problem of finding the profit-maximising delivery slot prices as the revenue management problem in attended home delivery, where “attended” refers to the requirement that customers need to be present upon delivery of the typically perishable goods, which is in contrast to, for example, standard mail delivery. Note that attended home delivery problems are more complex than standard delivery services, since goods need to be delivered in time windows that are pre-agreed with the customers.
We adopt a dynamic programming (DP) model of an expected profit-to-go function, the value function of the DP, given the current state of orders and time left for customers to book a delivery slot. This DP was initially devised for the fashion industry (Gallego & van Ryzin, 1994) and was subsequently adopted and refined in the transportation sector and the attended home delivery industry (Yang et al., 2016). This formulation can be thought of as an instance of a network revenue management problem with customer choice, which finds various applications, e.g. in transportation, hospitality and appointment scheduling problems (see Meissner & Strauss, 2012; Sauré, Patrick, Tyldesley, & Puterman, 2012; Zhang & Adelman, 2009).
To find the (approximately) optimal delivery slot prices, we need to compute the value function (at least approximately) for all states and times. The main challenge is that the state space of the DP grows exponentially with the set of delivery time slots, i.e. it suffers from the “curse of dimensionality”. This means that for industry-sized problems, due to the prohibitively large number of states, the value function cannot be computed exactly, even off-line. Our ultimate objective is to compute improved value function approximations. Therefore, we study in this paper how the value function of the exact DP behaves mathematically in time and across state variables.
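To make the recursion and the growth of the state space concrete, the following minimal Python sketch runs a backward recursion on a toy two-slot instance. It is an illustration under stated assumptions, not the model of this paper: the form of the recursion, the multinomial logit parameters (`beta`, `alpha`), the arrival probability `lam`, the basket revenue `r`, the per-slot capacity `cap`, the price grid and the terminal cost `delivery_cost` are all hypothetical.

```python
import itertools
import math

# All names and numbers below are hypothetical illustration values.
n_slots, cap, horizon = 2, 3, 20      # slots, orders per slot, booking periods
lam, r = 0.5, 10.0                    # arrival probability, basket revenue
beta, alpha = [1.0, 0.5], 0.3         # assumed logit attractiveness, price sensitivity
price_grid = [0.0, 2.0, 4.0, 6.0]     # assumed admissible slot prices


def delivery_cost(x):
    # Assumed terminal cost: linear in the number of accepted orders.
    return 4.0 * sum(x)


def choice_probs(prices, feasible):
    # Multinomial logit over feasible slots; the no-purchase weight is 1.
    w = [math.exp(beta[s] - alpha * prices[s]) if feasible[s] else 0.0
         for s in range(n_slots)]
    total = 1.0 + sum(w)
    return [ws / total for ws in w]


def bump(x, s):
    # State after one more order has been placed in slot s.
    return x[:s] + (x[s] + 1,) + x[s + 1:]


def expected_gain(x, prices, value):
    # Expected one-period gain of quoting `prices` in state x.
    feasible = [x[s] < cap for s in range(n_slots)]
    p = choice_probs(prices, feasible)
    return lam * sum(p[s] * (r + prices[s] + value[bump(x, s)] - value[x])
                     for s in range(n_slots) if feasible[s])


states = list(itertools.product(range(cap + 1), repeat=n_slots))
value = {x: -delivery_cost(x) for x in states}   # terminal condition

for _ in range(horizon):                          # step backwards in time
    value = {x: value[x] + max(expected_gain(x, prices, value)
                               for prices in itertools.product(price_grid,
                                                               repeat=n_slots))
             for x in states}

print(len(states), round(value[(0,) * n_slots], 2))
# 17 slots at the same toy capacity would already need this many states:
print((cap + 1) ** 17)
```

The exhaustive loop over `states` is exactly what becomes infeasible as the number of slots grows, which is why approximations of the value function are needed at industry scale.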
We show that the underlying DP operator has a unique fixed point. We then provide a closed-form expression of the resulting fixed point and derive a natural interpretation. Furthermore, we show that – under certain technical assumptions – for all time steps in the dynamic program, the value function admits a continuous extension, which is a finite-valued, concave function of its state variables.
Ultimately, our results open the road for achieving scalable implementations of the proposed formulation, as it becomes possible to make informed choices of basis functions in an approximate dynamic programming context. We illustrate our findings on a low-dimensional and an industry-sized numerical example using real-world data from a case study by Yang and Strauss (2017), for which we derive an approximate value function based on our theoretical results and a stochastic dual DP algorithm presented in Zhang and Sun (2019).
Improved value function approximations could finally be used for calculating approximately optimal delivery slot prices. For example, for continuous decision variables and under the multinomial logit customer choice model, Dong, Kouvelis, and Tian (2009) show that a unique set of optimal delivery slot prices exists, which can be found using Newton root search algorithms or using the Lambert function as shown in B.2 if estimates of the value function are known for all states and times. Our mathematical results have immediate implications for the monotonicity of (approximately) optimal prices with respect to changes in the number of placed orders, which we also characterise in this paper. This analysis complements the research on the price-inventory relationship under multinomial logit customer choice (see e.g. Akçay, Natarajan, & Xu, 2010; Chen & Chen, 2015; Suh & Aydin, 2011).
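As a hedged illustration of the Lambert function route, the sketch below prices the slots for a single state and time step. It follows one standard derivation for multinomial logit pricing and is not necessarily the derivation in B.2. The assumptions are: slot s is chosen with probability exp(beta_s - alpha d_s) / (1 + sum_k exp(beta_k - alpha d_k)), and an order in slot s earns d_s + g_s, where the hypothetical `gains` g_s stand for the basket revenue plus the estimated change in the value function. Under these assumptions all slots share the same markup d_s + g_s = (1 + W(A/e)) / alpha with A = sum_s exp(beta_s + alpha g_s), and the optimal expected profit is W(A/e) / alpha. The Lambert function is evaluated with a small Newton iteration so that the block needs only the standard library.

```python
import math


def lambert_w(z, tol=1e-12):
    # Principal branch of the Lambert W function for z >= 0,
    # i.e. the solution w >= 0 of w * exp(w) = z, via Newton's method.
    w = math.log1p(z)                     # starting guess, exact at z = 0
    for _ in range(100):
        step = (w * math.exp(w) - z) / (math.exp(w) * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


def optimal_prices(beta, gains, alpha):
    # Closed-form prices and expected profit under the assumed logit model.
    a = sum(math.exp(b + alpha * g) for b, g in zip(beta, gains))
    w = lambert_w(a / math.e)
    markup = (1.0 + w) / alpha            # common markup d_s + g_s
    return [markup - g for g in gains], w / alpha


def expected_profit(prices, beta, gains, alpha):
    # Direct evaluation of the objective, used here as a cross-check.
    weights = [math.exp(b - alpha * d) for b, d in zip(beta, prices)]
    total = 1.0 + sum(weights)
    return sum(wt / total * (d + g)
               for wt, d, g in zip(weights, prices, gains))


# Hypothetical inputs: two slots with assumed attractiveness and gains.
beta, gains, alpha = [1.0, 0.5], [6.0, 4.0], 0.3
prices, profit = optimal_prices(beta, gains, alpha)
print(prices, profit, expected_profit(prices, beta, gains, alpha))
```

The closed-form profit returned by `optimal_prices` should agree with the direct evaluation in `expected_profit`, which gives a simple sanity check on the formula.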
Our paper is structured as follows: In the remainder of Section 1, we introduce some notation. In Section 2, we define the revenue management problem in attended home delivery and its DP formulation. In Section 3, we present our main results, Theorem 1, which analytically characterises the fixed point of the DP, and Theorem 2, which shows that there exists a continuous extension of the value function that is a finite-valued, concave function in its state variables at every time step. Section 4 contains reformulations of the DP into mathematically more convenient forms and develops supporting results leading to the proofs of the main results. We also develop a result on the monotonicity of prices with respect to the number of placed orders. Section 5 presents a numerical illustration of our theoretical results on a low-dimensional example and on an industry-sized problem, while Section 6 concludes the paper and suggests directions for future research. The Appendix contains the proofs of results not included in the main body of the paper.
Notation: Let denote a vector with all elements equal to 1. Given some let be a vector of all zeros apart from the -th entry, which equals 1. Furthermore, we define the convention that is a vector of zeros. Let be the non-negative (positive) real numbers, let be the integers and let dim denote the dimension of its argument. Let conv denote the convex hull of its argument. We say that a function exhibits a monotonic behaviour if the monotonicity property holds element-wise, e.g. a function is monotonically increasing over its domain if for all such that at least one element of is greater than the corresponding element of .
Revenue management problem formulation
In this section, we derive a discrete-state formulation of the revenue management problem in attended home delivery.
Infinite time horizon result
We first consider the infinite horizon case, i.e. going backwards infinitely many time steps. In this scenario, we can find a fixed point of the DP described by (4) based on the following assumptions.
Assumption 1 The marginal cost of an additional, feasible order is always smaller than the maximum marginal profit, i.e. for all .
Assumption 2 We assume that the transition probability density function has the following properties. For any : if . if .
Proof of infinite time horizon theorem
To prove Theorem 1, we first note that the DP in (4) can be reformulated as a so-called stochastic shortest path problem (see Lebedev, Goulart, and Margellos, 2019, Section 4.1.1). The Bellman operator mapping of this class of problems is known to be contractive (see Bertsekas (2012, Chapters 1 and 3) and Lebedev et al. (2019, Lemma 5)). Therefore, the DP in (4) admits a unique fixed point. We start with the necessary and sufficient condition for to have a fixed point which is .
Numerical examples
In the following two sections, we illustrate the validity and practical utility of our results. We first show a low-dimensional numerical example of a 2-slot problem and then how our results allow the application of a non-linear stochastic dual DP algorithm to a 17-slot problem.
Summary of contributions
We have studied the mathematical properties of the value function of a dynamic program modelling the revenue management problem in attended home delivery exactly. We have shown that the recursive dynamic programming mapping has a unique, finite-valued fixed point and concavity-preserving properties. Hence, we have derived our main result stating that – under certain assumptions – for all time steps in the dynamic program, the value function admits a continuous extension, which is a
Acknowledgements
This work was supported by SIA Food Union Management. The authors are grateful for this financial support.
References (34)
- et al. Dynamic pricing of multiple home delivery options. European Journal of Operational Research (2009).
- Pitchbook (2017). Are meal-kit delivery companies a threat or an opportunity?...
- Approximate dynamic programming: solving the curses of dimensionality (Wiley series in probability and statistics) (2007).
- et al. An approximate dynamic programming approach to attended home delivery management. European Journal of Operational Research (2017).
- et al. Choice-based demand management and vehicle routing in e-fulfillment. Transportation Science (2016).
- et al. Stochastic dual dynamic integer programming. Mathematical Programming (2019).
- et al. Joint dynamic pricing of multiple perishable products under consumer choice. Management Science (2010).
- Dynamic Programming and Optimal Control, Vol. II (2012).
- et al. Optimization over Integers (2005).
- et al. Convex Optimization (2004).
- Incentive schemes for attended home delivery services. Transportation Science.
- Recent developments in dynamic pricing research: Multiple products, competition, and limited demand information. Production and Operations Management.
- Dynamic pricing and inventory control of substitute products. Manufacturing & Service Operations Management.
- The linear programming approach to approximate dynamic programming. Operations Research.
- Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Science.