Discrete Optimisation
A dynamic programming framework for optimal delivery time slot pricing

https://doi.org/10.1016/j.ejor.2020.11.010

Highlights

  • Dynamic programming framework for revenue management in attended home delivery.

  • Characterisation of the fixed point in an infinite time horizon setting.

  • The finite time horizon value function is concave extensible and submodular.

  • Non-linear stochastic dual dynamic programming can be used to derive pricing policy.

  • High-dimensional numerical example based on real-world case study data.

Abstract

We study the dynamic programming approach to revenue management in the context of attended home delivery. We draw on results from dynamic programming theory for Markov decision problems to show that the underlying Bellman operator has a unique fixed point. We then provide a closed-form expression for the resulting fixed point and show that it admits a natural interpretation. Moreover, we show that – under certain technical assumptions – the value function, which has a discrete domain and a continuous codomain, admits a continuous extension, which is a finite-valued, concave function of its state variables, at every time step. Furthermore, we derive results on the monotonicity of prices with respect to the number of orders placed in our setting. These results open the road for achieving scalable implementations of the proposed formulation, as they allow informed choices of basis functions in an approximate dynamic programming context. We illustrate our findings on a low-dimensional and an industry-sized numerical example using real-world data, for which we derive an approximately optimal pricing policy based on our theoretical results.

Introduction

The expenditure of US households on online grocery shopping could reach $100 billion in 2022 according to the Food Marketing Institute (2018). Although growth forecasts vary and more conservative estimates lie, for example, at $30 billion for the year 2021 (Pitchbook, 2017), the overall trend is clear: The online grocery sector is likely to grow if some of its main challenges can be overcome.

One of these challenges is managing logistics, which is one of the main cost drivers. In particular, one can seek to exploit the flexibility of customers by offering delivery options at different prices to create delivery schedules that can be executed in a cost-efficient manner. To achieve this, recent proposals include giving customers the choice between narrow delivery time windows at high prices and wide windows at low prices (Campbell & Savelsbergh, 2006) or charging customers different prices based on the area and their preferred delivery time (Asdemir, Jacob, & Krishnan, 2009; Yang & Strauss, 2017; Yang, Strauss, Currie, & Eglese, 2016).

In this paper, we focus on the latter. We refer to the problem of finding the profit-maximising delivery slot prices as the revenue management problem in attended home delivery, where “attended” refers to the requirement that customers need to be present upon delivery of the typically perishable goods, which is in contrast to, for example, standard mail delivery. Note that attended home delivery problems are more complex than standard delivery services, since goods need to be delivered in time windows that are pre-agreed with the customers.

We adopt a dynamic programming (DP) model of an expected profit-to-go function, the value function of the DP, given the current state of orders and time left for customers to book a delivery slot. This DP was initially devised in the fashion industry (Gallego & van Ryzin, 1994), but subsequently adopted and refined by the transportation sector and the attended home delivery industry (Yang et al., 2016). This formulation could be thought of as an instance of a network revenue management problem with customer choice, which finds various applications, e.g. in transportation, hospitality and appointment scheduling problems (see Meissner & Strauss, 2012; Sauré, Patrick, Tyldesley, & Puterman, 2012; Zhang & Adelman, 2009).

To find the (approximately) optimal delivery slot prices, we need to compute the value function (at least approximately) for all states and times. The main challenge is that the state space of the DP grows exponentially with the set of delivery time slots, i.e. it suffers from the “curse of dimensionality”. This means that for industry-sized problems, due to the prohibitively large number of states, the value function cannot be computed exactly, even off-line. Our ultimate objective is to compute improved value function approximations. Therefore, we study in this paper how the value function of the exact DP behaves mathematically in time and across state variables.
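To give a feel for this growth, the following minimal sketch counts order-count vectors under a simplifying assumption of our own (not the paper's feasibility model): each slot independently accepts between 0 and a fixed number of orders. The function name `num_states` and the capacity of 10 are hypothetical; the slot counts 2 and 17 are those of the numerical examples.

```python
def num_states(n_slots: int, capacity: int) -> int:
    """Number of vectors x with 0 <= x_s <= capacity for every slot s.

    Assumption: slots have independent, identical capacities. A real
    delivery problem has a more complicated feasible set.
    """
    return (capacity + 1) ** n_slots


if __name__ == "__main__":
    # Hypothetical capacity of 10 orders per slot.
    for n in (2, 17):
        print(f"{n} slots -> {num_states(n, capacity=10)} states")
```

Under these assumptions the count is (capacity + 1) raised to the number of slots, so every additional slot multiplies the size of the table that an exact DP would have to store at each time step.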

We show that the underlying DP operator has a unique fixed point. We then provide a closed-form expression of the resulting fixed point and derive a natural interpretation. Furthermore, we show that – under certain technical assumptions – for all time steps in the dynamic program, the value function admits a continuous extension, which is a finite-valued, concave function of its state variables.

Ultimately, our results open the road for achieving scalable implementations of the proposed formulation, as it becomes possible to make informed choices of basis functions in an approximate dynamic programming context. We illustrate our findings on a low-dimensional and an industry-sized numerical example using real-world data from a case study by Yang and Strauss (2017), for which we derive an approximate value function based on our theoretical results and a stochastic dual DP algorithm presented in Zhang and Sun (2019).

Improved value function approximations could finally be used for calculating approximately optimal delivery slot prices. For example, for continuous decision variables and under the multinomial logit customer choice model, Dong, Kouvelis, and Tian (2009) show that a unique set of optimal delivery slot prices exists, which can be found using Newton root search algorithms or using the Lambert W function as shown in B.2 if estimates of the value function are known for all states and times. Our mathematical results have immediate implications on the monotonicity of (approximately) optimal prices with respect to changes in the number of placed orders, which we also characterise in this paper. This analysis complements the research on the price-inventory relationship under multinomial logit customer choice (see e.g. Akçay, Natarajan, & Xu, 2010; Chen & Chen, 2015; Suh & Aydin, 2011).
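As an illustration of the Lambert W route, the sketch below prices a generic multinomial logit model; it is not a reproduction of the derivation in B.2. We assume a choice probability proportional to exp(a_k - b d_k) with a common price sensitivity b > 0 and a no-purchase utility normalised to zero, and we let c_k stand in for whatever marginal opportunity cost the value function estimates imply. Under these assumptions the first-order conditions give a common markup m over cost with b m = 1 + W(A / e), where A is the sum of exp(a_k - b c_k). All function and parameter names are hypothetical, and W is computed by a Newton iteration to keep the example dependency-free.

```python
import math


def lambert_w0(z: float, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Principal branch W(z) for z >= 0: the w >= 0 with w * exp(w) = z."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # lies at or above the root, so Newton decreases towards it
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


def mnl_optimal_prices(a, c, b):
    """Stationary prices of sum_k (d_k - c_k) * p_k(d) under the assumed MNL model.

    a[k]: attractiveness of option k, c[k]: marginal cost, b: price sensitivity.
    """
    big_a = sum(math.exp(ak - b * ck) for ak, ck in zip(a, c))
    markup = (1.0 + lambert_w0(big_a / math.e)) / b
    return [ck + markup for ck in c]


def choice_probs(a, d, b):
    """MNL purchase probabilities at prices d (no-purchase utility is 0)."""
    weights = [math.exp(ak - b * dk) for ak, dk in zip(a, d)]
    denom = 1.0 + sum(weights)
    return [w / denom for w in weights]


def expected_profit(a, c, d, b):
    return sum((dk - ck) * pk for dk, ck, pk in zip(d, c, choice_probs(a, d, b)))


if __name__ == "__main__":
    a, c, b = [1.0, 0.5], [2.0, 1.0], 0.8  # hypothetical parameters
    d = mnl_optimal_prices(a, c, b)
    print(d, expected_profit(a, c, d, b))
```

A Newton root search on the first-order conditions would reach the same prices; the closed form simply avoids iterating over the price vector.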

Our paper is structured as follows: In the remainder of Section 1, we introduce some notation. In Section 2, we define the revenue management problem in attended home delivery and its DP formulation. In Section 3, we present our main results, Theorem 1, which analytically characterises the fixed point of the DP, and Theorem 2, which shows that there exists a continuous extension of the value function that is a finite-valued, concave function in its state variables at every time step. Section 4 contains reformulations of the DP into mathematically more convenient forms and develops supporting results leading to the proofs of the main results. We also develop a result on the monotonicity of prices with respect to the number of placed orders. Section 5 presents a numerical illustration of our theoretical results on a low-dimensional example and on an industry-sized problem, while Section 6 concludes the paper and suggests directions for future research. The Appendix contains the proofs of results not included in the main body of the paper.

Notation: Let $\mathbf{1}$ denote a vector with all elements equal to 1. Given some $s$, let $\mathbf{1}_s$ be a vector of all zeros apart from the $s$-th entry, which equals 1. Furthermore, we define the convention that $\mathbf{1}_0$ is a vector of zeros. Let $\mathbb{R}_{+(+)}$ be the non-negative (positive) real numbers, let $\mathbb{Z}$ be the integers and let $\dim(\cdot)$ denote the dimension of its argument. Let $\mathrm{conv}(\cdot)$ denote the convex hull of its argument. We say that a function exhibits a monotonic behaviour if the monotonicity property holds element-wise, e.g. a function $f: \mathbb{R}^N \to \mathbb{R}$ is monotonically increasing over its domain if $f(y) > f(x)$ for all $(x, y)$ such that at least one element of $y$ is greater than the corresponding element of $x$.

Section snippets

Revenue management problem formulation

In this section, we derive a discrete-state formulation of the revenue management problem in attended home delivery.

Infinite time horizon result

We first consider the infinite horizon case, i.e. going backwards infinitely many time steps. In this scenario, we can find a fixed point of the DP described by (4) based on the following assumptions.

Assumption 1

The marginal cost of an additional, feasible order is always smaller than the maximum marginal profit, i.e. $C(x + \mathbf{1}_s) - C(x) \le \bar{d} + r$, for all $(x, s) \in X \times F(x)$.

Assumption 2

We assume that the transition probability density function has the following properties. For any $s \in S$:

  • (a)

    $p_s(d) > 0$, if $d_s \in [\underline{d}, \bar{d}]$.

  • (b)

    $p_s(d)\, d_s = 0$, if $d_s = \infty$.

Proof of infinite time horizon theorem

To prove Theorem 1, we first note that the DP in (4) can be reformulated as a so-called stochastic shortest path problem (see Lebedev, Goulart, and Margellos, 2019, Section 4.1.1). The Bellman operator mapping of this class of problems is known to be contractive (see Bertsekas (2012, Chapters 1 and 3) and Lebedev et al. (2019, Lemma 5)). Therefore, the DP in (4) admits a unique fixed point. We start with the necessary and sufficient condition for $T$ to have a fixed point $V^*$, which is $V^* = T V^*$.
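The fixed-point condition can be made concrete on a toy model of our own construction; it is not the DP in (4), and every number below is hypothetical. We take a single slot with capacity K, a finite menu of delivery charges, a logit purchase probability, a linear delivery cost (which satisfies the marginal cost bound of Assumption 1 for these numbers) and the option of not offering the slot. Repeatedly applying the toy Bellman operator to the terminal values -C(x) then converges to a vector that the operator leaves unchanged.

```python
import math

# Hypothetical toy data, chosen only for illustration.
K = 3                                  # capacity of the single delivery slot
R = 1.0                                # revenue per order before the delivery charge
PRICES = [0.0, 1.0, 2.0, 3.0, 4.0]     # admissible delivery charges
D_HI = max(PRICES)                     # highest admissible charge
LAM = 0.8                              # probability of a customer arrival per step


def cost(x: int) -> float:
    """Delivery cost C(x) of serving x orders (linear in this sketch)."""
    return 1.5 * x


def buy_prob(d: float) -> float:
    """Logit purchase probability at delivery charge d."""
    u = math.exp(2.0 - 0.5 * d)
    return u / (1.0 + u)


def bellman(v):
    """Apply the toy operator T once to the list v[0..K]."""
    new_v = list(v)  # keeping v[x] corresponds to not offering the slot
    for x in range(K):
        for d in PRICES:
            q = LAM * buy_prob(d)
            new_v[x] = max(new_v[x], q * (R + d + v[x + 1]) + (1.0 - q) * v[x])
    return new_v


def fixed_point(tol: float = 1e-12, max_iter: int = 10_000):
    """Iterate T from the terminal values V(x) = -C(x) until it stalls."""
    v = [-cost(x) for x in range(K + 1)]
    for _ in range(max_iter):
        nxt = bellman(v)
        if max(abs(p - q) for p, q in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    return v


v_star = fixed_point()

if __name__ == "__main__":
    print(v_star)
```

In this toy model the limit equals (K - x)(R + D_HI) - C(K): with unlimited time it pays to fill the remaining capacity at the highest charge. That expression is a property of the sketch and should not be read as the statement of Theorem 1.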

Numerical examples

In the following two sections, we illustrate the validity and practical utility of our results. We first show a low-dimensional numerical example of a 2-slot problem and then how our results allow the application of a non-linear stochastic dual DP algorithm to a 17-slot problem.
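Concavity matters for dual DP-type methods because a concave function can be bounded from above by the pointwise minimum of affine functions (cuts). The sketch below shows only this generic representation on a hypothetical one-dimensional concave function; it is not the algorithm of Zhang and Sun (2019), and the names `make_cut` and `cut_approximation` are our own.

```python
def make_cut(f, grad, x0):
    """Tangent of a concave, differentiable f at x0, returned as (slope, intercept)."""
    slope = grad(x0)
    return slope, f(x0) - slope * x0


def cut_approximation(cuts, x):
    """Pointwise minimum of the affine cuts, evaluated at x."""
    return min(slope * x + intercept for slope, intercept in cuts)


# Hypothetical concave stand-in for a value function of one state variable.
def f(x):
    return 9.0 - (x - 3.0) ** 2


def grad(x):
    return -2.0 * (x - 3.0)


CUT_POINTS = (0.0, 2.0, 4.0, 6.0)
cuts = [make_cut(f, grad, x0) for x0 in CUT_POINTS]

if __name__ == "__main__":
    for x in (0.0, 1.0, 3.0, 5.0):
        print(x, f(x), cut_approximation(cuts, x))
```

Each tangent of a concave function lies on or above it, so the minimum over cuts is an upper bound that is tight at the points where cuts were generated; adding cuts can only tighten the bound.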

Summary of contributions

We have studied the mathematical properties of the value function of a dynamic program modelling the revenue management problem in attended home delivery exactly. We have shown that the recursive dynamic programming mapping has a unique, finite-valued fixed point and concavity-preserving properties. Hence, we have derived our main result stating that – under certain assumptions – for all time steps in the dynamic program, the value function admits a continuous extension, which is a

Acknowledgements

This work was supported by SIA Food Union Management. The authors are grateful for this financial support.

References (34)

  • A.M. Campbell et al. (2006). Incentive schemes for attended home delivery services. Transportation Science.
  • M. Chen et al. (2015). Recent developments in dynamic pricing research: Multiple products, competition, and limited demand information. Production and Operations Management.
  • L. Dong et al. (2009). Dynamic pricing and inventory control of substitute products. Manufacturing & Service Operations Management.
  • D.P. de Farias et al. (2003). The linear programming approach to approximate dynamic programming. Operations Research.
  • Food Marketing Institute (2018). Digital shopper. https://www.fmi.org/digital-shopper/. Accessed 4 October...
  • G. Gallego et al. (1994). Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Science.
  • Hannah, L. A., & Dunson, D. B. (2011). Bayesian nonparametric multivariate convex regression....