A dynamic programming framework for optimal delivery time slot pricing

doi:10.1016/j.ejor.2020.11.010

European Journal of Operational Research

Volume 292, Issue 2, 16 July 2021, Pages 456-468

https://doi.org/10.1016/j.ejor.2020.11.010

Highlights

• Dynamic programming framework for revenue management in attended home delivery.
• Characterisation of the fixed point in an infinite time horizon setting.
• The finite time horizon value function is concave extensible and submodular.
• Non-linear stochastic dual dynamic programming can be used to derive pricing policy.
• High-dimensional numerical example based on real-world case study data.

Abstract

We study the dynamic programming approach to revenue management in the context of attended home delivery. We draw on results from dynamic programming theory for Markov decision problems to show that the underlying Bellman operator has a unique fixed point. We then provide a closed-form expression for the resulting fixed point and show that it admits a natural interpretation. Moreover, we show that – under certain technical assumptions – the value function, which has a discrete domain and a continuous codomain, admits a continuous extension, which is a finite-valued, concave function of its state variables at every time step. Furthermore, we derive results on the monotonicity of prices with respect to the number of orders placed in our setting. These results open the road to scalable implementations of the proposed formulation, as they allow informed choices of basis functions in an approximate dynamic programming context. We illustrate our findings on a low-dimensional and an industry-sized numerical example using real-world data, for which we derive an approximately optimal pricing policy based on our theoretical results.

Introduction

The expenditure of US households on online grocery shopping could reach $100 billion in 2022 according to the Food Marketing Institute (2018). Although growth forecasts vary and more conservative estimates lie, for example, at $30 billion for the year 2021 (Pitchbook, 2017), the overall trend is clear: The online grocery sector is likely to grow if some of its main challenges can be overcome.

One of these challenges is managing logistics, which is one of the main cost drivers. In particular, one can seek to exploit the flexibility of customers by offering delivery options at different prices to create delivery schedules that can be executed in a cost-efficient manner. To achieve this, recent proposals include giving customers the choice between narrow delivery time windows at high prices and wide windows at low prices (Campbell & Savelsbergh, 2006), or charging customers different prices based on the area and their preferred delivery time (Asdemir, Jacob, & Krishnan, 2009; Yang & Strauss, 2017; Yang, Strauss, Currie, & Eglese, 2016).

In this paper, we focus on the latter. We refer to the problem of finding the profit-maximising delivery slot prices as the revenue management problem in attended home delivery, where “attended” refers to the requirement that customers need to be present upon delivery of the typically perishable goods, which is in contrast to, for example, standard mail delivery. Note that attended home delivery problems are more complex than standard delivery services, since goods need to be delivered in time windows that are pre-agreed with the customers.

We adopt a dynamic programming (DP) model of an expected profit-to-go function, the value function of the DP, given the current state of orders and time left for customers to book a delivery slot. This DP was initially devised in the fashion industry (Gallego & van Ryzin, 1994) and subsequently adopted and refined by the transportation sector and the attended home delivery industry (Yang et al., 2016). This formulation can be thought of as an instance of a network revenue management problem with customer choice, which finds various applications, e.g. in transportation, hospitality and appointment scheduling problems (see Meissner & Strauss, 2012; Sauré, Patrick, Tyldesley, & Puterman, 2012; Zhang & Adelman, 2009).

To find the (approximately) optimal delivery slot prices, we need to compute the value function (at least approximately) for all states and times. The main challenge is that the state space of the DP grows exponentially with the set of delivery time slots, i.e. it suffers from the “curse of dimensionality”. This means that for industry-sized problems, due to the prohibitively large number of states, the value function cannot be computed exactly, even off-line. Our ultimate objective is to compute improved value function approximations. Therefore, we study in this paper how the value function of the exact DP behaves mathematically in time and across state variables.

We show that the underlying DP operator has a unique fixed point. We then provide a closed-form expression of the resulting fixed point and derive a natural interpretation. Furthermore, we show that – under certain technical assumptions – for all time steps in the dynamic program, the value function admits a continuous extension, which is a finite-valued, concave function of its state variables.

Ultimately, our results open the road for achieving scalable implementations of the proposed formulation, as it becomes possible to make informed choices of basis functions in an approximate dynamic programming context. We illustrate our findings on a low-dimensional and an industry-sized numerical example using real-world data from a case study by Yang and Strauss (2017), for which we derive an approximate value function based on our theoretical results and a stochastic dual DP algorithm presented in Zhang and Sun (2019).

Improved value function approximations could finally be used for calculating approximately optimal delivery slot prices. For example, for continuous decision variables and under the multinomial logit customer choice model, Dong, Kouvelis, and Tian (2009) show that a unique set of optimal delivery slot prices exists, which can be found using Newton root search algorithms or using the Lambert $W$ function as shown in B.2 if estimates of the value function are known for all states and times. Our mathematical results have immediate implications for the monotonicity of (approximately) optimal prices with respect to changes in the number of placed orders, which we also characterise in this paper. This analysis complements the research on the price-inventory relationship under multinomial logit customer choice (see e.g. Akçay, Natarajan, & Xu, 2010; Chen & Chen, 2015; Suh & Aydin, 2011).
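To make the Lambert $W$ route concrete, the following minimal sketch prices a static multinomial logit problem in closed form. It is an illustration under stated assumptions, not a reproduction of B.2: the utility form `a_s - b * d_s`, the per-slot opportunity costs `c_s` (standing in for whatever quantities the value function estimates would supply), and all function names are hypothetical. Under these assumptions the first-order conditions give a common markup $m = (1 + W(K/e))/b$ with $K = \sum_{s} \exp(a_s - b c_s)$, and the Lambert $W$ function is itself evaluated here by a Newton iteration.

```python
import math


def lambert_w0(z, tol=1e-12, max_iter=100):
    """Principal branch of Lambert W for z >= 0, computed by Newton's method."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting guess that lies at or above W(z) for z >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))  # Newton step for w * exp(w) = z
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def mnl_optimal_prices(a, c, b):
    """Optimal prices for an assumed MNL model with a no-purchase option.

    a: hypothetical utility intercepts, one per slot
    c: hypothetical opportunity costs, one per slot
    b: hypothetical price sensitivity (b > 0)
    Returns the price vector and the optimal expected profit per arrival.
    """
    k = sum(math.exp(a_s - b * c_s) for a_s, c_s in zip(a, c))
    w = lambert_w0(k / math.e)
    markup = (1.0 + w) / b  # the same markup applies to every slot
    prices = [c_s + markup for c_s in c]
    return prices, w / b


def expected_profit(prices, a, c, b):
    """Expected profit per arrival for given prices under the same assumed model."""
    weights = [math.exp(a_s - b * p_s) for a_s, p_s in zip(a, prices)]
    denom = 1.0 + sum(weights)  # the 1.0 is the no-purchase alternative
    return sum((p_s - c_s) * w_s / denom
               for p_s, c_s, w_s in zip(prices, c, weights))


if __name__ == "__main__":
    a, c, b = [1.0, 0.5], [2.0, 3.0], 0.8
    prices, profit = mnl_optimal_prices(a, c, b)
    print(prices, profit, expected_profit(prices, a, c, b))
```

Evaluating `expected_profit` at the returned prices should reproduce the closed-form value `w / b`, which gives a quick consistency check on any implementation of this kind.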

Our paper is structured as follows: In the remainder of Section 1, we introduce some notation. In Section 2, we define the revenue management problem in attended home delivery and its DP formulation. In Section 3, we present our main results, Theorem 1, which analytically characterises the fixed point of the DP, and Theorem 2, which shows that there exists a continuous extension of the value function that is a finite-valued, concave function in its state variables at every time step. Section 4 contains reformulations of the DP into mathematically more convenient forms and develops supporting results leading to the proofs of the main results. We also develop a result on the monotonicity of prices with respect to the number of placed orders. Section 5 presents a numerical illustration of our theoretical results on a low-dimensional example and on an industry-sized problem, while Section 6 concludes the paper and suggests directions for future research. The Appendix contains the proofs of results not included in the main body of the paper.

Notation: Let $1$ denote a vector with all elements equal to 1. Given some $s$, let $1_{s}$ be a vector of all zeros apart from the $s$-th entry, which equals 1. Furthermore, we adopt the convention that $1_{0}$ is a vector of zeros. Let $R_{+(+)}$ be the non-negative (positive) real numbers, let $Z$ be the integers and let $\dim(\cdot)$ denote the dimension of its argument. Let $\operatorname{conv}(\cdot)$ denote the convex hull of its argument. We say that a function exhibits a monotonic behaviour if the monotonicity property holds element-wise, e.g. a function $f : R^{N} \to R$ is monotonically increasing over its domain if $f(y) > f(x)$ for all $(x, y)$ such that at least one element of $y$ is greater than the corresponding element of $x$.

Section snippets

Revenue management problem formulation

In this section, we derive a discrete-state formulation of the revenue management problem in attended home delivery.

Infinite time horizon result

We first consider the infinite horizon case, i.e. going backwards infinitely many time steps. In this scenario, we can find a fixed point of the DP described by (4) based on the following assumptions.

Assumption 1

The marginal cost of an additional, feasible order never exceeds the maximum marginal profit, i.e. $C(x + 1_{s}) - C(x) \leq \bar{d} + r$ for all $(x, s) \in X \times F(x)$.

Assumption 2

We assume that the transition probability density function has the following properties. For any $s \in S$:

(a) $p_{s}(d) > 0$ if $d_{s} \in [\underline{d}, \bar{d}]$.
(b) $p_{s}(d) d_{s} = 0$ if $d_{s} = \infty$.

Proof of infinite time horizon theorem

To prove Theorem 1, we first note that the DP in (4) can be reformulated as a so-called stochastic shortest path problem (see Lebedev, Goulart, and Margellos, 2019, Section 4.1.1). The Bellman operator for this class of problems is known to be contractive (see Bertsekas (2012, Chapters 1 and 3) and Lebedev et al. (2019, Lemma 5)). Therefore, the DP in (4) admits a unique fixed point. We start with the necessary and sufficient condition for $T$ to have a fixed point $V^{*}$, which is $V^{*} = T V^{*}$.
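The fixed-point condition $V^{*} = T V^{*}$ can be illustrated numerically on a toy model. The sketch below is not the DP in (4): it assumes a single delivery slot with finite capacity, a finite grid of admissible prices, a binary logit purchase probability and made-up parameter values, and every name in it is hypothetical. It applies a backward step repeatedly, starting from the terminal condition of paying the delivery cost of the final state, until successive iterates stop changing.

```python
import math


def bellman_step(v, r, prices, lam, a, b):
    """One backward step of an assumed single-slot pricing DP.

    v[x] is the current value with x orders placed; len(v) - 1 is the capacity.
    """
    cap = len(v) - 1
    new_v = list(v)  # at full capacity nothing more can be sold, so v[cap] stays
    for x in range(cap):
        best = v[x]  # rejecting all demand (an infinite price) keeps the value
        for d in prices:
            u = math.exp(a - b * d)
            p = lam * u / (1.0 + u)  # an order arrives and this price is accepted
            best = max(best, p * (r + d + v[x + 1]) + (1.0 - p) * v[x])
        new_v[x] = best
    return new_v


def iterate_to_fixed_point(cost, r, prices, lam, a, b, tol=1e-12, max_iter=10_000):
    """Iterate the backward step from the terminal values -cost[x]."""
    v = [-c_x for c_x in cost]
    for k in range(max_iter):
        new_v = bellman_step(v, r, prices, lam, a, b)
        gap = max(abs(n - o) for n, o in zip(new_v, v))
        v = new_v
        if gap < tol:
            return v, k + 1
    return v, max_iter


if __name__ == "__main__":
    # Made-up data: marginal costs 1.0, 1.5, 2.0 stay below max price + r = 5.0.
    cost = [0.0, 1.0, 2.5, 4.5]
    v_star, n_steps = iterate_to_fixed_point(
        cost, r=1.0, prices=[1.0, 2.0, 3.0, 4.0], lam=0.9, a=2.0, b=0.5)
    print(n_steps, v_star)
```

For this toy, the limit can be worked out by hand as the value of filling all remaining capacity at the highest admissible price, $(\text{cap} - x)(r + \bar{d}) - C(\text{cap})$. This is a property of the toy model under a marginal-cost condition in the spirit of Assumption 1, and should not be read as a restatement of Theorem 1.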

Numerical examples

In the following two sections, we illustrate the validity and practical utility of our results. We first show a low-dimensional numerical example of a 2-slot problem and then show how our results allow the application of a non-linear stochastic dual DP algorithm to a 17-slot problem.

Summary of contributions

We have studied the mathematical properties of the value function of a dynamic program modelling the revenue management problem in attended home delivery exactly. We have shown that the recursive dynamic programming mapping has a unique, finite-valued fixed point and concavity-preserving properties. Hence, we have derived our main result stating that – under certain assumptions – for all time steps in the dynamic program, the value function admits a continuous extension, which is a

Acknowledgements

This work was supported by SIA Food Union Management. The authors are grateful for this financial support.

References (34)

K. Asdemir et al. Dynamic pricing of multiple home delivery options. European Journal of Operational Research (2009)
Pitchbook (2017). Are meal-kit delivery companies a threat or an opportunity?...
W.B. Powell. Approximate dynamic programming: solving the curses of dimensionality (Wiley series in probability and statistics) (2007)
X. Yang et al. An approximate dynamic programming approach to attended home delivery management. European Journal of Operational Research (2017)
X. Yang et al. Choice-based demand management and vehicle routing in e-fulfillment. Transportation Science (2016)
J. Zou et al. Stochastic dual dynamic integer programming. Mathematical Programming (2019)
Y. Akçay et al. Joint dynamic pricing of multiple perishable products under consumer choice. Management Science (2010)
D.P. Bertsekas. Dynamic Programming and Optimal Control, Vol. II (2012)
D. Bertsimas et al. Optimization over Integers (2005)
S. Boyd et al. Convex Optimization (2004)
A.M. Campbell et al. Incentive schemes for attended home delivery services. Transportation Science (2006)
M. Chen et al. Recent developments in dynamic pricing research: Multiple products, competition, and limited demand information. Production and Operations Management (2015)
L. Dong et al. Dynamic pricing and inventory control of substitute products. Manufacturing & Service Operations Management (2009)
D.P. de Farias et al. The linear programming approach to approximate dynamic programming. Operations Research (2003)
Food Marketing Institute (2018). Digital shopper. https://www.fmi.org/digital-shopper/. Accessed 4 October...
G. Gallego et al. Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Science (1994)
Hannah, L. A., & Dunson, D. B. (2011). Bayesian nonparametric multivariate convex regression....

