Elsevier

Discrete Optimization

Volume 41, August 2021, 100647

On bin packing with clustering and bin packing with delays

https://doi.org/10.1016/j.disopt.2021.100647

Abstract

We continue the study of two recently introduced bin packing type problems, called bin packing with clustering, and online bin packing with delays. A bin packing input consists of items of sizes not larger than 1, and the goal is to partition or pack them into bins, where the total size of items of every valid bin cannot exceed 1.

In bin packing with clustering, items also have colors associated with them. A globally optimal solution can combine items of different colors into bins, while a clustered solution can only pack monochromatic bins. The goal is to compare a globally optimal solution to an optimal clustered solution, under certain constraints on the coloring provided with the input. We show close bounds on the worst-case ratio between these two costs, called the price of clustering, improving and simplifying previous results. Specifically, we show that the price of clustering does not exceed 1.93667, improving over the previous upper bound of 1.951, and that it is at least 1.93558, improving over the previous lower bound of 1.93344.

In online bin packing with delays, items are presented over time. Items may wait to be packed, and an algorithm can create a new bin at any time, packing a subset of already existing unpacked items into it, under the condition that the bin is valid. A created bin cannot be used again in the future, and all items have to be packed into bins eventually. The objective is to minimize the number of used bins plus the sum of waiting costs of all items, called delays. We build on previous work and modify a simple phase-based algorithm. We combine the modification with a careful analysis to improve the previously known competitive ratio from 3.951 to below 3.1551.

Introduction

In bin packing problems, a set of items $I$ is given, where each item has a rational size in $[0,1]$. The goal is to partition these items into subsets called bins, where the total size for each bin does not exceed 1. We use the term load of a bin for the sum of sizes of its items. The process of assigning an item to a bin is called packing, and in such a case (where an item is assigned to a bin) we say that the item is packed into the bin.
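To make these definitions concrete, here is a minimal Python sketch (not taken from the paper) that packs a small input with the classic First Fit Decreasing heuristic and checks that every bin is valid. The function names and the sample sizes are illustrative assumptions; sizes are stored as exact fractions because item sizes are rational.

```python
from fractions import Fraction


def load(bin_items):
    """Load of a bin: the sum of the sizes of its items."""
    return sum(bin_items, Fraction(0))


def first_fit_decreasing(sizes):
    """Heuristic packing: largest items first, each into the first bin with room."""
    bins = []
    for size in sorted(sizes, reverse=True):
        for bin_items in bins:
            if load(bin_items) + size <= 1:
                bin_items.append(size)
                break
        else:
            # No existing bin can take the item, so a new bin is opened.
            bins.append([size])
    return bins


def is_valid(bins):
    """A packing is valid if no bin has load above 1."""
    return all(load(bin_items) <= 1 for bin_items in bins)


# Hypothetical sample input.
items = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 2), Fraction(2, 3), Fraction(1, 6)]
packing = first_fit_decreasing(items)
print(len(packing), is_valid(packing))
```

On this input the total size is 13/6, so no packing can use fewer than three bins, and the heuristic happens to match that. In general it only gives an upper bound on the optimal number of bins.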

We study two bin packing problems. The first problem is called bin packing with clustering. In this problem, every item has a second attribute, called a cluster index or a color. A global solution is one where items are packed without considering their clusters, i.e., it is a solution of the classic bin packing problem for this input. A clustered solution is one where every cluster or color must have its own set of bins, and items of different clusters cannot be packed into a common bin. To avoid degenerate cases, an assumption on the input is enforced. Specifically, it is assumed that every cluster is sufficiently large, and an optimal solution for each cluster has at least three bins. The problem was introduced by Azar et al. [1]. It was shown [1] that replacing this assumption with the weaker one where clusters have at least two bins makes the problem less meaningful. The goal is to compare optimal solutions, that is, to compare an optimal clustered solution to an optimal global solution, also called a globally optimal solution. We are interested in the worst-case ratio over all valid inputs, and this ratio is called the price of clustering. From an algorithmic point of view, the goal is to design an approximation algorithm for which it is not allowed to mix items of different clusters, while the algorithm still has a good approximation ratio compared to a globally optimal solution. For applications of this problem in the field of massive data sets, see [1].
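The comparison between clustered and global solutions can be illustrated with a small Python sketch. It is not the paper's method: it uses First Fit Decreasing as a stand-in for an optimal solver, which is only a heuristic in general but happens to be optimal on the two hypothetical inputs below. All names and inputs are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def ffd_bin_count(sizes):
    """Number of bins used by First Fit Decreasing (an upper bound on the optimum)."""
    loads = []
    for size in sorted(sizes, reverse=True):
        for i, current in enumerate(loads):
            if current + size <= 1:
                loads[i] = current + size
                break
        else:
            loads.append(size)
    return len(loads)


def clustered_and_global(items):
    """items is a list of (size, color) pairs; returns (clustered cost, global cost)."""
    by_color = defaultdict(list)
    for size, color in items:
        by_color[color].append(size)
    # Clustered solution: every color is packed into its own bins.
    clustered = sum(ffd_bin_count(sizes) for sizes in by_color.values())
    # Global solution: colors are ignored.
    global_cost = ffd_bin_count([size for size, _ in items])
    return clustered, global_cost


# Each cluster needs at least three bins, as the assumption requires.
valid_input = [(Fraction(3, 5), "red")] * 5 + [(Fraction(2, 5), "blue")] * 5
valid_costs = clustered_and_global(valid_input)

# Degenerate input: ten tiny items, each alone in its own cluster.
n = 10
degenerate_input = [(Fraction(1, n), color) for color in range(n)]
degenerate_costs = clustered_and_global(degenerate_input)

print(valid_costs, degenerate_costs)
```

In the first input the red cluster needs five bins and the blue cluster three, while a global solution pairs each red item with a blue one and uses five bins, giving a ratio of 8/5. In the second input the ratio equals the number of items, which shows why clusters that are too small lead to degenerate cases.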

The results of [1] show that the price of clustering (under the assumption above) is strictly below 2, and more specifically, it is at most 1.951. The methods used to prove this are based on an auxiliary graph comparing the two different optimal solutions, and a linear program capturing the properties of worst-case inputs. A computer assisted proof was used to find an upper bound on the price of clustering. A lower bound of 1.93344 was provided as well in the same work. This problem is closely related to batched bin packing [2], [3], [4], [5]. This is a semi-online problem where items are presented in a number of batches, where every batch is to be packed before the next batch is presented. There are two variants, depending on whether bins opened for earlier batches can be used for the current batch. The variant where every batch has its own bins, and the packing is compared to an optimal (offline) one where items of different batches can still be combined into bins together is the one that is related to our work. It is mentioned in [1] that if every cluster is arbitrarily large such that its optimal cost grows to infinity, then the price of clustering decreases to approximately 1.691 (we discuss the sequence leading to this value [5], [6], [7], [8], [9], [10] in the body of the paper in a different context). In fact, this result regarding the price of clustering with very large clusters follows directly from an earlier result for batched bin packing [5].

The second problem is bin packing with delays. In this online problem, items are presented over time to be packed into bins. An algorithm can decide to create a bin at any time, which is done by selecting a subset of already existing unpacked items. The selected subset should have total size at most 1, and once its bin is created, it cannot be used again for future items. Additionally, every item $i$ has a positive monotonically non-decreasing delay function $d_i$, and letting $t_i \geq 0$ be the elapsed time from the arrival time of $i$ until it is packed, the delay cost (or delay) of $i$ is $d_i(t_i)$. The objective is to minimize the number of bins plus the total delay cost of all input items. For example, if every item is assigned to a bin right when it arrives, the delays are the smallest possible, but the number of bins may be very large. On the other hand, if the algorithm waits until many items arrive and it can pack them offline, the delay costs may be very large. The problem is analyzed via the competitive ratio, which is the worst-case ratio between the cost of an online algorithm and an optimal offline solution (which still deals with the input as a sequence arriving over time, but it knows the entire sequence). Competitive algorithms should find a trade-off between waiting for additional items to arrive (such that bins will be packed densely) and the resulting delay costs of already existing items. In fact, one expects to see algorithms designed based on ski-rental type methods [11], [12], [13]. Such methods involve waiting until a certain cost is incurred before performing an action that stops the accumulation of that cost. Obviously, additional problem-specific methods are required in the design of algorithms for problems with delay costs.
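The trade-off between bins and delays can be seen by evaluating the objective on a tiny instance. The following Python sketch is an illustration only: the data layout, the function name, and the choice of one shared identity delay function for all items are assumptions, whereas the problem allows a separate delay function per item.

```python
from fractions import Fraction


def total_cost(items, bins, delay=lambda t: t):
    """Number of bins plus total delay.

    items: dict mapping an item name to (size, arrival time).
    bins:  list of (creation time, list of item names).
    delay: assumed shared delay function (identity by default).
    """
    packed = set()
    cost = Fraction(len(bins))
    for created_at, names in bins:
        if sum(items[name][0] for name in names) > 1:
            raise ValueError("bin load exceeds 1")
        for name in names:
            _, arrival = items[name]
            if name in packed:
                raise ValueError(f"item {name} packed twice")
            if created_at < arrival:
                raise ValueError(f"item {name} packed before it arrived")
            packed.add(name)
            cost += delay(created_at - arrival)
    if packed != set(items):
        raise ValueError("every item has to be packed eventually")
    return cost


# Two items of size 1/2, arriving at times 0 and 1/2 (hypothetical instance).
items = {"a": (Fraction(1, 2), Fraction(0)), "b": (Fraction(1, 2), Fraction(1, 2))}

eager = [(Fraction(0), ["a"]), (Fraction(1, 2), ["b"])]  # pack on arrival
patient = [(Fraction(1, 2), ["a", "b"])]                 # wait and share a bin

print(total_cost(items, eager), total_cost(items, patient))
```

Packing each item on arrival costs two bins and no delay, while waiting for the second item costs one bin plus a delay of 1/2, so waiting is cheaper on this instance. With a later second arrival the comparison would reverse.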

Various online combinatorial optimization problems with delays were studied recently [14], [15], [16], continuing earlier studies of ski-rental type problems. Moreover, a completely different model of bin packing with delays was studied as well [17]. Offline and online bin packing are often studied with respect to asymptotic measures [18], [19], [20], [21], while here we study them via absolute measures, as in previous work on the specific problems we study, where the absolute measure is more appropriate (see [22], [23], [24], [25] for studies of bin packing with respect to absolute measures).

The two problems studied here may seem unrelated; one is an offline problem and the other one is a completely different online problem. The flavor of the first problem is not algorithmic, and the algorithmic contribution is used in the analysis. The second problem is an online problem where items arrive over time, and even if one designs an offline algorithm for it, still the time axis has a major role. Since the two problems were introduced and studied in the same work [1] where properties of the first one were used in the analysis of the second one, we study them together as well. Note that we also use properties of offline bin packing for the analysis of the online problem, as we will pack subsets of items at the same time, into one bin or several bins.

Bin packing with delays is a generalization of the TCP acknowledgment problem [13], [26]. In the latter problem, requests arrive over time, and should be acknowledged at times selected by the algorithm, where at every such time, all pending requests can be acknowledged. The objective is the number of acknowledgment events plus the total waiting time of all requests. Instances of this problem are instances of bin packing with delays with zero size items and delay costs based on the identity function (there is also work on more general delay functions, see for example [27]). Using the lower bound of 2 on the competitive ratio of any algorithm for TCP acknowledgment, a lower bound of 2 is known also for the competitive ratio of any algorithm for bin packing with delays.
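A ski-rental type rule for the TCP acknowledgment setting (zero size items, identity delay) can be sketched in Python as follows. This is not the algorithm analyzed in the paper; it only illustrates the threshold idea of waiting until the pending requests have accumulated a total waiting cost of 1, the cost of one acknowledgment, and then acknowledging all of them. The function name and the sample arrival times are assumptions.

```python
from fractions import Fraction


def threshold_ack_times(arrivals):
    """Times at which all pending requests are acknowledged together.

    An acknowledgment is sent at the moment the currently pending requests
    have accumulated a total waiting time of exactly 1.
    """
    ack_times = []
    pending = 0           # number of unacknowledged requests
    waited = Fraction(0)  # total waiting accumulated by the pending requests
    now = Fraction(0)
    for arrival in sorted(Fraction(a) for a in arrivals):
        if pending:
            # Waiting grows at rate `pending`, so the threshold is hit here:
            trigger = now + (1 - waited) / pending
            if trigger <= arrival:
                ack_times.append(trigger)
                pending, waited = 0, Fraction(0)
            else:
                waited += pending * (arrival - now)
        now = arrival
        pending += 1
    if pending:
        # No further arrivals: the remaining requests wait until the threshold.
        ack_times.append(now + (1 - waited) / pending)
    return ack_times


print(threshold_ack_times([0, Fraction(1, 4), 3]))
```

For arrivals at times 0, 1/4 and 3, the first two requests are acknowledged together at time 5/8, when their waiting times 5/8 and 3/8 sum to 1, and the last request is acknowledged alone at time 4.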

In this work, we improve the bounds on the price of clustering, and show close bounds of 1.93667 and 1.93558. The upper bound is shown via weighting functions, while the lower bound uses a careful refinement of the previous lower bound approach, where not only clusters with items of sizes close to $\frac{1}{2}$ are defined with respect to the worst-case structure but also more complicated clusters are built. We also show how the previous upper bound result can be obtained using a simple analytical proof, and we briefly discuss other versions (with larger clusters). We also generalize the previous algorithm for bin packing with delays such that its parameter can be arbitrary. Here, we apply a simple weight based analysis to obtain a better upper bound of 3.1551, while the previous bound was 3.951 [1]. Our algorithm does not require computation of optimal solutions, and whenever it packs a subset of items, this is done using a greedy algorithm, and therefore it runs in polynomial time if the delay function can be computed easily.

Section snippets

Price of clustering

In this section we study the price of clustering. Note that we consider the case where optimal costs for clusters are at least 3. Considering a parameter $k \geq 1$, such that the optimal cost for every cluster is at least $k$, the cases $k=1,2$ were fully analyzed and declared as uninteresting [1]. For $k=1$, the price of clustering is unbounded, as an input of very small items may be partitioned into clusters containing single items. For $k=2$, the price of clustering is 2, since every cluster may have one

Bin packing with delays

We briefly discuss assumptions on delay functions. A delay function $d:[0,\infty)\to[0,\infty)$ is assumed to be continuous. We also assume $d(0)=0$ without loss of generality, as otherwise any algorithm will pay a delay of $d(0)$ and all delay costs can be modified by subtracting the value $d(0)$ from them. It is also assumed that the function is monotonically non-decreasing and unbounded (see below for a short discussion of the bounded case). Since a general function can be given by an oracle while algorithms

References (27)

  • Epstein L. et al., On bin packing with conflicts, SIAM J. Optim. (2008)
  • Woeginger G.J., Improved space for bounded-space online bin packing, SIAM J. Discrete Math. (1993)
  • R.M. Karp, On-line algorithms versus off-line algorithms: How much is it worth to know the future? in: Proc. of the...

1 Partially supported by a grant from GIF, the German–Israeli Foundation for Scientific Research and Development, Israel (grant number I-1366-407.6/2016).