1 Introduction

In the Minimum Weight Hypergraph Vertex Cover (mwhvc) problem, we are given a hypergraph \(G=(V,E)\) with vertex weights \(w:V\rightarrow \left\{ 1,\ldots ,W \right\} \).Footnote 1 The goal is to find a minimum weight cover \(U\subseteq V\) such that \(\forall e\in E: e\cap U\ne \emptyset \). In this paper we develop a distributed approximation algorithm for mwhvc in the congest model. The approximation ratio is \(f+\varepsilon \), where f denotes the rank of the hypergraph (i.e., f is an upper bound on the size of every hyperedge). The mwhvc problem is a generalization of the Minimum Weight Vertex Cover (mwvc) problem (in which \(f=2\)). The mwhvc problem is also equivalent to the Minimum Weight Set Cover Problem (the rank f of the hypergraph corresponds to the maximum frequency of an element). Both of these problems are among the classical NP-hard problems presented in [14].

We consider the following distributed setting for the mwhvc problem. The communication network is a bipartite graph \(H(E\cup V,\left\{ \left\{ e,v \right\} \mid v\in e \right\} )\). We refer to the network vertices as nodes and network edges as links. The nodes of the network are the hypergraph vertices on one side and hyperedges on the other side. There is a network link between vertex \(v\in V\) and hyperedge \(e\in E\) iff \(v\in e\). The computation is performed in synchronous rounds, where messages are sent between neighbors in the communication network. As for message size, we consider the congest model where message sizes are bounded to \(O(\log |V|)\). This is more restrictive than the local model where message sizes are unbounded.

Table 1 Previous distributed algorithms for mwvc
Table 2 Previous distributed algorithms for mwhvc

1.1 Related work

We survey previous results for mwhvc and mwvc. A comprehensive list of previous results appears in Tables 1 and 2.

Vertex Cover. The understanding of the round complexity for distributed mwvc has been established in two papers: a lower bound in [19] and a matching upper bound in [4]. Let \(\varDelta \) denote the maximum vertex degree in the graph G. The lower bound states that any distributed constant-factor approximation algorithm requires \(\varOmega (\log \varDelta / \log \log \varDelta )\) rounds to terminate. This lower bound holds for every constant approximation ratio, for unweighted graphs and even if the message lengths are not bounded (i.e., LOCAL model) [19]. The matching upper bound is a \((2+\varepsilon )\)-approximation distributed algorithm in the congest model, for every \(\varepsilon =\varOmega (\log \log \varDelta / \log \varDelta )\).Footnote 2 In [16] an \(O(\log n)\)-round 2-approximation randomized algorithm for weighted graphs in the congest model is given. We note that [16] was the first to achieve this running time with no dependence on W, the maximum weight of the nodes. Recently, Ben-Basat et al. [6] showed an \(O(OPT^2\log OPT)\) rounds algorithm for computing the minimal (unweighted) vertex cover and a O(OPT) rounds for a \((2+\varepsilon )\)-approximation. Here, OPT is the size of the smallest cover and thus these algorithms are adequate when a small solution exists.

Hypergraph Weighted Vertex Cover. For constant values of f, Astrand et al. [2] present an f-approximation algorithm for anonymous networks whose running time is \(O(\varDelta ^2 + \varDelta \cdot \log ^* W)\). Khuller et al. [15] provide a solution that runs in \(O(f \cdot \log \varepsilon ^{-1} \cdot \log n)\) rounds in the congest model for any \(\varepsilon > 0\) and achieves an \((f+\varepsilon )\)-approximation. Setting \(\varepsilon = 1/W\) (recall that \(W=\hbox {poly}(n)\)) results in a f-approximation in \(O(f\log ^2 n)\)-rounds. For constant \(\varepsilon \) and f values, Kuhn et al. [17, 18] present an \((f+\varepsilon )\)-approximation algorithm that terminates in \(O(\log \varDelta + \log W)\) rounds.

For the Minimum Cardinality (i.e., unweighted) Vertex Cover in Hypergraphs Problem, the lower bound was recently matched by [9] with an \((f+\varepsilon )\)-approximation algorithm in the congest model. The round complexity in [9] is \(O\left( f/\varepsilon \cdot \frac{\log (f\cdot \varDelta )}{\log \log (f\cdot \varDelta )}\right) \), which is optimal for constant f and \(\varepsilon \). The algorithm in [9] and its analysis is a deterministic version of the randomized maximal independent set algorithm of [10].

1.2 Our contributions

In this paper, we present a deterministic distributed \((f+\varepsilon )\)-approximation algorithm for minimum weight vertex cover in f-rank hypergraphs, which completes in

$$\begin{aligned}&O\left( (1+f/\log n)\cdot \left( \frac{\log \varDelta }{ \log \log \varDelta } +\right. \right. \nonumber \\&\quad \left. \left. ({f\cdot \log M})^{1.01}\cdot \log \varepsilon ^{-1}\cdot (\log \varDelta )^{0.01}\right) \right) \end{aligned}$$

rounds in the congest model. For any constants \(\varepsilon \in (0,1)\) and \(f\in {\mathbb {N}}^+\) this implies a running time of \(O(\log {\varDelta } / \log \log \varDelta )\), which is optimal according to [19]. This is the first distributed algorithm for this problem whose round complexity does not depend on the node weights nor the number of vertices.

Our algorithm is one of a handful of distributed approximation algorithms for local problems which are provably optimal [3, 4, 7,8,9, 11]. Among these are the classic Cole-Vishkin algorithm [8] for 3-coloring a ring, the more recent results of [3] and [4] for mwvc and Maximum Matching, and the result of [9] for Minimum Cardinality Hypergraph Vertex Cover.

Our algorithm also achieves a deterministic f-approximation for the problem in \(O(f\log n)\) rounds. This improves over the best known deterministic result for hypergraphs \(O(f\log ^2 n)\) [15] and matches the best known randomized results for weighted vertex cover (\(f=2\)) of \(O(\log n)\)-rounds [16].

We also show that general covering Integer Linear Programs (ILPs) can be reduced to mwhvc in the distributed setting. That is, LP constraints can be translated into hyperedges such that a cover for the hyperedges satisfies all covering constraints. This allows us to achieve an

$$\begin{aligned}&O\left( (1+f/\log n)\cdot \left( \frac{\log \varDelta }{ \log \log \varDelta } +\right. \right. \nonumber \\&\quad \left. \left. ({f\cdot \log M})^{1.01}\cdot \log \varepsilon ^{-1}\cdot (\log \varDelta )^{0.01}\right) \right) \end{aligned}$$

rounds \((f\lceil \log _2(M)+1 \rceil +\varepsilon )\)-approximate integral solution, where f bounds the number of variables in a constraint, \(\varDelta \) bounds the number of constraints a variable appears in, and \(M=\max \left\{ 1, \lceil 1/a_{\min } \rceil \right\} \), where \(a_{\min }\) is the smallest normalized constraint coefficient.

1.3 Tools and techniques

The Primal-Dual schema. The Primal-Dual approach introduces, for every hyperedge \(e \in E\), a dual variable denoted by \(\delta (e)\). The dual edge packing constraints are \(\forall v\in V,\sum _{v \in e} \delta (e) \le w(v)\). If for some \(\beta \in [0,1)\) it holds that \(\sum _{v \in e} \delta (e) \ge (1-\beta )\cdot w(v)\), we say the node v is \(\beta \)-tight. Let \(\beta =\varepsilon /(f+\varepsilon )\). For every feasible dual solution, the weight of the set of \(\beta \)-tight vertices is at most \((f+\varepsilon )\) times the weight of an optimal (fractional) solution. The algorithm terminates when the set of \(\beta \)-tight edges constitutes a vertex cover.

The fractional LP relaxation of mwhvc is defined as follows.

figure a

The dual LP is an Edge Packing problem defined as follows:

figure b

The following claim is used for proving the approximation ratio of the mwhvc algorithm.

Claim 1

Let \(\textsf {opt}\) denote the value of an optimal fractional solution of the primal LP (\(\mathcal {P}\)). Let \(\{\delta (e)\}_{e\in E}\) denote a feasible solution of the dual LP (\(\mathcal {D}\)). Let \(\varepsilon \in (0,1)\) and \(\beta \triangleq \varepsilon /(f+\varepsilon )\). Define the \(\beta \)-tight vertices by \( T_{\varepsilon } \triangleq \{ v\in V \mid \sum _{e\ni v} \delta (e)\ge (1-\beta )\cdot w(v)\}. \)

Then \(w(T_{\varepsilon })\le (f+\varepsilon ) \cdot \textsf {opt}\).

Proof

$$\begin{aligned} w(T_{\varepsilon }) =\sum _{v\in T_{\varepsilon }} w(v)&\le \frac{1}{1-\beta } \cdot \left( \sum _{v\in T_{\varepsilon }} \sum _{e\ni v} \delta (e)\right) \\&\le \frac{f}{1-\beta } \sum _{e\in E} \delta (e) \le (f+\varepsilon )\cdot \textsf {opt}. \end{aligned}$$

The last transition follows from \(f/(1-\beta )=f+\varepsilon \) and by weak duality. The claim follows. \(\square \)

The challenge. When designing a Primal-Dual distributed algorithm, the main challenge is in controlling the rate at which we increase the dual variables. On the one hand, we must grow them rapidly to reduce the number of communication rounds. On the other hand, we may not violate the edge packing constraints. This is tricky in the distributed environments as we have to coordinate between nodes. For example, the result of [4] does not generalize to hypergraphs, as hyperedges require the coordination of more than two nodes in order to increment edge variables.

Our algorithm. The algorithm proceeds in iterations, each of which requires a constant number of communication rounds. We initialize the dual variables in a “safe” way so that feasibility is guaranteed. We refer to the additive increase of the dual variable \(\delta (e)\) as \({{\,\mathrm{bid}\,}}(e)\). Similarly to [5] (where bids are called deals), we use levels to measure the progress made by a vertex. Whenever the level of a vertex increases, it sends a message about it to all incident edges, which multiply (decrease) their bids by 0.5. Intuitively, the level of a vertex equals the logarithm of its uncovered portion. Formally, we define the level of a vertex v as \(\ell (v)\triangleq \left\lfloor \log \frac{w(v)}{w(v)-\sum _{e\ni v} \delta (e)} \right\rfloor \). That is, the initial level of v is 0 and it is increased as the dual variables of the edges incident to v grow. The level of a vertex never reaches \(z\triangleq \lceil \log \beta ^{-1} \rceil \) as this implies that it is \(\beta \)-tight and entered the cover. Loosely speaking, the algorithm increases the increments \({{\,\mathrm{bid}\,}}(e)\) exponentially (multiplication by \(\alpha \)) provided that no vertex \(v\in e\) is \((0.5^{\ell (v)} / \alpha )\)-tight with respect to the bids of the previous iteration. Here, \(\alpha \ge 2\) is a positive parameter that we determine later. This is achieved by checking if all the vertices in e can increase the dual variables by an \(\alpha \) factor while staying at the current level. Namely, each vertex either sends a “raise” message, that signals that an \(\alpha \)-factor increase in the dual variables of all its adjacent edges would keep it in its current level, or a “stuck” message otherwise. \({{\,\mathrm{bid}\,}}(e)\) is then increased only if all the edge’s vertices sent a “raise” message. The analysis builds on two observations: (1) The number of times that the increment \({{\,\mathrm{bid}\,}}(e)\) is multiplied by \(\alpha \) is bounded by \(\log _\alpha \varDelta \). (2) The number of iterations in which \({{\,\mathrm{bid}\,}}(e)\) is not multiplied by \(\alpha \) is bounded by \(O(f\cdot z\cdot \alpha )\). Loosely speaking, each such iteration means that for at least one vertex \(v\in e\) the sum of bids is at least an \(1/(2\alpha )\)-fraction of its slack. Therefore, after at most \(O(\alpha )\) such iterations that vertex will level up. Since there are z levels per vertex and f vertices in e, we have that the number of such iterations is at most \(O(f\cdot z\cdot \alpha )\). Hence the total number of iterations is bounded by \(O(\log _\alpha \varDelta +f\cdot z\cdot \alpha )\).

Integer linear programs (ILPs).   We show distributed reductions that allow us to compute an \((f+\varepsilon )\)-approximation for general covering integer linear programs (ILPs). To that end, we first show that any Zero-One covering program (where all variables are binary) can be translated into a set cover instance in which the vertex degree is bounded by \(2^f\) times the bound on the number of constraints each ILP variable appears in. We then generalize to arbitrary covering ILPs by replacing each variable with multiple vertices in the hypergraph, such that the value associated with the ILP variable will be the weighted sum of the vertices in the cover.

2 Problem formulation

Let \(G=(V,E)\) denote a hypergraph. Vertices in V are equipped with positive integer weights w(v). For a subset \(U\subseteq V\), let \(w(U)\triangleq \sum _{v\in U} w(v)\). Let E(U) denote the set of hyperedges that are incident to some vertex in U (i.e., \(E(U)\triangleq \{ e\in E \mid e\cap U \ne \emptyset \}\)).

The Minimum Weight Hypergraph Vertex Cover Problem (mwhvc) is equivalent to the Weighted Set Cover Problem and is defined as follows.

Consider a set system \((X,\mathcal {U})\), where X denotes a set of elements and \(\mathcal {U}=\left\{ U_1,\ldots , U_m \right\} \) denotes a collection of subsets of X. The reduction from the set system \((X,\mathcal {U})\) to a hypergraph \(G=(V,E)\) proceeds as follows. The set of vertices is \(V\triangleq \left\{ u_1,\ldots , u_m \right\} \) (one vertex \(u_i\) per subset \(U_i\)). The set of edges is \(E\triangleq \left\{ e_x \right\} _{x\in X}\) (one hyperedge \(e_x\) per element x), where \(e_x\triangleq \left\{ u_i: x\in U_i \right\} \). The weight of vertex \(u_i\) equals the weight of the subset \(U_i\).

We now detail the setting for computing a distributed \((f+\varepsilon )\)-approximation of the problem.

Input. The input is a hypergraph \(G=(V,E)\) with non-negative vertex weights \(w:V\rightarrow \mathbb {N}^+\) and an approximation ratio parameter \(\varepsilon \in (0,1]\). We denote the number of vertices by n, the rank of G by f (i.e., each hyperedge contains at most f vertices), and the maximum degree of G by \(\varDelta \) (i.e., each vertex belongs to at most \(\varDelta \) edges).

Assumption 1

We assume that (i)Vertex weights are polynomial in \(n\triangleq |V|\) so that sending a vertex weight requires \(O(\log n)\) bits. (ii)Vertex degrees are polynomial in n (i.e., \(|E(v)|=n^{O(1)}\)) so that sending a vertex degree requires \(O(\log n)\) bits. Since \(|E(v)|\le n^f\), this assumption trivially holds for constant f. (iii)The maximum degree is at least 3 so that \(\log \log \varDelta >0\). (iv)All vertices know f (or an estimate \({\widehat{f}}=\varTheta (f)\)). For getting an f- (but not for an \(f+\varepsilon )-\)approximation, the vertices also need to know n and W. For simplicity, we assume that \(\varDelta \) is known, but later discuss how to remove this assumption.

Output. The nodes compute a vertex cover \(C\subseteq V\). Namely, for every hyperedge \(e\in E\), the intersection \(e\cap C\) is not empty. The set C is maintained locally in the sense that every vertex v knows whether it belongs to C or not.

Communication Network. The communication network \(N(E\cup V,\left\{ \left\{ e,v \right\} \mid v\in e \right\} )\) is a bipartite graph. There are two types of nodes in the network: servers and clients. The set of servers is V (the vertex set of G) and the set of clients is E (the hyperedges in G). There is a link (ve) from server \(v\in V\) to a client \(e\in E\) if \(v\in e\). We note that the degree of the clients is bounded by f and the degree of the servers is bounded by \(\varDelta \).

Notation.

  • We say that an edge e is covered by C if \(e\cap C\ne \emptyset \).

  • Let \(E(v)\triangleq \{ e\in E\mid v\in e\}\) denote the set of hyperedges that contain v.

  • For every vertex v, the algorithm maintains a subset \(E'(v)\subseteq E(v)\) that consists of the uncovered hyperedges in E(v) (i.e., \(E'(v)=\{e\in E(v)\mid e\cap C=\emptyset \}\)).

Invariants. Throughout its execution, as we prove in the analysis in Sect. 4.1, the algorithm maintains the following invariants at the end of each iteration:

  • The level of each vertex which did not terminate correctly measures its tightness, i.e., \(\ell (v)= \left\lfloor \log \frac{w(v)}{w(v)-\sum _{e\ni v} \delta (e)} \right\rfloor \).

  • The dual variables constitute a feasible edge packing. Namely,

    $$\begin{aligned} \sum _{e\in E(v)} \delta _i(e)&\le w(v)&\text {for every vertex }v\in V,\\ \delta _i(e)&\ge 0&\text {for every edge }e\in E. \end{aligned}$$

3 Distributed approximation algorithm for MWHVC

3.1 Parameters and variables

  • The algorithm computes an \((f+\varepsilon )\)-approximation where \(\varepsilon \in (0,1]\). The parameter \(\beta \) is defined by \(\beta \triangleq \varepsilon /(f+\varepsilon )\), where f is the rank of the hypergraph.

  • Each vertex v is assigned a level \(\ell (v)\) which is a nonnegative integer.

  • We denote the dual variables at the end of iteration i by \(\delta _i(e)\) (see Sect. 1.3 for a description of the dual edge packing linear program). The amount by which \(\delta _i(e)\) is increased in iteration i is denoted by \({{\,\mathrm{bid}\,}}_i(e)\). Namely, \(\delta _i(e)=\sum _{j\le i}{{\,\mathrm{bid}\,}}_j(e)\).

  • The parameter \(\alpha \ge 2\) determines the factor by which bids are multiplied. We determine its value in the analysis in the following section.

3.2 Algorithm MWHVC

  1. 1.

    Initialization. Set \(C\leftarrow \emptyset \). For every vertex v, set level \(\ell (v)\leftarrow 0\) and uncovered edges \(E'(v)\leftarrow E(v)\).

  2. 2.

    Iteration \(i=0\). Every edge e collects the weight w(v) and degree |E(v)| from every vertex \(v\in e\), and sets: \({{\,\mathrm{bid}\,}}(e)=0.5 \cdot \min _{v\in e}\{ w(v)/|E(v)|\)}. The value \({{\,\mathrm{bid}\,}}(e)\) is sent to every \(v\in e\). The dual variable is set to \(\delta (e)\leftarrow {{\,\mathrm{bid}\,}}(e)\).

  3. 3.

    For \(i=1\) to \(\infty \) do:

  4. (a)

    Check \(\beta \)-tightness. For every \(v\not \in C\), if \(\sum _{e\in E(v)} \delta (e) \ge (1-\beta )w(v)\), then v joins the cover C, sends a message to every \(e\in E'(v)\) that e is covered, and vertex v terminates.

  5. (b)

    For every uncovered edge e, if e receives a message that it is covered, then it tells all its vertices that e is covered and terminates.

  6. (c)

    For every vertex \(v\notin C\), if it receives a message from e that e is covered, then \(E'(v)\leftarrow E'(v)\setminus \{e\}\). If \(E'(v)=\emptyset \), then v terminates (without joining the cover).

  7. (d)

    Increment levels and scale bids. For every active (that has not terminated) vertexFootnote 3:

    While \(\sum _{e\in E(v)} \delta (e) > w(v)(1-0.5^{\ell (v)+1})\) do

  8. (i)

    \(\ell (v)\leftarrow \ell (v)+1\)

  9. (ii)

    For every \(e\in E'(v)\): \({{\,\mathrm{bid}\,}}(e)\leftarrow 0.5\cdot {{\,\mathrm{bid}\,}}(e)\)

  10. (e)

    For every active vertex, if \(\sum _{e\in E'(v)} {{\,\mathrm{bid}\,}}(e) \le \frac{1}{\alpha }\cdot 0.5^{\ell (v)+1} \cdot w(v)\), then send the message “raise” to every \(e\in E'(v)\); otherwise, send the message “stuck” to every \(e\in E'(v)\).

  11. (f)

    For every uncovered edge e, if all incoming messages are “raise”, set \({{\,\mathrm{bid}\,}}(e)\leftarrow \alpha \cdot {{\,\mathrm{bid}\,}}(e)\).

  12. (g)

    Send \({{\,\mathrm{bid}\,}}(e)\) to every \(v\in e\) (so it can track \(\delta (e)\)).

  • *Termination. Every vertex v terminates when either \(v\in C\) or every edge \(e\in E(v)\) is covered (i.e., \(E'(v)=\emptyset \)). Every edge e terminates when it is covered (i.e., \(e\cap C\ne \emptyset \)). We say that the algorithm has terminated if all the vertices and edges have terminated.

  • *Execution in congest. See Section A in the Appendix for a discussion of how Algorithm mwhvc is executed in the congest model.

4 Algorithm analysis

In this section, we analyze the approximation ratio and the running time of the algorithm. Throughout the analysis, we attach an index i to the variables \({{\,\mathrm{bid}\,}}_i(e), \delta _i(e)\) and \(\ell _i(v)\). The indexed variable refers to its value at the end of the i’th iteration.

4.1 Feasibility and approximation ratio

The following invariants are satisfied throughout the execution of the algorithm. In the following claim we bound the sum of the bids of edges incident to a vertex.

Claim 2

If \(v\not \in C\) at the end of iteration i, then

$$\begin{aligned} \sum _{e\in E'(v)} {{\,\mathrm{bid}\,}}{}_{i} (e) \le 0.5^{\ell _i(v)+1}\cdot w(v). \end{aligned}$$

Proof

The proof is by induction on i. For \(i=0\), the claim holds because \(\ell _0(v)=0\) and \({{\,\mathrm{bid}\,}}_0(e)\le 0.5\cdot w(v)/|E(v)|\). The induction step, for \(i\ge 1\), considers two cases. (A) If v sends a “raise” message in iteration i, then Step 3e implies that \(\sum _{e\in E'(v)} {{\,\mathrm{bid}\,}}_{i}(e)\le 0.5^{\ell (v)+1}\cdot w(v)\), as required. (B)  Suppose v sends a “stuck” message in iteration i. By Step 3d, \({{\,\mathrm{bid}\,}}_i(e)\le 0.5^{\ell _i(v)-\ell _{i-1}(v)}\cdot {{\,\mathrm{bid}\,}}_{i-1}(e)\) for every \(e\in E'(v)\). The induction hypothesis states that \(\sum _{e\in E'(v)} {{\,\mathrm{bid}\,}}_{i-1} (e) \le 0.5^{\ell _{i-1}(v)+1}\cdot w(v)\). The claim follows by combining these inequalities. \(\square \)

If an edge e is covered in iteration j, then e terminates and \(\delta _i(e)\) is not set for \(i\ge j\). In this case, we define \(\delta _i(e)=\delta _{j-1}(e)\), namely, the last value assigned to a dual variable.

Claim 3

For every \(i\ge 1\) and every a vertex v that has not terminated the following inequality holds:

$$\begin{aligned} w(v)(1-0.5^{\ell _i(v)}) \!\le \! \sum _{e\!\in \! E(v)}\delta _{i-1}(e)\!\le \!(1-0.5^{\ell _i(v)+1})\cdot w(v) \;. \end{aligned}$$
(1)

In addition

the dual variables \(\delta _i(e)\) constitute a feasible edge packing. Namely,

$$\begin{aligned} \sum _{e\in E(v)} \delta _i(e)&\le w(v)&\text {for every vertex }v\in V, \end{aligned}$$
(2)
$$\begin{aligned} \delta _i(e)&\ge 0&\text {for every edge }e\in E. \end{aligned}$$
(3)

Proof

We prove the claim by induction on the iteration number i. To simplify the proof, we reformulate the statement of the feasibility of the dual variables to \(i-1\), i.e., \(\sum _{e\in E(v)} \delta _{i-1}(e) \le w(v)\) and \(\delta _{i-1}(e)\ge 0\). We first prove the induction basis for \(i=1\).

Proof of Eq. 1 for \(i=1\).  Fix a vertex v. At the end of iteration 0, \(\ell _0(v)=0\) and \(0<{{\,\mathrm{bid}\,}}_0(e)\le w(v)/(2|E(v)|)\), for every \(e\in E(v)\). Hence \(0<\sum _{e\in E(v)} {{\,\mathrm{bid}\,}}_0(e) \le w(v)/2\). Because \(\delta _0(e)={{\,\mathrm{bid}\,}}_0(e)\), the condition in Step 3d does not hold, and \(\ell _1(v)=\ell _0(v)=0\). We conclude that Eq. 1 holds for \(i=1\).

Proof of feasibility of \(\delta _0(e)\) (Eq. (2) and  (3)) for \(i=1\). Non-negativity follows from the fact that \(\delta _0(e)={{\,\mathrm{bid}\,}}_0(e)>0\). The packing constraint for vertex v is satisfied because \(\sum _{e\in E(v)}{{\,\mathrm{bid}\,}}_0(e) \le w(v)/2\). This completes the proof of the induction basis.

We now prove the induction step assuming that Eq. 1 holds for \(\delta _{i-1}\).

Proof of Eq. 1 for \(i>1\). Since v is not in the cover it is also not \(\beta \)-tight. Step 3d in iteration i increases \(\ell (v)\) until Eq. 1 holds for i.

Proof of feasibility of \(\delta _{i-1}(e)\) (Eq. (2) and (3)) for \(i>1\).  Consider a vertex v. If v joins C in iteration \(i-1\), then \(\delta _{i-1}=\delta _{i-2}\), and the packing constraint of v holds by the induction hypothesis. Otherwise, then by \(\delta _{i-1}(e)=\delta _{i-2}(e)+{{\,\mathrm{bid}\,}}_{i-1}(e)\), Claim 2, and the induction hypothesis for Eq. 1, we have

$$\begin{aligned} \sum _{e\in E(v)} \delta _{i-1}(e)&= \sum _{e\in E(v)} \left( \delta _{i-2}(e)+\textstyle {{{\,\mathrm{bid}\,}}_{i-1}(e)}\right) \\&\le \left( 1-0.5^{\ell _{i-1}(v)+1}+0.5^{\ell _{i-1}(v)+1}\right) \cdot w(v)=w(v)\;. \end{aligned}$$

\(\square \)

Let \(\textsf {opt}\) denote the cost of an optimal (fractional) weighted vertex cover of G.

Corollary 1

Upon termination, the approximation ratio of Algorithm mwhvc is \(f+\varepsilon \).

Proof

Throughout the algorithm, the set C consists of \(\beta \)-tight vertices. By Claim 1, \(w(C)\le (f+\varepsilon )\cdot \textsf {opt}\). Upon termination, C constitutes a vertex cover (as an edge only terminate once covered), and the corollary follows. \(\square \)

4.2 Communication rounds analysis

In this section, we prove that the number of communication rounds of Algorithm mwhvc is bounded by (where \(\gamma >0\) is a constant, e.g., \(\gamma =0.001\))

$$\begin{aligned}&O\left( f\cdot \log (f/\varepsilon ) + \frac{\log \varDelta }{\gamma \cdot \log \log \varDelta }\right. \nonumber \\&\quad \left. +\min \left\{ \log \varDelta ,f\cdot \log (f/\varepsilon )\cdot (\log \varDelta )^{\gamma } \right\} \right) . \end{aligned}$$

It suffices to bound the number of iterations because each iteration consists of a constant number of communication rounds.

Let \(z\triangleq \lceil \log _2 {\frac{1}{\beta }} \rceil \). Note that \(z=O\left( \log ({f}/{\varepsilon })\right) \).

Claim 4

The level of every vertex is always less than z.

Proof

Assume that \(\ell _i(v)\ge z\). By Eq. 1, \(\sum _{e\in E(v)} \delta _{i-1}(e) \ge w(v)\cdot (1-2^{-z})\ge (1-\beta )\cdot w(v)\). This implies that v is \(\beta \)-tight and joins the cover in Line 3a before \(\ell _i(v)\) reaches z.

\(\square \)

4.2.1 Raise or stuck iterations

Definition 1

(e-raise and v-stuck iterations) An iteration \(i\ge 1\) is an e-raise iteration if in Line 3f we multiplied \({{\,\mathrm{bid}\,}}(e)\) by \(\alpha \). An iteration \(i\ge 1\) is a v-stuck iteration if v sent the message “stuck” in iteration i.

Note that if iteration i is a v-stuck iteration and \(v\in e\), then \({{\,\mathrm{bid}\,}}_i(e)\le {{\,\mathrm{bid}\,}}_{i-1}(e)\) and i is not an e-raise iteration.

We bound the number of e-raise iterations as follows.

Lemma 1

The number of e-raise iterations is bounded by \(\log _\alpha ( \varDelta \cdot 2^{f\cdot z})\).

Proof

Let \(v^*\) denote a vertex with minimum normalized weight in e (that is, \(v^*\in \text {argmin}_{v\in e}\{ w(v)/|E(v)|\)}). The first bid satisfies \({{\,\mathrm{bid}\,}}_0(e)= 0.5\cdot w(v^*)/|E(v^*)| \ge 0.5\cdot w(v^*)/\varDelta \). By Claim 2, \({{\,\mathrm{bid}\,}}_i(e)\le 0.5\cdot w(v^*)\). The bid is multiplied by \(\alpha \) in each e-raise iteration and is halved at most \(f \cdot z\) times. The bound on the number of halvings holds because the number of vertices in the edge is bounded by f, and each vertex halves the bid each time its level is incremented. The lemma follows. \(\square \)

We bound the number of v-stuck iterations as follows.

Lemma 2

For every vertex v and level \(\ell (v)\), the number of v-stuck iterations is bounded by \(\alpha \).

Proof

Notice that when v reached the level \(\ell (v)\), we had \(\sum _{e\in E(v)} \delta (e) \ge w(v)(1-0.5^{\ell (v)})\). The number of v-stuck iterations is then bounded by the number of times it can send a “stuck” message without reaching \(\sum _{e\in E(v)} \delta (e) > w(v)(1-0.5^{\ell (v)+1})\). Indeed, once this inequality holds, the level of v is incremented. Every stuck iteration implies, by Line 3e, that \(\sum _{e\in E'(v)} {{\,\mathrm{bid}\,}}(e) > \frac{1}{\alpha }\cdot 0.5^{\ell (v)+1} \cdot w(v)\). Therefore, we can bound the number of iteration by \( \frac{w(v)(1-0.5^{\ell (v)+1}) - w(v)(1-0.5^{\ell (v)})}{ \frac{1}{\alpha }\cdot 0.5^{\ell (v)+1} \cdot w(v)} = \alpha \). \(\square \)

4.2.2 Bound on the number of rounds

Theorem 1

For every \(\alpha \ge 2\), the number of iterations of Algorithm mwhvc is

$$\begin{aligned} O\left( \log _\alpha (\varDelta \cdot 2^{f\cdot z}) + f\cdot z\cdot \alpha \right) =O\left( \log _\alpha \varDelta + f\cdot \log (f/\varepsilon )\cdot \alpha \right) . \end{aligned}$$

Proof

Fix an edge e. We bound the number of iterations until e is covered. Every iteration is either an e-raise iteration or a v-stuck iteration for some \(v\in e\). Since e contains at most f vertices, we conclude that the number of iterations is bounded by the number of e-raise iterations plus the sum over \(v\in e\) of the number of v-stuck iterations. The theorem follows from Lemmas 1 and 2 . \(\square \)

In Theorem 2 we assume that all the vertices know the maximum degree \(\varDelta \) and that \(\varDelta \ge 3\). The assumption that the maximal degree \(\varDelta \) is known to all vertices is not required. Instead, each hyperedge e can compute a local maximum degree \(\varDelta (e)\), where \(\varDelta (e)\triangleq \max _{u\in e}|E(u)|\). The local maximum degree \(\varDelta (e)\) can be used instead of \(\varDelta \) to define local value of the multiplier \(\alpha =\alpha (e)\). Let \(T(f,\varDelta ,\varepsilon )\) denote the round complexity of Algorithm mwhvc. By setting \(\alpha \) appropriately, we bound the running time as follows.

Theorem 2

Footnote 4 Let \(\gamma >0\) denote a constant and set

$$\begin{aligned} \alpha&= {\left\{ \begin{array}{ll} \max \left( 2,\frac{\log \varDelta }{f\cdot \log (f/\varepsilon )\cdot \log \log \varDelta }\right) &{}\hbox {if }\frac{\log \varDelta }{f\cdot \log (f/\varepsilon )\cdot \log \log \varDelta }\ge (\log \varDelta )^{\gamma /2}\\ 2&{}\hbox {otherwise.} \end{array}\right. } \end{aligned}$$

Then, the round complexity of Algorithm mwhvc satisfies:

$$\begin{aligned} T(f,\varDelta ,\varepsilon )= & {} O \left( f\cdot \log (f/\varepsilon ) + \frac{\log \varDelta }{\log \log \varDelta }\right. \\&\left. \quad +\min \left\{ \log \varDelta ,f\cdot \log (f/\varepsilon )\cdot (\log \varDelta )^{\gamma } \right\} \right) . \end{aligned}$$

Proof

We prove the theorem using a case analysis. Case \(\alpha =\frac{\log \varDelta }{f\cdot \log (f/\varepsilon )\cdot \log \log \varDelta }\ge 2\) and \(\alpha \ge (\log \varDelta )^{\gamma /2}\).

This means that the runtime is bounded by \(O \big (\log _\alpha \varDelta + f\cdot \log (f/\varepsilon )\cdot \alpha \big ) = O\left( \frac{\log \varDelta }{\log \log \varDelta }\right) .\)

Case \(\alpha =2\) and \(2>\frac{\log \varDelta }{f\cdot \log (f/\varepsilon )\cdot \log \log \varDelta }\ge (\log \varDelta )^{\gamma /2}\).

Notice that in this case \(2> (\log \varDelta )^{\gamma /2}\) which implies that \(\varDelta \) is constant. Therefore, the round complexity of our algorithm is bounded by

$$\begin{aligned}O\left( \log \varDelta + f\cdot \log (f/\varepsilon )\right) =O\left( f\cdot \log (f/\varepsilon )\right) . \; \end{aligned}$$

Case \(\frac{\log \varDelta }{f\cdot \log (f/\varepsilon )\cdot \log \log \varDelta }< (\log \varDelta )^{\gamma /2}\).

This implies that

$$\begin{aligned} \log \varDelta \le&\min \left\{ \log \varDelta ,f\cdot \log (f/\varepsilon )\cdot (\log \varDelta )^{\gamma /2}\cdot \log \log \varDelta \right\} \\&=O\left( \min \left\{ \log \varDelta ,f\cdot \log (f/\varepsilon )\cdot (\log \varDelta )^{\gamma } \right\} \right) . \end{aligned}$$

Therefore, since \(\alpha =2\) in this case, the runtime is bounded by

$$\begin{aligned}&O\left( \log \varDelta + f\cdot \log (f/\varepsilon )\right) \\&\quad = O\left( f\cdot \log (f/\varepsilon ) + \min \left\{ \log \varDelta ,f\cdot \log (f/\varepsilon )\cdot (\log \varDelta )^{\gamma } \right\} \right) . \end{aligned}$$

\(\square \)

Let \(W\triangleq \max _v w(v)\). By setting \(\varepsilon =1/(nW)\), we conclude the following result for an f-approximation (recall that the vertex degrees and weights are polynomial in n):

Corollary 2

Algorithm mwhvc computes an f-approximation in \(O(f\log n)\) rounds.

Additionally, we get the following range of parameters for which the round complexity is optimal:

Corollary 3

Let \(f=O\left( (\log \varDelta )^{0.99}\right) \) and \(\varepsilon =(\log \varDelta )^{-O(1)}\). Then, Algorithm mwhvc computes an \((f+\varepsilon )\)-approximation in \(O\left( \frac{\log \varDelta }{ \log \log \varDelta }\right) \) rounds.

For \(f=O(1)\) we also get an extension of range of parameters for which the round complexity is optimal. This extension is almost exponential compared to the \(\varepsilon =(\log \varDelta )^{-O(1)}\) of [5].

Corollary 4

Let \(f=O(1)\) and \(\varepsilon =2^{-O\left( (\log \varDelta )^{0.99}\right) }\). Then our algorithm computes an \((f+\varepsilon )\)-approximation and terminates in \(O\left( \frac{\log \varDelta }{ \log \log \varDelta }\right) \) rounds.

5 Approximation of covering ILPs

In this section, we present a reduction from solving covering integer linear programs (ILPs) to mwhvc. This reduction implies that one can distributively compute approximate solutions to covering ILPs using a distributed algorithm for mwhvc.

Notation. Let \(\mathbb {N}\) denote the set of natural numbers, including 0. Let A denote a real \(m\times n\) matrix, \(\mathbf {b}\in \mathbb {R}^n\), and \(\mathbf {w}\in \mathbb {R}^n\). Let \(LP(A,\mathbf {b},\mathbf {w})\) denote the linear program \(\min ~\mathbf {w}^{T} \cdot \mathbf {x}\) subject to \(A\cdot \mathbf {x} \ge \mathbf {b}\) and \(\mathbf {x} \ge 0\). Let \(ILP(A,\mathbf {b},\mathbf {w})\) denote the integer linear program \(\min ~\mathbf {w}^{T} \cdot \mathbf {x}\) subject to \(A\cdot \mathbf {x} \ge \mathbf {b}\) and \(\mathbf {x} \in \mathbb {N}^n\).

Definition 2

The linear program \(LP(A,\mathbf {b},\mathbf {w})\) and integer linear program \(ILP(A,\mathbf {b},\mathbf {w})\) are covering programs if all the components in \(A,\mathbf {b}, \mathbf {w}\) are non-negative.

5.1 Distributed setting

We denote the number of rows of the matrix A by m and the number of columns by n. Let f(A) (resp., \(\varDelta (A)\)) denote the maximum number of nonzero entries in a row (resp., column) of A.

Given a covering ILP, \(ILP(A,\mathbf {b},\mathbf {w})\), the communication network N(ILP) over which the ILP is solved is the bipartite graph \(N=(X\times C, E)\), where: \(X=\{x_j\}_{j\in [n]}\), \(C=\{c_i\}_{i\in [m]}\), and \(E=\{(x_j,c_i)\mid A_{i,j}\ne 0\}\). We refer to the nodes in X as variable nodes, and to those in C as constraint nodes. Note that the maximum degree of a constraint node is f(A) and the maximum degree of a variable node is \(\varDelta (A)\).

We assume that the local input of every variable node \(x_j\) consists of \(w_j\) and the j’th column of A (i.e., \(A_{i,j}\), for \(i\in [m]\)). Every constraint vertex \(c_i\) is given the value of \(b_i\) as its local input. We assume that these inputs can be represented by \(O(\log (nm))\) bits. In \((f+1)\) rounds, every variable node \(v_j\) can learn all the components in the rows i of A such that \(A_{i,j}\ne 0\) as well as the component \(b_i\).

5.2 Zero-one covering programs

The special case in which a variable may be assigned only the values 0 or 1 is called a zero-one program. We denote the zero-one covering ILP induced by a matrix A and vectors \(\mathbf {b}\) and \(\mathbf {w}\) by \(ZO(A,\mathbf {b},\mathbf {z})\). Every instance of the mwhvc problem is a zero-one program in which the matrix A is the incidence matrix of the hypergraph. The following lemma deals with the converse reduction.

Lemma 3

Every feasible zero-one covering program \(ZO(A,\mathbf {b}, \mathbf {w})\) can be reduced to an mwhvc instance with rank \(f'<f(A)\) and degree \(\varDelta '< 2^{f(A)}\cdot \varDelta (A)\).

Proof

Let \(A_i\) denote the i’th row of the matrix A. For a subset \(S\subseteq [n]\), let \(I_S\) denote the indicator vector of S. Let \(\mathbf {x}\in \{0,1\}^n\) and let \(\sigma _i\triangleq \{j\in [n] \mid A_{i,j}\ne 0\}\). Feasibility implies that \(A_i\cdot I_{\sigma _i} \ge b_i\), for every row i. Let \(\mathcal {S}_i\) denote the set of all subsets \(S\subset [n]\) such that the indicator vector \(I_S\) does not satisfy the i’th constraint, i.e., \(A_i \cdot I_S < b_i\) The i’th constraint is not satisfied, i.e., \(A_i\cdot \mathbf {x}<b_i\), if and if only there exists a set \(S\in \mathcal {S}_i\) such that \(\mathbf {x}=I_S\). Hence, \(A_i\cdot \mathbf {x} < b_i\) if and only if the truth value of the following DNF formula is true: \(\varphi _i(x) \triangleq \bigvee _{S\in \mathcal {S}_i}~ \bigwedge _{j\in \sigma _i\setminus S} \text {not}({x}_j)\). By De Morgan’s law, we obtain that \(\text {not}(\varphi _i(x))\) is equivalent to a monotone CNF formula \(\psi _i(x)\) such that \(\psi _i(x)\) has less than \(2^{f(A)}\) clauses, each of which has length less than f(A). We now construct the hypergraph H for the mwhvc instance as follows. For every row i and every \(S\in \mathcal {S}_i\), add the hyperedge \(e_{i,S}=\sigma _i\setminus S_i\). (Feasibility implies that \(e_{i,S}\) is not empty.) Given a vertex cover C of the hypergraph, every hyperedge is stabbed by C, and hence \(I_C\) satisfies all the formulae \(\psi _i(x)\), where \(i\in [m]\). Hence, \(A_i\cdot I_C \ge b_i\), for every i. The converse direction holds as well, and the lemma follows. \(\square \)

How does N(ILP) (a bipartite graph with \(m+n\) vertices) simulate the execution of mwhvc over the hypergraph H? Each variable node \(x_j\) simulates all hyperedges \(e_{i,S}\), where \(j\in \sigma _i\) and \(S\in \mathcal {S}_i\). First, the variable nodes exchange their weights with the variables nodes they share a constraint with in O(f(A)) rounds – first every \(x_j\) broadcasts its own weight and then each \(c_i\) sends all neighbors the weights it received. At each iteration, the variable node sends a raise/stuck message and whether its level was incremented.Footnote 5 Notice that the number of rounds required for each such iteration is \(O(1+f(A)/\log n)\) (i.e., constant for \(f(A)=O(\log n)\)). Each edge node \(c_i\) then broadcasts to all vertices two f(A)-bit messages that indicate two subsets of vertices of the edge: those that sent a raise message (the complement sent a stuck message) and those that incremented their level. Each variable node \(x_j\) knows how to update its bid with every \(e_{i,S}\) for which \(j\in \sigma _i\) and \(S\in \mathcal {S}_i\).

We summarize the complexity of the distributed algorithm for solving a zero-one covering program.

Claim 5

Denoting by \(T(f,\varDelta ,\varepsilon )\) is the running time of Algorithm mwhvc, There exists a distributed congest algorithm for computing an \((f+\varepsilon )\)-approximate solution for zero-one covering programs with running time of \(O({(1+f(A)/\log n)\cdot } T(f(A), 2^{f(A)}\cdot \varDelta (A),\varepsilon ))\).

5.3 Reduction of covering ILPs to zero-one covering

Consider the covering ILP \(ILP(A,\mathbf {b},\mathbf {w})\). We present a reduction of the ILP to a zero-one covering program.

Definition 3

Define \(M(A,\mathbf {b})\!\triangleq \! \max _j \max _i \left\{ \frac{b_i}{A_{i,j}}\mid A_{i,j}\!\ne \! 0 \right\} \).

We abbreviate and write M for \(M(A,\mathbf {b})\) when the context is clear.

Proposition 1

Limiting \(\mathbf {x}\) to the box \([0,M]^{n}\) does not increase the optimal value of the ILP.

Claim 6

Every covering ILP, \(ILP(A,\mathbf {b},\mathbf {w})\), can be approximated by a zero-one covering program \(ZO(A',\mathbf {b},\mathbf {w'})\), where \(f(A')\le f(A)\cdot \lceil \log _2(M)+1 \rceil \) and \(\varDelta (A')=\varDelta (A)\).

Proof

Let \(B = \lceil \log _2 M \rceil \). Limiting each variable \(x_j\) by M means that we can replace \(x_j\) by B zero-one variables \(\left\{ x_{j,\ell } \right\} _{\ell =0}^{B-1}\) that correspond to the binary representation of \(x_j\), i.e., \(x_j=\sum _{\ell =0}^{B-1} 2^{\ell }\cdot x_{j,\ell }\). This replacement means that the dimensions of the matrix \(A'\) are \(m\times n'\), where \(n'=n\cdot B\). The j’th column \(A^{(j)}\) of A is replaced by B columns, indexed 0 to \(B-1\), where the \(\ell \)’th column equals \(2^{\ell }\cdot A^{(j)}\). The vector \(\mathbf {w'}\) is obtained by duplicating and scaling the entries of \(\mathbf {w}\) in the same fashion. \(\square \)

Combining Claims 5 and 6 , we obtain the following result.

Theorem 3

There exists a distributed congest algorithm for computing an \((f(A')+\varepsilon )=(f(A)\cdot \lceil \log _2(M)+1 \rceil +\varepsilon )\)-approximate solution for covering integer linear programs \(ILP(A,\mathbf {b},\mathbf {w})\) with running time of \(O({(1+f(A)/\log n)\cdot } T(f(A)\cdot \log M, 2^{f(A)}\cdot M\cdot \varDelta (A),\varepsilon ))\), where \(T(f,\varDelta ,\varepsilon )\) is the running time of Algorithm mwhvc.Footnote 6

Proof

Let \(f_{HVC}, f_{ZO}, f_{ILP}\) denote the ranks of the hypergraph vertex cover instance, zero-one covering program, and ILP, respectively. We use the same notation for maximum degrees. The reduction of zero-one programs to mwhvc in Lemma 3 implies that \(f_{HVC}\le f_{ZO}\) and \(\varDelta _{HVC}\le 2^{f_{ZO}}\cdot \varDelta _{ZO}\). The reduction of covering ILPs to zero-one programs in Claim 6 implies that \(f_{ZO}\le f_{ILP}\cdot (1+\log M)\) and \(\varDelta _{ZO}\le \varDelta _{ILP}\). The composition of the reductions gives \(f_{HVC}\le f_{ILP}\cdot (1+\log M)\) and \(\varDelta _{HVC}\le 2^{f_{ILP}\cdot (1+\log M)}\cdot \varDelta _{ILP}=2^{f_{ILP}}\cdot 2M\cdot \varDelta _{ILP}\). \(\square \)

After some simplifications, the running time of the resulting algorithm for \((f\cdot \lceil \log _2(M)+1 \rceil +\varepsilon )\)-approximate integer covering linear programs is

$$\begin{aligned}&O\left( (1+f/\log n)\right. \\&\quad \left. \cdot \left( \frac{\log \varDelta }{ \log \log \varDelta } + \left( f\cdot \log M\right) ^{1.01}\cdot \log \varepsilon ^{-1}\cdot (\log \varDelta )^{0.01}\right) \right) \end{aligned}$$