Approximation Algorithms: Covering problems

These are lecture notes from Chandra Chekuri's CS583 course on Approximation Algorithms. This is Chapter 2: Covering Problems; see also Chapter 1: Introduction and Chapter 3: Knapsack.
Parts of these notes were scribed by Abul Hassan Samee and Lewis Tseng.
Packing and covering problems together capture many important problems in combinatorial optimization. We will discuss several covering problems in this chapter. Two canonical problems are Minimum Vertex Cover and its generalization Minimum Set Cover. (Typically we will omit the qualifiers minimum and maximum since they are often clear from the definition of the problem and the context.) They play an important role in the study of approximation algorithms.
A vertex cover in an undirected graph $G=(V,E)$ is a set $S \subseteq V$ of vertices such that for each edge $e \in E$, at least one of its endpoints is in $S$. It is also called a node cover. In the Vertex Cover problem, our goal is to find a smallest vertex cover of $G$. In the weighted version of the problem, a weight function $w: V \rightarrow \mathbb{R}^{+}$ is given, and our goal is to find a minimum weight vertex cover of $G$. The unweighted version of the problem is also known as Cardinality Vertex Cover. Note that we are picking vertices to cover the edges. Vertex Cover is NP-Hard and is on Karp's list of NP-Complete problems.
In the Set Cover problem the input is a set $\mathcal{U}$ of $n$ elements, and a collection $\mathcal{S}=\{S_1, S_2, \ldots, S_m\}$ of $m$ subsets of $\mathcal{U}$ such that $\bigcup_i S_i = \mathcal{U}$. Our goal in the Set Cover problem is to select as few subsets as possible from $\mathcal{S}$ such that their union covers $\mathcal{U}$. In the weighted version each set $S_i$ has a non-negative weight $w_i$, and the goal is to find a set cover of minimum weight. Closely related to the Set Cover problem is the Maximum Coverage problem. Here the input is again $\mathcal{U}$ and $\mathcal{S}$, but we are also given an integer $k \leq m$. The goal is to select $k$ subsets from $\mathcal{S}$ such that their union has maximum cardinality. Note that Set Cover is a minimization problem while Maximum Coverage is a maximization problem. Set Cover is essentially equivalent to the Hitting Set problem. In Hitting Set the input is $\mathcal{U}$ and $\mathcal{S}$, but the goal is to pick the smallest number of elements of $\mathcal{U}$ that cover the given sets in $\mathcal{S}$; in other words, we are seeking a set cover in the dual set system. It is easy to see that Vertex Cover is a special case of Set Cover.
Set Cover is an important problem in discrete optimization. In the standard definition the set system is given explicitly. In many applications the set system is implicit, and often exponential in the explicit part of the input; nevertheless such set systems are ubiquitous and one can often obtain exact or approximation algorithms. As an example consider the well known MST problem in graphs. One way to phrase MST is the following: given an edge-weighted graph $G=(V,E)$, find a minimum cost subset of the edges that covers all the cuts of $G$; by covering the cut defined by $S \subseteq V$ we mean that at least one of the edges in $\delta(S)$ must be chosen. This may appear to be a strange way of looking at the MST problem, but this view is useful as we will see later. Another implicit example is the following. Suppose we are given $n$ rectangles in the plane and the goal is to choose a minimum number of points in the plane such that each input rectangle contains one of the chosen points. This is perhaps more natural to view as a special case of the Hitting Set problem. In principle the set of points that we can choose from is infinite, but one can confine attention to the vertices in the arrangement of the given rectangles, and it is easy to see that there are only $O(n^2)$ such vertices. However, explicitly computing them may be expensive, and one may want to treat the problem as an implicit one for the sake of efficiency.
Covering problems have the feature that a superset of a feasible solution is also a feasible solution. More abstractly, one can cast covering problems as follows. We are given a finite ground set $V$ (vertices in a graph or sets in a set system) and a family of feasible solutions $\mathcal{I} \subseteq 2^{V}$ where $\mathcal{I}$ is upward closed; by this we mean that if $A \in \mathcal{I}$ and $A \subset B$ then $B \in \mathcal{I}$. The goal is to find a smallest cardinality set $A$ in $\mathcal{I}$. In the weighted case $V$ has weights and the goal is to find a minimum weight set in $\mathcal{I}$. In some cases one can also consider more complex non-additive objectives that assign a cost $c(S)$ to each $S \in \mathcal{I}$.

2.1 Greedy for Set Cover and Maximum Coverage

In this section we consider the unweighted version of Set Cover.

2.1.1 Greedy Algorithm

A natural greedy approximation algorithm for these problems is easy to come up with.
Greedy Cover($\mathcal{U}, \mathcal{S}$):
  1. repeat
    • A. pick the set that covers the maximum number of uncovered elements
    • B. mark the elements in the chosen set as covered
  2. until done
In the case of Set Cover, the algorithm Greedy Cover is done when all the elements of $\mathcal{U}$ have been covered. In the case of Maximum Coverage, the algorithm is done when exactly $k$ subsets have been selected from $\mathcal{S}$.
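To make the pseudocode concrete, here is a minimal Python sketch of Greedy Cover under both stopping rules; the instance representation (a universe `U` and a list of Python sets whose union is `U`) is an assumption for illustration.

```python
# A minimal sketch of Greedy Cover (unweighted). greedy_set_cover stops when
# all of U is covered; greedy_max_coverage stops after exactly k picks.

def greedy_set_cover(U, sets):
    uncovered = set(U)
    chosen = []
    while uncovered:
        # pick the set covering the most uncovered elements
        j = max(range(len(sets)), key=lambda t: len(sets[t] & uncovered))
        chosen.append(j)
        uncovered -= sets[j]
    return chosen

def greedy_max_coverage(U, sets, k):
    uncovered = set(U)
    chosen = []
    for _ in range(k):
        j = max(range(len(sets)), key=lambda t: len(sets[t] & uncovered))
        chosen.append(j)
        uncovered -= sets[j]
    return chosen
```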
We will prove the following theorem.
Theorem 2.1. Greedy Cover is a $1-(1-1/k)^{k} \geq (1-\frac{1}{e}) \simeq 0.632$ approximation for Maximum Coverage, and a $(\ln n + 1)$ approximation for Set Cover.
The following theorem due to Feige [1] implies that Greedy Cover is essentially the best possible in terms of the approximation ratio that it guarantees.
Theorem 2.2. Unless $\mathbf{NP} \subseteq \mathbf{DTIME}(n^{O(\log \log n)})$, there is no $(1-o(1)) \ln n$ approximation for Set Cover. Unless $P = NP$, for any fixed $\epsilon > 0$, there is no $(1-\frac{1}{e}-\epsilon)$ approximation for Maximum Coverage.
Recently the preceding theorem has been strengthened so that the hardness holds under the assumption that $NP \neq P$ [2].

2.1.2 Analysis of Greedy Cover

We prove Theorem 2.1 by analyzing Greedy Cover separately for Set Cover and Maximum Coverage.
Analysis for Maximum Coverage
Let $\mathrm{OPT}$ denote the value of an optimal solution to the Maximum Coverage problem; this is the maximum number of elements that can be covered by $k$ sets in the given set system. Let $x_i$ denote the number of new elements covered by the $i$-th set chosen by Greedy Cover. Let $y_i = \sum_{j=1}^{i} x_j$ be the total number of elements covered after $i$ iterations, and let $z_i = \mathrm{OPT} - y_i$. Note that, according to our notation, $y_0 = 0$, $y_k$ is the number of elements covered by Greedy Cover at the end of the algorithm, and $z_0 = \mathrm{OPT}$. The key to the analysis is the following simple claim.
Claim 2.1.1. For $0 \leq i < k$, $x_{i+1} \geq \frac{z_i}{k}$.
Proof. Let $F^* \subseteq \mathcal{U}$ be the set of elements covered by some fixed optimum solution; we have $|F^*| = \mathrm{OPT}$. Consider iteration $i+1$. Greedy Cover selects the set whose inclusion covers the maximum number of uncovered elements. Since $y_i$ is the total number of elements covered up to the end of iteration $i$, at least $\mathrm{OPT} - y_i = z_i$ elements of $F^*$ are uncovered. Let $F_i^*$ be the set of uncovered elements of $F^*$ at the end of iteration $i$; thus $|F_i^*| \geq z_i$. Since $k$ sets together cover $F^*$, and hence $F_i^*$ as well, there must be some set in that collection of $k$ sets that covers at least $|F_i^*|/k$ uncovered elements. This is a candidate set that can be chosen in iteration $i+1$. Since the algorithm picks the set that covers the maximum number of uncovered elements, the chosen set in iteration $i+1$ covers at least $|F_i^*|/k \geq \frac{z_i}{k}$ uncovered elements. Hence, $x_{i+1} \geq \frac{z_i}{k}$.
Remark 2.1. It is tempting to make the stronger claim that $x_{i+1} \geq \frac{z_i}{k-i}$. This is however false, and it is worthwhile to come up with an example.
By definition, $y_k = x_1 + x_2 + \ldots + x_k$ is the total number of elements covered by Greedy Cover. To analyze the worst case, we want to make this sum as small as possible subject to the preceding claim. Heuristically (and one can formalize this), choosing $x_{i+1} = z_i/k$ minimizes the sum, and using this one can argue that the sum is at least $(1-(1-1/k)^k)\,\mathrm{OPT}$. We give a formal argument now.
Claim 2.1.2. For $i \geq 0$, $z_i \leq \left(1-\frac{1}{k}\right)^i \cdot \mathrm{OPT}$.
Proof. By induction on $i$. The claim is trivially true for $i=0$ since $z_0 = \mathrm{OPT}$. We assume inductively that $z_i \leq (1-\frac{1}{k})^i \cdot \mathrm{OPT}$. Then
$$\begin{aligned} z_{i+1} &= z_i - x_{i+1} \\ &\leq z_i \left(1 - \frac{1}{k}\right) \qquad \text{[using Claim 2.1.1]} \\ &\leq \left(1 - \frac{1}{k}\right)^{i+1} \cdot \mathrm{OPT}. \end{aligned}$$
The preceding claims yield the following lemma for Greedy Cover applied to Maximum Coverage.
Lemma 2.1. Greedy Cover is a $1-(1-1/k)^k \geq 1-\frac{1}{e}$ approximation for Maximum Coverage.
Proof. It follows from Claim 2.1.2 that $z_k \leq (1-\frac{1}{k})^k \cdot \mathrm{OPT} \leq \frac{\mathrm{OPT}}{e}$. Hence, $y_k = \mathrm{OPT} - z_k \geq (1-\frac{1}{e}) \cdot \mathrm{OPT}$.
We note that $(1-1/e) \simeq 0.632$.
Analysis for Set Cover
Let $k^*$ denote the value of an optimal solution to the Set Cover problem. Then the value of an optimal solution to the Maximum Coverage problem on the same set system with $k = k^*$ would be $n = |\mathcal{U}|$, since it is possible to cover all $n$ elements of $\mathcal{U}$ with $k^*$ sets. From our previous analysis, $z_{k^*} \leq \frac{n}{e}$. Therefore, at most $\frac{n}{e}$ elements remain uncovered after the first $k^*$ steps of Greedy Cover. Similarly, after $2k^*$ steps of Greedy Cover, at most $\frac{n}{e^2}$ elements remain uncovered. This intuition suggests that Greedy Cover is a $(\ln n + 1)$ approximation for the Set Cover problem. A more formal proof is given below.
Lemma 2.2. Greedy Cover is a $(\ln n + 1)$ approximation for Set Cover.
Proof. Since $z_i \leq (1-\frac{1}{k^*})^i \cdot n$, after $t = \lceil k^* \ln \frac{n}{k^*} \rceil$ steps,
$$z_t \leq n \left(1 - 1/k^*\right)^{k^* \ln \frac{n}{k^*}} \leq n e^{-\ln \frac{n}{k^*}} \leq k^*.$$
Thus, after $t$ steps, at most $k^*$ elements are left to be covered. Since Greedy Cover covers at least one new element in each step, it covers all the elements after picking at most $\lceil k^* \ln \frac{n}{k^*} \rceil + k^* \leq k^*(\ln n + 1)$ sets.
A useful special case of Set Cover is when all sets are "small". Does the approximation bound for Greedy improve? We can prove the following corollary via the proof of Lemma 2.2.
Corollary 2.3. If each set in the set system has at most $d$ elements, then Greedy Cover is a $(\ln d + 1)$ approximation for Set Cover.
Proof. If each set has at most $d$ elements then $k^* \geq \frac{n}{d}$, and hence $\ln \frac{n}{k^*} \leq \ln d$. The claim then follows from the proof of Lemma 2.2.
Theorem 2.1 follows directly from Lemmas 2.1 and 2.2.
A near-tight example for Greedy Cover when applied to Set Cover: Consider a set $\mathcal{U}$ of $n$ elements along with a collection $\mathcal{S}$ of $k+2$ subsets $\{R_1, R_2, C_1, C_2, \ldots, C_k\}$ of $\mathcal{U}$. Assume that $|C_i| = 2^i$ and $|R_1 \cap C_i| = |R_2 \cap C_i| = 2^{i-1}$ for $1 \leq i \leq k$, as illustrated in Fig. 2.1.
Clearly, the optimal solution consists of only two sets, namely $R_1$ and $R_2$; hence $\mathrm{OPT} = 2$. However, Greedy Cover will pick each of the remaining $k$ sets, in the order $C_k, C_{k-1}, \ldots, C_1$. Since $n = 2 \sum_{i=0}^{k-1} 2^i = 2(2^k - 1)$, we get $k = \Omega(\log n)$. One can construct tighter examples with more involved analysis.
Figure 2.1: A near-tight example for Greedy Cover when applied on Set Cover
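For concreteness, here is a small Python sketch (under the same representation as the earlier sketch) that builds this instance and runs greedy; for every $k$ it picks the $k$ sets $C_k, \ldots, C_1$, while the optimum $\{R_1, R_2\}$ has size 2.

```python
# Builds the near-tight instance: C_1,...,C_k are disjoint, |C_i| = 2^i,
# and each C_i is split evenly between R1 and R2.
def bad_instance(k):
    R1, R2, Cs, nxt = set(), set(), [], 0
    for i in range(1, k + 1):
        block = list(range(nxt, nxt + 2 ** i))
        nxt += 2 ** i
        Cs.append(set(block))
        R1.update(block[: 2 ** (i - 1)])   # half of C_i goes to R1
        R2.update(block[2 ** (i - 1):])    # the other half to R2
    return set(range(nxt)), [R1, R2] + Cs

def greedy(U, sets):
    uncovered, picks = set(U), []
    while uncovered:
        j = max(range(len(sets)), key=lambda t: len(sets[t] & uncovered))
        picks.append(j)
        uncovered -= sets[j]
    return picks

U, sets = bad_instance(6)
print(greedy(U, sets))  # [7, 6, 5, 4, 3, 2]: greedy takes C_6,...,C_1, never R1 or R2
```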
Exercise 2.1. Consider the weighted version of the Set Cover problem where a weight function $w: \mathcal{S} \rightarrow \mathbb{R}^{+}$ is given, and we want to select a collection $\mathcal{S}'$ of subsets from $\mathcal{S}$ such that $\cup_{X \in \mathcal{S}'} X = \mathcal{U}$ and $\sum_{X \in \mathcal{S}'} w(X)$ is minimized. One can generalize the greedy heuristic in the natural fashion: in each iteration the algorithm picks the set that maximizes the ratio of the number of new elements it covers to its weight. Adapt the unweighted analysis to prove that the greedy algorithm yields an $O(\ln n)$ approximation for the weighted version (you can be sloppy with the constant in front of $\ln n$).

2.1.3 Dominating Set

A dominating set in a graph $G=(V,E)$ is a set $S \subseteq V$ such that for each $u \in V$, either $u \in S$ or some neighbor $v$ of $u$ is in $S$. In other words, $S$ covers/dominates all the nodes of $V$. In the Dominating Set problem, the input is a graph $G$ and the goal is to find a smallest dominating set of $G$.
Exercise 2.2. 1. Show that Dominating Set is a special case of Set Cover.
  2. What is the greedy heuristic when applied to Dominating Set? Prove that it yields a $(\ln(\Delta+1)+1)$ approximation, where $\Delta$ is the maximum degree in the graph.
  3. Show that Set Cover can be reduced in an approximation preserving fashion to Dominating Set. More formally, show that if Dominating Set has an $\alpha(n)$-approximation, where $n$ is the number of vertices in the given instance, then Set Cover has a $(1-o(1))\alpha(n)$-approximation.

2.2 Vertex Cover

We have already seen that the Vertex Cover problem is a special case of the Set Cover problem. The Greedy algorithm, when specialized to Vertex Cover, picks a highest degree vertex, removes it and the edges it covers from the graph, and recurses on the remaining graph. It follows that the Greedy algorithm gives a $(\ln \Delta + 1)$ approximation for the unweighted version of the Vertex Cover problem. One can wonder whether the Greedy algorithm has a better worst case for Vertex Cover than this analysis suggests. Unfortunately the answer is negative, and there are examples where the algorithm outputs a solution with $\Omega(\log n \cdot \mathrm{OPT})$ vertices.
We sketch the construction. Consider a bipartite graph $G=(U, V, E)$ where $U = \{u_1, u_2, \ldots, u_h\}$ and $V$ is partitioned into $S_1, S_2, \ldots, S_h$, where $S_i$ has $\lfloor h/i \rfloor$ vertices. Each vertex $v \in S_i$ is connected to exactly $i$ distinct vertices of $U$, in such a way that any vertex $u_j$ is incident to at most one edge from $S_i$; thus the degree of each vertex $u_j \in U$ is roughly $h$. Clearly $U$ is a vertex cover of $G$ since the graph is bipartite. Convince yourself that the Greedy algorithm will pick all of $V$, starting with the lone vertex in $S_h$ (one may need to break ties to make this happen, but the example can be easily perturbed to make this unnecessary). We have $n = \Theta(h \log h)$ and $\mathrm{OPT} \leq h$, and Greedy outputs a solution of size $\Omega(h \log h)$.

2.2.1 A 2-approximation for Vertex Cover

There is a very simple $2$-approximation algorithm for the Cardinality Vertex Cover problem.
Matching-VC($G$):
  1. $S \leftarrow \emptyset$
  2. Compute a maximal matching $M$ in $G$
  3. for each edge $(u,v) \in M$ do
    • A. add both $u$ and $v$ to $S$
  4. Output $S$
Theorem 2.4. Matching-VC is a $2$-approximation algorithm.
The proof of Theorem 2.4 follows from two simple claims.
Claim 2.2.1. Let $\mathrm{OPT}$ be the size of an optimal vertex cover. Then $\mathrm{OPT} \geq |M|$.
Proof. Any vertex cover must contain at least one endpoint of each edge in $M$, and since no two edges in $M$ share an endpoint, these endpoints are distinct. Hence $\mathrm{OPT} \geq |M|$.
Claim 2.2.2. Let $S(M) = \{u, v \mid (u,v) \in M\}$. Then $S(M)$ is a vertex cover.
Proof. If $S(M)$ is not a vertex cover, then there must be an edge $e \in E$ such that neither of its endpoints is in $S(M)$. But then $e$ could be added to $M$, which contradicts the maximality of $M$.
We now finish the proof of Theorem 2.4. Since $S(M)$ is a vertex cover, Claim 2.2.1 implies that $|S(M)| = 2|M| \leq 2 \cdot \mathrm{OPT}$.
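Here is a compact Python sketch of Matching-VC, assuming the graph is given as an edge list; scanning the edges and taking any edge with both endpoints unmatched yields a maximal matching.

```python
# A sketch of Matching-VC: greedily build a maximal matching and output
# the set of matched endpoints.
def matching_vc(edges):
    S = set()
    for u, v in edges:
        if u not in S and v not in S:  # edge (u,v) is uncovered: add it to M
            S.update((u, v))
    return S

# on the path 1-2-3-4 the output is {1, 2, 3, 4}, while OPT = {2, 3};
# the factor of 2 in Theorem 2.4 is tight here
print(matching_vc([(1, 2), (2, 3), (3, 4)]))
```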
Weighted Vertex Cover: The matching based heuristic does not generalize in a straightforward fashion to the weighted case, but $2$-approximation algorithms for the Weighted Vertex Cover problem can be designed based on LP rounding.

2.2.2 Set Cover with small frequencies

Vertex Cover is an instance of Set Cover in which each element of $\mathcal{U}$ is in at most two sets (in fact, each element is in exactly two sets). This special case of the Set Cover problem admits a 2-approximation algorithm. What happens if every element is contained in at most three sets? More generally, given an instance of Set Cover, for each $e \in \mathcal{U}$ let $f(e)$ denote the number of sets containing $e$. Let $f = \max_e f(e)$, which we call the maximum frequency.
Exercise 2.3. Give an $f$-approximation for Set Cover, where $f$ is the maximum frequency of an element. Hint: Follow the approach used for Vertex Cover.

2.3 Vertex Cover via LP

Let $G=(V,E)$ be an undirected graph with vertex weights $w: V \rightarrow \mathbb{R}^{+}$. We can formulate Vertex Cover as an integer linear program as follows. For each vertex $v$ we have a variable $x_v$, interpreted as follows: $x_v = 1$ if $v$ is chosen to be included in the vertex cover, and $x_v = 0$ otherwise. With this interpretation, the minimum weight vertex cover can be formulated as the following integer linear program.
$$\begin{aligned} \min \quad & \sum_{v \in V} w_v x_v \\ \text{subject to} \quad & x_u + x_v \geq 1 \quad \forall e = (u,v) \in E \\ & x_v \in \{0,1\} \quad \forall v \in V \end{aligned}$$
However, solving the preceding integer linear program is NP-Hard since it would solve Vertex Cover exactly. Therefore we use linear programming (LP) to approximate $\mathrm{OPT}(I)$, the optimal value of the integer program. First, we relax the constraint $x_v \in \{0,1\}$ to $x_v \in [0,1]$. This can be further simplified to $x_v \geq 0$ for all $v \in V$, since an optimal solution to this minimization problem never needs the upper bound.
Thus, a linear programming relaxation for Vertex Cover is:
$$\begin{aligned} \min \quad & \sum_{v \in V} w_v x_v \\ \text{subject to} \quad & x_u + x_v \geq 1 \quad \forall e = (u,v) \in E \\ & x_v \geq 0 \quad \forall v \in V \end{aligned}$$
We now use the following algorithm:
Vertex Cover via LP:
  1. Solve the LP to obtain an optimal fractional solution $x^*$
  2. Let $S = \{v \mid x_v^* \geq \frac{1}{2}\}$
  3. Output $S$
Claim 2.3.1. $S$ is a vertex cover.
Proof. Consider any edge $e = (u,v)$. By feasibility of $x^*$, $x_u^* + x_v^* \geq 1$, and thus $x_u^* \geq \frac{1}{2}$ or $x_v^* \geq \frac{1}{2}$. Therefore, at least one of $u$ and $v$ will be in $S$.
Claim 2.3.2. $w(S) \leq 2\,\mathrm{OPT}_{LP}(I)$.
Proof. $\mathrm{OPT}_{LP}(I) = \sum_v w_v x_v^* \geq \frac{1}{2} \sum_{v \in S} w_v = \frac{1}{2} w(S)$.
Since $S$ is a feasible vertex cover, it also follows that $\mathrm{OPT}_{LP}(I) \geq \frac{\mathrm{OPT}(I)}{2}$ for all instances $I$.
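To see the algorithm end to end, here is a sketch assuming `scipy`'s `linprog` as the LP solver; the instance encoding (vertices $0, \ldots, n-1$, an edge list, and a weight vector) is an assumption for illustration.

```python
# LP rounding for weighted Vertex Cover: solve the relaxation, then keep
# every vertex with x_v* >= 1/2.
import numpy as np
from scipy.optimize import linprog

def vc_lp_round(n, edges, w):
    # each edge gives x_u + x_v >= 1, encoded as -x_u - x_v <= -1
    A = np.zeros((len(edges), n))
    for r, (u, v) in enumerate(edges):
        A[r, u] = A[r, v] = -1.0
    res = linprog(c=w, A_ub=A, b_ub=-np.ones(len(edges)),
                  bounds=[(0, 1)] * n, method="highs")
    x = res.x  # optimal fractional solution x*
    return [v for v in range(n) if x[v] >= 0.5 - 1e-9]

# triangle with unit weights: LP optimum is (1/2, 1/2, 1/2) of value 3/2;
# rounding outputs all three vertices, of weight 3 <= 2 * OPT_LP
print(vc_lp_round(3, [(0, 1), (1, 2), (0, 2)], [1.0, 1.0, 1.0]))
```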
Remark 2.2. For minimization problems, $\mathrm{OPT}_{LP}(I) \leq \mathrm{OPT}(I)$, where $\mathrm{OPT}_{LP}(I)$ is the optimal value of the LP relaxation; for maximization problems, $\mathrm{OPT}_{LP}(I) \geq \mathrm{OPT}(I)$.
Integrality Gap: We introduce the notion of the integrality gap to capture the best approximation guarantee we can obtain if we only use the LP value as a lower bound.
Definition 2.5. For a minimization problem $\Pi$, the integrality gap of a linear programming relaxation/formulation $LP$ for $\Pi$ is $\sup_{I \in \Pi} \frac{\mathrm{OPT}(I)}{\mathrm{OPT}_{LP}(I)}$.
That is, the integrality gap is the worst case ratio, over all instances $I$ of $\Pi$, of the integral optimal value to the fractional optimal value. Note that different linear programming formulations for the same problem may have different integrality gaps.
Claims 2.3.1 and 2.3.2 show that the integrality gap of the Vertex Cover LP formulation above is at most $2$.
Question: Is this bound tight for the Vertex Cover LP?
Consider the following example: take the complete graph $K_n$ on $n$ vertices with $w_v = 1$ for each vertex. It is clear that we have to choose $n-1$ vertices to cover all the edges; thus $\mathrm{OPT}(K_n) = n-1$. However, $x_v = \frac{1}{2}$ for each $v$ is a feasible LP solution of total weight $\frac{n}{2}$. So the gap on this family of instances is at least $\frac{n-1}{n/2} = 2 - \frac{2}{n}$, which tends to $2$ as $n \rightarrow \infty$. One can also prove that the integrality gap is essentially $2$ even on some classes of sparse graphs.
Exercise 2.4. The Vertex Cover problem can be solved optimally in polynomial time in bipartite graphs. In fact the LP is integral. Prove this via the maxflow-mincut theorem and the integrality of flows when capacities are integral.
Other Results on Vertex Cover
  1. The current best approximation ratio for Vertex Cover is $2 - \Theta(\frac{1}{\sqrt{\log n}})$ [3].
  2. It is known that unless $P = NP$ there is no $\alpha$-approximation for Vertex Cover for any $\alpha < 1.36$ [4]. Under a stronger hypothesis called the Unique Games Conjecture, it is known that there is no $2-\epsilon$ approximation for any fixed $\epsilon > 0$ [5].
  3. There is a polynomial time approximation scheme (PTAS), that is, a $(1+\epsilon)$ approximation for any fixed $\epsilon > 0$, for planar graphs. This follows from a general approach due to Baker [6]. The result extends to more general classes of graphs.

2.4 Set Cover via LP

The input to the Set Cover problem consists of a finite set $U = \{1, 2, \ldots, n\}$ and $m$ subsets $S_1, S_2, \ldots, S_m$ of $U$. Each set $S_j$ has a non-negative weight $w_j$, and the goal is to find a minimum weight collection of sets that covers all elements of $U$ (in other words, their union is $U$).
A linear programming relaxation for Set Cover is:
$$\begin{aligned} \min \quad & \sum_j w_j x_j \\ \text{subject to} \quad & \sum_{j: i \in S_j} x_j \geq 1 \quad \forall i \in \{1, 2, \ldots, n\} \\ & x_j \geq 0 \quad 1 \leq j \leq m \end{aligned}$$
And its dual is:
$$\begin{aligned} \max \quad & \sum_{i=1}^n y_i \\ \text{subject to} \quad & \sum_{i \in S_j} y_i \leq w_j \quad \forall j \in \{1, 2, \ldots, m\} \\ & y_i \geq 0 \quad \forall i \in \{1, 2, \ldots, n\} \end{aligned}$$
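As a small sanity check on this primal/dual pair, the following sketch (assuming `scipy` is available) solves both LPs on a toy instance and confirms that their optimal values coincide, as strong duality guarantees; the instance itself is hypothetical.

```python
# Solve the Set Cover LP and its dual on a toy instance and compare values.
import numpy as np
from scipy.optimize import linprog

sets = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]  # S_1,...,S_4 over U = {0,1,2,3}
w = np.ones(len(sets))
n = 4
A = np.array([[1.0 if i in S else 0.0 for S in sets] for i in range(n)])

# primal: min w.x  s.t.  A x >= 1, x >= 0   (encoded as -A x <= -1)
primal = linprog(c=w, A_ub=-A, b_ub=-np.ones(n), method="highs")
# dual:   max 1.y  s.t.  A^T y <= w, y >= 0 (encoded as min -1.y)
dual = linprog(c=-np.ones(n), A_ub=A.T, b_ub=w, method="highs")

print(primal.fun, -dual.fun)  # both 2.0 here (strong duality)
```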
We give several algorithms for Set Cover based on this primal/dual pair of LPs.

2.4.1 Deterministic Rounding

Set Cover via LP:
  1. Solve the LP to obtain an optimal solution $x^*$, which may be fractional
  2. Let $P = \{j \mid x_j^* > 0\}$
  3. Output $\{S_j \mid j \in P\}$
Note that the above algorithm, even when specialized to Vertex Cover, is different from the one we saw earlier: it includes all sets that have a strictly positive value in an optimum solution to the LP.
Let $x^*$ be an optimal solution to the primal LP, let $y^*$ be an optimal solution to the dual, and let $P = \{j \mid x_j^* > 0\}$. First, note that by strong duality, $\sum_j w_j x_j^* = \sum_i y_i^*$. Second, by complementary slackness, if $x_j^* > 0$ then the corresponding dual constraint is tight, that is, $\sum_{i \in S_j} y_i^* = w_j$.
Claim 2.4.1. The output of the algorithm is a feasible set cover for the given instance.
Proof. Exercise.
Claim 2.4.2. $\sum_{j \in P} w_j \leq f \sum_j w_j x_j^* = f\,\mathrm{OPT}_{LP}$.
Proof.
$$\sum_{j \in P} w_j = \sum_{j: x_j^* > 0} w_j = \sum_{j: x_j^* > 0} \left(\sum_{i \in S_j} y_i^*\right) = \sum_i y_i^* \left(\sum_{j: i \in S_j,\, x_j^* > 0} 1\right) \leq f \sum_i y_i^* = f\,\mathrm{OPT}_{LP}(I).$$
Notice that the second equality is due to the complementary slackness conditions (if $x_j^* > 0$, the corresponding dual constraint is tight), the inequality uses the definition of $f$, and the last equality follows from strong duality ($y^*$ is an optimal dual solution, and its value equals $\mathrm{OPT}_{LP}$).
Therefore the algorithm outputs a cover of weight at most $f\,\mathrm{OPT}_{LP}$. We note that $f$ can be as large as $n$, in which case the bound given by the algorithm is quite weak. In fact, it is not hard to construct examples that demonstrate the tightness of the analysis.
Remark 2.3. The analysis crucially uses the fact that $x^*$ is an optimal solution. On the other hand, the algorithm for Vertex Cover is more robust and works with any feasible solution $x$. It is easy to generalize the earlier rounding for Vertex Cover to obtain an $f$-approximation. The point of the above rounding is to illustrate the utility of complementary slackness.

2.4.2 Randomized Rounding

Now we describe a different rounding that yields an approximation bound that does not depend on $f$.
Set Cover via Randomized Rounding:
  1. $A = \emptyset$, and let $x^*$ be an optimal solution to the LP
  2. for $k = 1$ to $2 \ln n$ do
    • A. pick each $S_j$ independently with probability $x_j^*$
    • B. if $S_j$ is picked, $A = A \cup \{j\}$
  3. Output the sets with indices in $A$
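Here is a sketch of the rounding loop in Python, assuming a fractional optimum $x^*$ for the Set Cover LP has already been computed (e.g., with an LP solver); the instance representation is as in the earlier sketches.

```python
# Randomized rounding: repeat 2 ln n rounds; in each round pick S_j
# independently with probability x_j*. Returns the picked indices and
# whether they cover U (the analysis shows failure probability <= 1/n).
import math
import random

def randomized_round(U, sets, x):
    A = set()
    for _ in range(math.ceil(2 * math.log(len(U)))):
        for j, xj in enumerate(x):
            if random.random() < xj:
                A.add(j)
    covered = set().union(*(sets[j] for j in A)) if A else set()
    return A, covered == set(U)
```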
Claim 2.4.3. $\mathbf{P}[i \text{ is not covered in an iteration}] = \prod_{j: i \in S_j} (1 - x_j^*) \leq \frac{1}{e}$.
Intuition: We know that $\sum_{j: i \in S_j} x_j^* \geq 1$. Subject to this constraint, if we want to maximize the probability that element $i$ is not covered, one can see that the maximum is achieved by setting $x_j^* = 1/\ell$ for each of the sets $S_j$ that cover $i$, where $\ell$ is the number of such sets. Then the probability is $(1-\frac{1}{\ell})^{\ell}$.
Proof. We use the inequality $(1-x) \leq e^{-x}$ for all $x \in [0,1]$:
$$\mathbf{P}[i \text{ is not covered in an iteration}] = \prod_{j: i \in S_j} (1 - x_j^*) \leq \prod_{j: i \in S_j} e^{-x_j^*} = e^{-\sum_{j: i \in S_j} x_j^*} \leq \frac{1}{e}.$$
We then obtain the following corollaries:
Corollary 2.6. $\mathbf{P}[i \text{ is not covered at the end of the algorithm}] \leq e^{-2 \ln n} \leq \frac{1}{n^2}$.
Corollary 2.7. $\mathbf{P}[\text{all elements are covered when the algorithm stops}] \geq 1 - \frac{1}{n}$.
Proof. Via the union bound: the probability that a fixed $i$ is not covered is at most $1/n^2$, hence the probability that some $i$ is not covered is at most $n \cdot 1/n^2 \leq 1/n$.
Now we bound the expected cost of the algorithm. Let $C_t$ be the cost of the sets picked in iteration $t$; then $\mathbf{E}[C_t] = \sum_{j=1}^m w_j x_j^*$, where $\mathbf{E}[X]$ denotes the expectation of a random variable $X$. Let $C = \sum_{t=1}^{2 \ln n} C_t$; we have $\mathbf{E}[C] = \sum_{t=1}^{2 \ln n} \mathbf{E}[C_t] \leq 2 \ln n \, \mathrm{OPT}_{LP}$. By Markov's inequality, $\mathbf{P}[C > 2 \mathbf{E}[C]] \leq \frac{1}{2}$, hence $\mathbf{P}[C \leq 4 \ln n \, \mathrm{OPT}_{LP}] \geq \frac{1}{2}$. Therefore, $\mathbf{P}[C \leq 4 \ln n \, \mathrm{OPT}_{LP} \text{ and all elements are covered}] \geq \frac{1}{2} - \frac{1}{n}$. Thus, with probability close to $1/2$, the randomized rounding algorithm outputs a feasible solution of cost $O(\log n)\,\mathrm{OPT}_{LP}$. Note that we can check whether the solution satisfies the desired properties (feasibility and cost) and repeat the algorithm if it does not.
  1. We can check whether the solution after rounding satisfies the desired properties, namely that all elements are covered and that the cost is at most $2c \log n \, \mathrm{OPT}_{LP}$. If not, we repeat the rounding. The expected number of iterations until success is a constant.
  2. We can also use Chernoff bounds (large deviation bounds) to show that a single rounding succeeds with high probability (probability at least $1 - \frac{1}{\mathrm{poly}(n)}$).
  3. The algorithm can be derandomized. Derandomization is a technique for removing randomness, or for using as little randomness as possible. There are many derandomization techniques, such as the method of conditional expectations, discrepancy theory, and expander graphs.
  4. After a few rounds, we can select, for each uncovered element, the cheapest set that covers it. This has low expected cost. The resulting algorithm ensures feasibility but guarantees cost only in expectation. We will see a variant on the homework.
Randomized Rounding with Alteration: In the preceding analysis we had to worry both about the probability of covering all the elements and about the expected cost of the solution. Here we illustrate a simple yet powerful technique in randomized algorithms and analysis: alteration. Let $d$ be the maximum set size.
Set Cover: Randomized Rounding with Alteration:
  1. $A = \emptyset$, and let $x^*$ be an optimal solution to the LP
  2. Add to $A$ each $S_j$ independently with probability $\min\{1, \ln d \cdot x_j^*\}$
  3. Let $\mathcal{U}'$ be the elements left uncovered by the chosen sets in $A$
  4. For each uncovered element $i \in \mathcal{U}'$ do
    • A. Add to $A$ the cheapest set that covers $i$
  5. Output the sets with indices in $A$
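The following sketch implements both phases in Python, again assuming a fractional optimum $x^*$ is given together with the set weights $w$.

```python
# Randomized rounding with alteration: round with inflated probabilities
# min{1, ln d * x_j*}, then patch each uncovered element with the cheapest
# set containing it, so feasibility is guaranteed deterministically.
import math
import random

def round_with_alteration(U, sets, w, x):
    d = max(len(S) for S in sets)  # maximum set size
    A = {j for j, xj in enumerate(x)
         if random.random() < min(1.0, math.log(d) * xj)}
    covered = set()
    for j in A:
        covered |= sets[j]
    for i in set(U) - covered:     # alteration phase
        if i in covered:
            continue               # a patch set added below may cover i too
        j = min((t for t, S in enumerate(sets) if i in S), key=lambda t: w[t])
        A.add(j)
        covered |= sets[j]
    return A
```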
The algorithm has two phases: a randomized rounding phase and a fixing/altering phase. In the second phase we apply a naive algorithm that may have a high cost in the worst case, but we will bound its expected cost appropriately. The algorithm deterministically guarantees that all elements are covered, and hence we only need to focus on the expected cost of the chosen sets. Let $C_1$ be the (random) cost of the sets chosen in the first phase and let $C_2$ be the (random) cost of the sets chosen in the second phase. It is easy to see that $\mathbf{E}[C_1] \leq \ln d \sum_j w_j x_j^* = \ln d \, \mathrm{OPT}_{LP}$. Let $\mathcal{E}_i$ be the event that element $i$ is not covered after the first randomized phase.
Exercise 2.5. $\mathbf{P}[\mathcal{E}_i] \leq e^{-\ln d} \leq 1/d$.
The worst case second phase cost can be upper bounded via the next lemma.
Lemma 2.3. Let $\beta_i$ be the cost of the cheapest set covering $i$. Then $\sum_i \beta_i \leq d\,\mathrm{OPT}_{LP}$.
Proof. Consider an element $i$. We have the constraint $\sum_{j: i \in S_j} x_j^* \geq 1$. Since each set covering $i$ has cost at least $\beta_i$, we have $\sum_{j: i \in S_j} w_j x_j^* \geq \beta_i \sum_{j: i \in S_j} x_j^* \geq \beta_i$. Thus,
$$\sum_i \beta_i \leq \sum_i \sum_{j: i \in S_j} w_j x_j^* \leq \sum_j w_j x_j^* |S_j| \leq d \sum_j w_j x_j^* = d\,\mathrm{OPT}_{LP}.$$
Now we bound the expected second phase cost.
Lemma 2.4. $\mathbf{E}[C_2] \leq \mathrm{OPT}_{LP}$.
Proof. We pay for a set to cover element $i$ in the second phase only if $i$ is not covered in the first phase. Hence $C_2 \leq \sum_i \mathbf{1}_{\mathcal{E}_i} \beta_i$, where $\mathbf{1}_{\mathcal{E}_i}$ is the indicator of the event $\mathcal{E}_i$. Note that the events $\mathcal{E}_i$ for different elements $i$ are not necessarily independent; however, we can apply linearity of expectation:
$$\mathbf{E}[C_2] \leq \sum_i \mathbf{P}[\mathcal{E}_i] \, \beta_i \leq \frac{1}{d} \sum_i \beta_i \leq \mathrm{OPT}_{LP}.$$
Combining the expected costs of the two phases, we obtain the following theorem.
Theorem 2.8. Randomized rounding with alteration outputs a feasible solution of expected cost at most $(1 + \ln d)\,\mathrm{OPT}_{LP}$.
Note the simplicity of the algorithm and the tightness of the bound.
Remark 2.4. If $d = 2$, the Set Cover problem becomes the Edge Cover problem in a graph: given an edge-weighted graph $G=(V,E)$, find a minimum weight subset of edges such that each vertex is covered. Edge Cover admits a polynomial-time algorithm via a reduction to the minimum-cost matching problem in a general graph. However, Set Cover with $d = 3$ is NP-Hard via a reduction from 3-Dimensional Matching.

2.4.3 Dual-fitting

In this section, we introduce the technique of dual-fitting for the analysis of approximation algorithms. At a high-level the approach is the following:
  1. Consider an algorithm that one wants to analyze.
  2. Construct a feasible solution to the dual LP based on the structure of the algorithm.
  3. Show that the cost of the solution returned by the algorithm can be bounded in terms of the value of the dual solution.
Note that the algorithm itself need not be LP based. Here, we use Set Cover as an example. See the previous section for the primal and dual LP formulations for Set Cover.
We can interpret the dual as follows: Think of y i y i y_(i)y_{i} as how much element i i ii is willing to pay to be covered; the dual maximizes the total payment, subject to the constraint that for each set, the total payment of elements in that set is at most the cost of the set.
We rewrite the Greedy algorithm for Weighted Set Cover.
Greedy Set Cover:
  1. Covered $= \emptyset$
  2. $A = \emptyset$
  3. While Covered $\neq U$ do
    • A. $j \leftarrow \arg\min_k \frac{w_k}{|S_k \cap \textit{Uncovered}|}$ (where Uncovered $= U \setminus$ Covered)
    • B. Covered $=$ Covered $\cup\, S_j$
    • C. $A = A \cup \{j\}$
  4. end while
  5. Output the sets in $A$ as the cover
Let $H_k = 1 + 1/2 + \ldots + 1/k$ denote the $k$-th harmonic number. It is well known that $H_k \leq 1 + \ln k$.
Theorem 2.9. Greedy Set Cover picks a solution of cost at most $H_d \cdot \mathrm{OPT}_{LP}$, where $d$ is the maximum set size, i.e., $d = \max_j |S_j|$.
To prove this, we augment the algorithm to keep track of some additional information.
Augmented Greedy Algorithm for Weighted Set Cover:
  1. Covered $= \emptyset$; $A = \emptyset$
  2. While Covered $\neq U$ do
    • A. $j \leftarrow \arg\min_k \frac{w_k}{|S_k \cap \textit{Uncovered}|}$
    • B. if $i$ is uncovered and $i \in S_j$, set $p_i = \frac{w_j}{|S_j \cap \textit{Uncovered}|}$
    • C. Covered $=$ Covered $\cup\, S_j$
    • D. $A = A \cup \{j\}$
  3. Output the sets in $A$ as the cover
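Here is a sketch of the augmented greedy in Python, recording the prices $p_i$; Claim 2.4.4 below verifies that the total price equals the weight of the cover.

```python
# Augmented greedy for weighted Set Cover: each newly covered element i is
# charged the density p_i of the set that covers it first.
def augmented_greedy(U, sets, w):
    uncovered, A, p = set(U), [], {}
    while uncovered:
        # pick the set of minimum density (weight per newly covered element)
        j = min((t for t in range(len(sets)) if sets[t] & uncovered),
                key=lambda t: w[t] / len(sets[t] & uncovered))
        new = sets[j] & uncovered
        for i in new:
            p[i] = w[j] / len(new)
        uncovered -= new
        A.append(j)
    return A, p  # sum(w[j] for j in A) == sum(p.values()) up to rounding
```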
It is easy to see that the algorithm outputs a feasible cover.
Claim 2.4.4. $\sum_{j \in A} w_j = \sum_i p_i$.
Proof. Consider the iteration in which $j$ is added to $A$. Let $S_j' \subseteq S_j$ be the elements that are uncovered just before $j$ is added. For each $i \in S_j'$ the algorithm sets $p_i = w_j/|S_j'|$. Hence, $\sum_{i \in S_j'} p_i = w_j$. Moreover, it is easy to see that the sets $S_j'$, $j \in A$, are disjoint and together partition $U$. Therefore,
$$\sum_{j \in A} w_j = \sum_{j \in A} \sum_{i \in S_j'} p_i = \sum_{i \in U} p_i.$$
For each $i$, let $y_i' = \frac{1}{H_d} p_i$.
Claim 2.4.5. $y'$ is a feasible solution for the dual LP.
Assuming the claim, the cost of Greedy Set Cover's solution is $\sum_i p_i = H_d \sum_i y_i' \leq H_d\,\mathrm{OPT}_{LP}$; the last step holds because any feasible solution to the dual is a lower bound on the optimal value of the primal LP (weak duality).
Now we prove the claim. Let $S_j$ be an arbitrary set, and let $|S_j| = t \leq d$. Write $S_j = \{i_1, i_2, \ldots, i_t\}$, where the elements are ordered so that $i_1$ is covered by Greedy no later than $i_2$, $i_2$ no later than $i_3$, and so on.
Claim 2.4.6. For $1 \leq h \leq t$, $p_{i_h} \leq \frac{w_j}{t-h+1}$.
Proof. Let $S_{j'}$ be the set that covers $i_h$ in Greedy. When Greedy picked $S_{j'}$, the elements $i_h, i_{h+1}, \ldots, i_t$ of $S_j$ were all uncovered, and hence Greedy could have picked $S_j$ as well. This implies that the density of $S_{j'}$ when it was picked was at most $\frac{w_j}{t-h+1}$. Therefore $p_{i_h}$, which is set to the density of $S_{j'}$, is at most $\frac{w_j}{t-h+1}$.
From the above claim, we have
$$\sum_{1 \leq h \leq t} p_{i_h} \leq \sum_{1 \leq h \leq t} \frac{w_j}{t-h+1} = w_j H_t \leq w_j H_d.$$
Thus, for every set $S_j$ we have $\sum_{i \in S_j} y_i' \leq w_j$; that is, setting $y_i'$ to be $p_i$ scaled down by a factor of $H_d$ gives a feasible dual solution.

2.4.4 Greedy for implicit instances of Set Cover

Set Cover and the Greedy heuristic are quite useful in applications because many instances are implicit; nevertheless, the algorithm and the analysis apply. That is, the universe $\mathcal{U}$ of elements and the collection $\mathcal{S}$ of subsets of $\mathcal{U}$ need not be finite or explicitly enumerated in the Set Cover problem. For instance, a problem could require covering a finite set of points in the plane using disks of unit radius. There is an infinite set of such disks, but the Greedy approximation algorithm can still be applied. For such implicit instances, the Greedy algorithm can be used if we have access to an oracle that, in each iteration, selects a set of optimal density (weight per newly covered element). However, an oracle may not always be able to select an optimal set; in some cases it may have to make the selection approximately. We call an oracle an $\alpha$-approximate oracle, for some $\alpha \geq 1$, if in each iteration it selects a set $S$ such that $\frac{w(S)}{|S \cap \textit{Uncovered}|} \leq \alpha \min_{A \in \mathcal{S}} \frac{w(A)}{|A \cap \textit{Uncovered}|}$.
Exercise 2.6. Prove that the approximation guarantee of Greedy with an $\alpha$-approximate oracle would be $\alpha(\ln n + 1)$ for Set Cover, and $(1 - \frac{1}{e^{1/\alpha}})$ for Maximum Coverage.
We will see several examples of implicit use of the greedy analysis in the course.

2.5 Submodularity

Set Cover turns out to be a special case of a more general problem called Submodular Set Cover. The Greedy algorithm and its analysis apply in this more general setting. Submodularity is a fundamental notion with many applications in combinatorial optimization and elsewhere. Here we take the opportunity to provide some definitions and a few results.
Definition 2.10. Given a finite set $E$, a real-valued set function $f: 2^E \rightarrow \mathbb{R}$ is submodular iff
$$f(A) + f(B) \geq f(A \cup B) + f(A \cap B) \quad \forall A, B \subseteq E.$$
Alternatively, $f$ is a submodular function iff
$$f(A \cup \{i\}) - f(A) \geq f(B \cup \{i\}) - f(B) \quad \forall A \subset B,\ i \in E \setminus B.$$
The second characterization shows that submodularity captures the decreasing marginal utility property in the discrete setting: adding element $i$ to a set $A$ helps at least as much as adding it to a (larger) set $B \supset A$. It is common to use $A + i$ to denote $A \cup \{i\}$ and $A - i$ to denote $A \setminus \{i\}$.
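On small ground sets one can verify the first definition by brute force; the following sketch does exactly that for an arbitrary set function given as a Python callable on frozensets (the encoding is an assumption for illustration).

```python
# Brute-force submodularity check: verify f(A)+f(B) >= f(A|B)+f(A&B)
# over all pairs of subsets of a small ground set E.
from itertools import chain, combinations

def is_submodular(f, E):
    subsets = [frozenset(c) for c in
               chain.from_iterable(combinations(E, r) for r in range(len(E) + 1))]
    return all(f(A) + f(B) >= f(A | B) + f(A & B)
               for A in subsets for B in subsets)

# example: a coverage function f(A) = |union of S_i for i in A|
S = [{"a", "b"}, {"b", "c"}, {"c"}]
f = lambda A: len(set().union(*(S[i] for i in A))) if A else 0
print(is_submodular(f, range(3)))  # True
```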
Exercise 2.7. Prove that the two characterizations of submodular functions are equivalent.
Many applications of submodular functions involve non-negative $f$, though there are several important applications where $f$ can be negative. A submodular function $f(\cdot)$ is monotone if $f(A+i) \geq f(A)$ for all $i \in E$ and $A \subseteq E$. Typically one assumes that $f$ is normalized, by which we mean that $f(\emptyset) = 0$; this can always be arranged by shifting the function by $f(\emptyset)$. $f$ is symmetric if $f(A) = f(E \setminus A)$ for all $A \subseteq E$. Submodular set functions arise in a large number of fields including combinatorial optimization, probability, and geometry. Examples include the rank function of a matroid, the sizes of cutsets in a directed or undirected graph, the probability that a subset of events do not occur simultaneously, the entropy of random variables, etc. In the following exercises we see that the Set Cover and Maximum Coverage problems can be easily formulated in terms of submodular set functions.
Exercise 2.8. Let $\mathcal{U}$ be a set and let $\mathcal{S}=\{S_{1}, S_{2}, \ldots, S_{m}\}$ be a finite collection of subsets of $\mathcal{U}$. Let $N=\{1,2, \ldots, m\}$, and define $f: 2^{N} \rightarrow \mathbb{R}$ by $f(A)=\left|\bigcup_{i \in A} S_{i}\right|$ for $A \subseteq N$. Show that $f$ is a monotone non-negative submodular set function.
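As a quick sanity check (not a proof) of Exercise 2.8, one can verify the submodularity inequality by brute force on a small instance. The following sketch, with invented names and an invented set system, does exactly that.

    # Brute-force check (not a proof) that the coverage function of
    # Exercise 2.8 is submodular on a small invented instance.
    from itertools import combinations

    def coverage(A, sets):
        # f(A) = size of the union of the sets indexed by A (0 for empty A).
        return len(set().union(*(sets[i] for i in A)))

    def is_submodular(f, ground):
        # Check f(A) + f(B) >= f(A|B) + f(A&B) for all pairs of subsets.
        subsets = [frozenset(c) for r in range(len(ground) + 1)
                   for c in combinations(ground, r)]
        return all(f(A) + f(B) >= f(A | B) + f(A & B)
                   for A in subsets for B in subsets)

    sets = [{1, 2}, {2, 3}, {3, 4, 5}]
    print(is_submodular(lambda A: coverage(A, sets), range(3)))  # True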
Exercise 2.9. Let $G=(V, E)$ be a directed graph and let $f: 2^{V} \rightarrow \mathbb{R}$ where $f(S)=\left|\delta^{+}(S)\right|$ is the number of arcs leaving $S$. Prove that $f$ is submodular. Is the function monotone?

2.5.1 Submodular Set Cover

When formulated in terms of submodular set functions, the Set Cover problem is the following: given a monotone submodular function $f$ on $N=\{1,2, \ldots, m\}$ (with the value of $f$ provided by an oracle), find a smallest set $S \subseteq N$ such that $f(S)=f(N)$. Our previous greedy approximation can be applied to this formulation as follows.
Greedy Submodular $(f, N)$
  1. $S \leftarrow \emptyset$
  2. While $f(S) \neq f(N)$ do
    A. find $i \in N \setminus S$ maximizing $f(S+i)-f(S)$
    B. $S \leftarrow S \cup\{i\}$
  3. Output $S$
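The pseudocode translates almost line for line into Python, assuming the value oracle for $f$ is given as a callable on frozensets. This is a sketch only; practical implementations use lazy evaluation to avoid recomputing marginal gains.

    # A near-literal Python rendering of Greedy Submodular; a sketch,
    # not an efficient implementation.

    def greedy_submodular(f, N):
        N = frozenset(N)
        S = frozenset()
        while f(S) != f(N):                                  # line 2
            i = max(N - S, key=lambda j: f(S | {j}) - f(S))  # line A
            S = S | {i}                                      # line B
        return S

    # Example: f is the coverage function of an invented set system.
    sets = [{1, 2}, {2, 3, 4}, {4}]
    f = lambda A: len(set().union(*(sets[i] for i in A)))
    print(greedy_submodular(f, range(3)))  # frozenset({0, 1})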
The next exercise is not so easy.
Exercise 2.10.
  1. Prove that the greedy algorithm is a $(1+\ln f(N))$-approximation for Submodular Set Cover.
  2. Prove that the greedy algorithm is a $\left(1+\ln \max _{i} f(i)\right)$-approximation for Submodular Set Cover.
The above results were first obtained by Wolsey [7].

2.5.2 Submodular Maximum Coverage

By formulating the Maximum Coverage problem in terms of submodular functions, we seek to maximize $f(S)$ subject to $|S| \leq k$. We can apply the algorithm Greedy Submodular to this problem by changing the condition in line 2 to: while $|S| < k$ (the strict inequality ensures that exactly $k$ elements are picked); see the sketch below.
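In code the change is a one-line edit to the loop guard, under the same assumptions as the sketch above (a value oracle for a monotone $f$):

    # Same greedy, stopped after exactly k picks (note the strict < k);
    # with <= k as the guard the loop would pick k+1 elements.

    def greedy_max(f, N, k):
        N, S = frozenset(N), frozenset()
        while len(S) < k and S != N:
            i = max(N - S, key=lambda j: f(S | {j}) - f(S))
            S = S | {i}
        return S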
Exercise 2.11. Prove that Greedy gives a $(1-1/e)$-approximation for the Submodular Maximum Coverage problem when $f$ is monotone and non-negative. Hint: generalize the main claim that we used for Maximum Coverage.
The above and many related results were shown in the influential papers of Fisher, Nemhauser, and Wolsey [8, 9].

2.6 Covering Integer Programs (CIPs)

There are several extensions of Set Cover that are interesting and useful. Submodular Set Cover is a very general problem, while there are intermediate problems of interest such as Set Multicover; we refer the reader to the relevant chapters in the two reference books. Here we describe a general problem called Covering Integer Programs (CIPs for short). The goal is to solve the following integer program, where $A \in \mathbb{R}_{+}^{n \times m}$ is a non-negative matrix. We can assume without loss of generality that $w$ and $b$ are also non-negative.
$$\begin{aligned} \min \quad & \sum_{j=1}^{m} w_{j} x_{j} \\ \text{subject to} \quad & A x \geq b \\ & x_{j} \leq d_{j} \quad 1 \leq j \leq m \\ & x_{j} \geq 0 \quad 1 \leq j \leq m \\ & x_{j} \in \mathbb{Z} \quad 1 \leq j \leq m \end{aligned}$$
The constraints $Ax \geq b$ model the covering requirements, and $x_j \leq d_j$ models multiplicity constraints. Note that Set Cover is the special case in which $A$ is the incidence matrix of the sets and elements (the columns correspond to sets and the rows to elements), $b$ is the all-ones vector, and $d_j=1$ for all $j$. What do CIPs model? They generalize Set Cover as follows. Assume, without loss of generality, that $A$ and $b$ are integral. For the element corresponding to row $i$, the quantity $b_i$ is the requirement on how many times $i$ needs to be covered, $A_{ij}$ is the number of times set $S_j$ covers element $i$, and $d_j$ is an upper bound on the number of copies of set $S_j$ that may be picked. A toy instance and its LP relaxation are sketched below.
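As a concrete illustration, here is an invented toy CIP and its natural LP relaxation solved with SciPy's linprog. As discussed below, this naive relaxation can have a large integrality gap in general, so rounding the fractional solution is the nontrivial part.

    # A toy CIP and its LP relaxation via SciPy's linprog; the instance
    # is invented for illustration.
    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[1, 2, 0],        # rows: elements, columns: sets
                  [0, 1, 1],
                  [2, 0, 1]])
    b = np.array([3, 2, 2])         # b_i: how often element i must be covered
    w = np.array([1.0, 1.0, 2.0])   # set weights
    d = [3, 2, 2]                   # multiplicity bounds

    # linprog minimizes w.x subject to A_ub x <= b_ub, so negate Ax >= b.
    res = linprog(c=w, A_ub=-A, b_ub=-b,
                  bounds=[(0, dj) for dj in d], method="highs")
    print(res.x, res.fun)           # fractional optimum of the relaxation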
Exercise 2.12. Prove that CIPs are a special case of Submodular Set Cover.
One can apply the Greedy algorithm to the above problem, and the standard analysis shows that the approximation ratio obtained is $O(\log B)$ where $B=\sum_{i} b_{i}$ (assuming the $b_i$ are integers). Even though this is reasonable, we would prefer a strongly polynomial bound; in fact there are instances where $B$ is exponential in $n$, and the worst-case approximation ratio can be poor. The natural LP relaxation of the above integer program has a large integrality gap, in contrast to the case of Set Cover. One needs to strengthen the LP relaxation via what are known as knapsack cover inequalities. We refer the reader to the paper of Kolliopoulos and Young [10] and a recent one by Chekuri and Quanrud [11] for more on this problem.

  1. Uriel Feige. “A threshold of ln n for approximating set cover”. In: Journal of the ACM (JACM) 45.4 (1998), pp. 634–652.
  2. Dana Moshkovitz. “The Projection Games Conjecture and the NP-Hardness of ln n-Approximating Set-Cover”. In: Theory of Computing 11.1 (2015), pp. 221–235.
  3. George Karakostas. “A better approximation ratio for the vertex cover problem”. In: ACM Transactions on Algorithms (TALG) 5.4 (2009), p. 41.
  4. Irit Dinur and Samuel Safra. “On the hardness of approximating minimum vertex cover”. In: Annals of Mathematics (2005), pp. 439–485.
  5. Subhash Khot and Oded Regev. “Vertex cover might be hard to approximate to within 2 − ε”. In: Journal of Computer and System Sciences 74.3 (2008), pp. 335–349.
  6. Brenda S. Baker. “Approximation algorithms for NP-complete problems on planar graphs”. In: Journal of the ACM (JACM) 41.1 (1994), pp. 153–180.
  7. Laurence A. Wolsey. “An analysis of the greedy algorithm for the submodular set covering problem”. In: Combinatorica 2.4 (1982), pp. 385–393.
  8. Marshall L. Fisher, George L. Nemhauser, and Laurence A. Wolsey. “An analysis of approximations for maximizing submodular set functions—II”. In: Polyhedral Combinatorics (1978), pp. 73–87.
  9. George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. “An analysis of approximations for maximizing submodular set functions—I”. In: Mathematical Programming 14.1 (1978), pp. 265–294.
  10. Stavros G. Kolliopoulos and Neal E. Young. “Approximation algorithms for covering/packing integer programs”. In: Journal of Computer and System Sciences 71.4 (2005), pp. 495–505.
  11. Chandra Chekuri and Kent Quanrud. “On approximating (sparse) covering integer programs”. In: Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 2019, pp. 1596–1615.
