Approximation Algorithms: Packing Problems

These are lecture notes from Chandra Chekuri's CS583 course on Approximation Algorithms. Chapter 4: Packing Problems. Chapter 3 covers Knapsack and Chapter 5 covers Load Balancing and Bin Packing.
Chapter 4
In the previous lecture we discussed the Knapsack problem. In this lecture we discuss other packing and independent set problems. We first discuss an abstract model of packing problems. Let $N$ be a finite ground set. A collection $\mathcal{I} \subseteq 2^{N}$ of subsets of $N$ is said to be down closed if the following property holds: $A \in \mathcal{I}$ implies that $B \in \mathcal{I}$ for all $B \subseteq A$. A down closed collection is also often called an independence system. The sets in $\mathcal{I}$ are called independent sets. Given an independence family $(N, \mathcal{I})$ and a non-negative weight function $w: N \rightarrow \mathbb{R}^{+}$, the maximum weight independent set problem is to find $\max_{S \in \mathcal{I}} w(S)$; that is, find an independent set in $\mathcal{I}$ of maximum weight. Often we are interested in the setting where all weights are $1$, in which case we wish to find a maximum cardinality independent set. We discuss some canonical examples.
Example 4.1. Independent sets in graphs: Given a graph $G=(V, E)$, let $\mathcal{I}=\{S \subseteq V \mid \text{there are no edges between nodes in } S\}$. Here the ground set is $V$. There are many interesting special cases of the graph problem, for instance problems arising from geometric objects such as intervals, rectangles, disks and others.
Example 4.2. Matchings in graphs: Given a graph $G=(V, E)$, let $\mathcal{I}=\{M \subseteq E \mid M \text{ is a matching in } G\}$. Here the ground set is $E$.
Example 4.3. Matroids: A matroid $\mathcal{M}=(N, \mathcal{I})$ is an independence system that in addition satisfies the following key property: if $A, B \in \mathcal{I}$ and $|B|>|A|$ then there is an element $e \in B \setminus A$ such that $A \cup \{e\} \in \mathcal{I}$. There are many examples of matroids. We will not go into details here.
Example 4.4. Intersections of independence systems: Given $k$ independence systems $(N, \mathcal{I}_{1}), (N, \mathcal{I}_{2}), \ldots, (N, \mathcal{I}_{k})$ on the same ground set, the system $(N, \mathcal{I}_{1} \cap \mathcal{I}_{2} \cap \cdots \cap \mathcal{I}_{k})$ is also an independence system. Well-known examples include intersections of matroids.

4.1 Maximum Independent Set Problem in Graphs

A basic graph optimization problem with many applications is the maximum (weighted) independent set problem (MIS) in graphs.
Definition 4.1. Given an undirected graph $G=(V, E)$, a subset of nodes $S \subseteq V$ is an independent set (stable set) iff there is no edge in $E$ between any two nodes in $S$. A subset of nodes $S$ is a clique if every pair of nodes in $S$ has an edge between them in $G$.
The MIS problem is the following: given a graph $G=(V, E)$, find an independent set in $G$ of maximum cardinality. In the weighted case, each node $v \in V$ has an associated non-negative weight $w(v)$ and the goal is to find a maximum weight independent set. This problem is NP-Hard and it is natural to ask for approximation algorithms. Unfortunately, as the famous theorem below shows, the problem is extremely hard to approximate.
Theorem 4.2 (Håstad [1]). Unless $P=NP$ there is no $\frac{1}{n^{1-\epsilon}}$-approximation for MIS for any fixed $\epsilon>0$, where $n$ is the number of nodes in the given graph.
Remark 4.1. The maximum clique problem is to find a maximum cardinality clique in a given graph. It is approximation-equivalent to the MIS problem; simply complement the graph.
The theorem basically says the following: there is a class of graphs in which the maximum independent set size is either less than $n^{\delta}$ or greater than $n^{1-\delta}$, and it is NP-Complete to decide whether a given graph falls into the former category or the latter.
The lower bound result suggests that one should focus on special cases, and several interesting positive results are known. First, we consider a simple greedy algorithm for the unweighted problem.
Greedy($G$)
  1. $S \leftarrow \emptyset$
  2. While $G$ is not empty do
    • A. Let $v$ be a node of minimum degree in $G$
    • B. $S \leftarrow S \cup \{v\}$
    • C. Remove $v$ and its neighbors from $G$
  3. Output $S$
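The following is a minimal Python sketch of this greedy rule (illustrative, not from the notes); the graph is given as a dictionary of adjacency sets and the name `greedy_mis` is our own.

```python
def greedy_mis(adj):
    """Greedy MIS: repeatedly pick a minimum-degree vertex and delete its closed neighborhood.

    adj: dict mapping each vertex to the set of its neighbors (undirected graph).
    Returns an independent set as a Python set.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    S = set()
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))      # minimum-degree vertex in the current graph
        S.add(v)
        removed = {v} | adj[v]                       # v together with its neighbors
        for u in removed:
            adj.pop(u, None)
        for u in adj:
            adj[u] -= removed
    return S

# Example: on the 5-cycle the algorithm returns an independent set of size 2.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(greedy_mis(cycle5))
```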
Theorem 4.3. Greedy outputs an independent set $S$ such that $|S| \geq n/(\Delta+1)$ where $\Delta$ is the maximum degree of any node in the graph. Moreover $|S| \geq \alpha(G)/\Delta$ where $\alpha(G)$ is the cardinality of a largest independent set. Thus Greedy is a $1/\Delta$-approximation.
Proof. We upper bound the number of nodes in $V \setminus S$ as follows. A node $u$ is in $V \setminus S$ because it was removed as a neighbor of some node $v \in S$ when Greedy added $v$ to $S$. Charge $u$ to $v$. A node $v \in S$ can be charged at most $\Delta$ times since it has at most $\Delta$ neighbors. Hence we have that $|V \setminus S| \leq \Delta|S|$. Since every node is either in $S$ or $V \setminus S$ we have $|S|+|V \setminus S|=n$, and therefore $(\Delta+1)|S| \geq n$, which implies that $|S| \geq n/(\Delta+1)$.
We now argue that $|S| \geq \alpha(G)/\Delta$. Let $S^{*}$ be a largest independent set in $G$. As in the above proof we can charge each node $v \in S^{*} \setminus S$ to a node $u \in S \setminus S^{*}$ which is a neighbor of $v$. The number of nodes charged to a node $u \in S \setminus S^{*}$ is at most $\Delta$. Thus $|S^{*} \setminus S| \leq \Delta|S \setminus S^{*}|$, and hence $|S^{*}| = |S^{*} \cap S| + |S^{*} \setminus S| \leq |S^{*} \cap S| + \Delta|S \setminus S^{*}| \leq \Delta|S|$ since $\Delta \geq 1$.
Exercise 4.1. Show that Greedy outputs an independent set of size at least $\frac{n}{2(d+1)}$ where $d$ is the average degree of $G$.
Remark 4.2. The well-known Turán's theorem shows via a clever argument that there is always an independent set of size $\frac{n}{d+1}$ where $d$ is the average degree of $G$.
Remark 4.3. For the case of unweighted graphs one can obtain an approximation ratio of $\Omega\left(\frac{\log d}{d \log \log d}\right)$ where $d$ is the average degree. Surprisingly, under a complexity-theoretic conjecture called the Unique Games conjecture, it is known to be NP-Hard to approximate MIS to within a factor of $O\left(\frac{\log^{2} \Delta}{\Delta}\right)$ in graphs with maximum degree $\Delta$ when $\Delta$ is sufficiently large.
Exercise 4.2. Consider the weighted MIS problem on graphs of maximum degree $\Delta$. Alter Greedy to consider the nodes in non-increasing order of weight and show that this gives a $\frac{1}{\Delta}$-approximation. Can one obtain an $\Omega(1/d)$-approximation for the weighted case where $d$ is the average degree?
LP Relaxation: One can formulate a simple linear-programming relaxation for the (weighted) MIS problem where we have a variable $x(v)$ for each node $v \in V$ indicating whether $v$ is chosen in the independent set or not. We have constraints which state that for each edge $(u, v)$ at most one of $u$ and $v$ can be chosen.
$$\begin{aligned}
\text{maximize} \quad & \sum_{v \in V} w(v) x(v) \\
\text{subject to} \quad & x(u)+x(v) \leq 1 \qquad (u, v) \in E \\
& x(v) \in [0,1] \qquad v \in V
\end{aligned}$$
Although the above is a valid formulation of MIS when the variables are constrained to be in $\{0,1\}$, its LP relaxation is not particularly useful for the following simple reason.
Claim 4.1.1. For any graph the optimum value of the above LP relaxation is at least $w(V)/2$. In particular, for the unweighted case it is at least $n/2$.
Simply set each $x(v)$ to $1/2$!
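As a quick illustration (not from the notes), the sketch below builds this edge LP for the 5-cycle with `scipy.optimize.linprog`; the LP optimum is $2.5 = n/2$ even though the largest independent set in the 5-cycle has size 2, which shows how weak the relaxation is.

```python
import numpy as np
from scipy.optimize import linprog

# Unweighted MIS LP on the 5-cycle: maximize sum_v x(v) s.t. x(u) + x(v) <= 1 for every edge.
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]

c = -np.ones(n)                      # linprog minimizes, so negate the objective
A_ub = np.zeros((len(edges), n))
for row, (u, v) in enumerate(edges):
    A_ub[row, u] = A_ub[row, v] = 1.0
b_ub = np.ones(len(edges))

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n)
print(-res.fun)                      # 2.5, i.e., n/2, while alpha(C_5) = 2
```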
One can obtain the strengthened formulation below by observing that if $S$ is a clique in $G$ then any independent set can pick at most one node from $S$.
$$\begin{aligned}
\text{maximize} \quad & \sum_{v \in V} w(v) x(v) \\
\text{subject to} \quad & \sum_{v \in S} x(v) \leq 1 \qquad S \text{ is a clique in } G \\
& x(v) \in [0,1] \qquad v \in V
\end{aligned}$$
The above linear program has an exponential number of constraints, and it cannot be solved in polynomial time in general, but for some special cases of interest the above linear program can indeed be solved (or approximately solved) in polynomial time and leads to either exact algorithms or good approximation bounds.
Approximability of Vertex Cover and MIS: The following is a basic fact and is easy to prove.
Fact 4.1. In any graph $G=(V, E)$, $S$ is a vertex cover in $G$ if and only if $V \setminus S$ is an independent set in $G$. Thus $\alpha(G)+\beta(G)=|V|$ where $\alpha(G)$ is the size of a maximum independent set in $G$ and $\beta(G)$ is the size of a minimum vertex cover in $G$.
The above shows that if one of Vertex Cover or MIS is NP-Hard then so is the other. We have seen that Vertex Cover admits a $2$-approximation while MIS admits no constant factor approximation. It is useful to see why a $2$-approximation for Vertex Cover does not give any useful information for MIS even though $\alpha(G)+\beta(G)=|V|$. Suppose the optimal vertex cover $S^{*}$ has size at least $|V|/2$. Then a $2$-approximation algorithm is only guaranteed to output a vertex cover of size $|V|$! Hence one does not obtain a non-trivial independent set by complementing the approximate vertex cover.
Some special cases of MIS: We mention some special cases of MIS that have been considered in the literature; this is by no means an exhaustive list.
  • Interval graphs: these are intersection graphs of intervals on a line. An exact algorithm can be obtained via dynamic programming, and one can solve more general versions via linear programming methods.
  • Note that a maximum (weight) matching in a graph $G$ can be viewed as a maximum (weight) independent set in the line graph of $G$ and can be found exactly in polynomial time. This has been extended to what are known as claw-free graphs.
  • Planar graphs, and generalizations to bounded-genus graphs and graphs that exclude a fixed minor. For such graphs one can obtain a PTAS due to ideas originally from Brenda Baker.
  • Geometric intersection graphs. For example, given $n$ disks in the plane, find a maximum number of disks that do not overlap. One could consider other (convex) shapes such as axis-parallel rectangles, line segments, pseudo-disks, etc. A number of results are known. For example, a PTAS is known for disks in the plane. There is an $\Omega\left(\frac{1}{\log n}\right)$-approximation for axis-parallel rectangles in the plane when the rectangles are weighted, and an $\Omega\left(\frac{1}{\log \log n}\right)$-approximation for the unweighted case. For the unweighted case, Mitchell very recently obtained a constant factor approximation!

4.1.1 Elimination Orders and MIS

We have seen that the simple Greedy algorithm gives a $1/\Delta$-approximation for MIS in graphs with maximum degree $\Delta$. One can also get a $1/\Delta$-approximation for the larger class of $\Delta$-degenerate graphs. To motivate degenerate graphs consider the class of planar graphs. The maximum degree of a planar graph need not be small. Nevertheless, via Euler's formula, we know that every planar graph has a vertex of degree at most $5$, since the number of edges in a planar graph is at most $3n-6$. Moreover, every subgraph of a planar graph is planar, and hence the Greedy algorithm will find a vertex of degree at most $5$ in each iteration. From this one can show that Greedy gives a $1/5$-approximation for MIS in planar graphs. Now consider the intersection graph of a collection of intervals on the real line. That is, we are given $n$ intervals $I_{1}, I_{2}, \ldots, I_{n}$ where each $I_{i}=[a_{i}, b_{i}]$ for real numbers $a_{i} \leq b_{i}$. The goal is to find a maximum number of intervals in the given set which do not overlap. This is the same as finding a MIS in the intersection graph of the intervals: the graph is obtained by creating a vertex $v_{i}$ for each $I_{i}$ and adding an edge $v_{i}v_{j}$ whenever $I_{i}$ and $I_{j}$ overlap. It is well-known that greedily picking intervals in earliest finish time order (ordering them according to the $b_{i}$ values) is optimal; the reader should try to prove this. Can one understand the analysis of all these examples in a unified fashion? Yes. For this purpose we consider the class of inductive $k$-independent graphs considered by Akcoglu et al. [2] and later again by Ye and Borodin [3].
For a vertex $v$ in a graph we use $N(v)$ to denote the set of neighbors of $v$ (not including $v$ itself). For a graph $G=(V, E)$ and $S \subseteq V$ we use $G[S]$ to denote the subgraph of $G$ induced by $S$.
Definition 4.4. An undirected graph $G=(V, E)$ is inductive $k$-independent if there is an ordering $v_{1}, v_{2}, \ldots, v_{n}$ of the vertices such that for $1 \leq i \leq n$, $\alpha\left(G\left[N\left(v_{i}\right) \cap \{v_{i+1}, \ldots, v_{n}\}\right]\right) \leq k$.
Graphs which are inductively $1$-independent have a perfect elimination ordering and are called chordal graphs because they have an alternate characterization: a graph is chordal iff each cycle $C$ of length at least $4$ in $G$ has a chord (an edge connecting two nodes of $C$ which is not an edge of $C$), or in other words there is no induced cycle of length more than $3$.
Exercise 4.3. Prove that the intersection graph of intervals is chordal.
Exercise 4.4. Prove that if $\Delta(G) \leq k$ then $G$ is inductively $k$-independent. Prove that if $G$ is $k$-degenerate then $G$ is inductively $k$-independent.
The preceding shows that planar graphs are inductively $5$-independent. In fact, one can show something stronger: they are inductively $3$-independent. Given a graph $G$ one can ask whether there is an algorithm that checks whether $G$ is inductively $k$-independent. There is such an algorithm that runs in time $O(k^{2} n^{k+2})$ [3]. A classical result shows how to recognize chordal graphs ($k=1$) in linear time. However, most of the useful applications arise by showing that a certain class of graphs is inductively $k$-independent for some small value of $k$. See [3] for several examples.
Exercise 4.5. Prove that the Greedy algorithm that considers the vertices in the inductive $k$-independent order gives a $\frac{1}{k}$-approximation for MIS.
Interestingly, one can obtain a $\frac{1}{k}$-approximation for the maximum weight independent set problem in inductively $k$-independent graphs. The algorithm is simple and runs in linear time but is not obvious. To see this, consider the weighted problem for intervals. The standard algorithm to solve this is via dynamic programming. However, one can obtain an optimum solution for all chordal graphs (given the ordering). We refer the reader to [3] for the algorithm and proof (originally from [2]). Showing an $\Omega(1/k)$-approximation is easier.
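For concreteness, here is a standard dynamic-programming sketch for maximum weight interval scheduling, the classical special case mentioned above (illustrative code, not taken from the cited papers; intervals are closed, so touching endpoints count as overlapping).

```python
import bisect

def max_weight_intervals(intervals):
    """intervals: list of (start, end, weight) with start <= end.
    Returns the maximum total weight of a set of pairwise non-overlapping intervals."""
    intervals = sorted(intervals, key=lambda t: t[1])  # sort by right endpoint
    ends = [t[1] for t in intervals]
    n = len(intervals)
    dp = [0] * (n + 1)               # dp[i] = best value using the first i intervals
    for i in range(1, n + 1):
        s, e, w = intervals[i - 1]
        j = bisect.bisect_left(ends, s, 0, i - 1)   # intervals ending strictly before s
        dp[i] = max(dp[i - 1], dp[j] + w)           # skip interval i, or take it
    return dp[n]

print(max_weight_intervals([(0, 3, 4), (2, 5, 4), (4, 7, 4), (1, 8, 7)]))  # 8
```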

4.2 The efficacy of the Greedy algorithm for a class of Independence Families

The Greedy algorithm can be defined easily for an arbitrary independence system. It iteratively adds the best element to the current independent set while maintaining feasibility. Note that the implementation of the algorithm requires an oracle to find the best element to add to the current independent set $S$.
Greedy($N, \mathcal{I}$)
  1. $S \leftarrow \emptyset$
  2. While (TRUE)
    • A. $A \leftarrow \{e \in N \setminus S \mid S + e \in \mathcal{I}\}$
    • B. If $A = \emptyset$ break
    • C. $e \leftarrow \operatorname{argmax}_{e \in A} w(e)$
    • D. $S \leftarrow S \cup \{e\}$
  3. Output $S$
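The following Python sketch shows the oracle-based greedy in code (illustrative; `is_independent` stands in for the membership oracle and the matching example is our own).

```python
def greedy_independence_system(N, w, is_independent):
    """Generic greedy for an independence system (N, I).

    N: list of ground-set elements, w: dict element -> non-negative weight,
    is_independent: oracle taking a frozenset of elements and returning True/False.
    """
    S = set()
    while True:
        # Elements that keep S independent when added.
        candidates = [e for e in N if e not in S and is_independent(frozenset(S | {e}))]
        if not candidates:
            break
        S.add(max(candidates, key=lambda e: w[e]))
    return S

# Example oracle: matchings in the path a-b-c-d, i.e., edge sets with no shared endpoint.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
w = {("a", "b"): 1, ("b", "c"): 5, ("c", "d"): 1}
def is_matching(F):
    endpoints = [v for e in F for v in e]
    return len(endpoints) == len(set(endpoints))

print(greedy_independence_system(edges, w, is_matching))  # {('b', 'c')}
```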
Exercise 4.6. Prove that the Greedy algorithm gives a $1/2$-approximation for the maximum weight matching problem in a general graph. Also prove that this bound is tight even in bipartite graphs. Note that maximum weight matching can be solved exactly in polynomial time.
Remark 4.4. It is well-known that the Greedy algorithm gives an optimum solution when $(N, \mathcal{I})$ is a matroid. Kruskal's algorithm for the min/max weight spanning tree is a special case of this fact.
It is easy to see that Greedy does poorly for the MIS problem in general graphs. A natural question is what properties of $\mathcal{I}$ enable a reasonable performance guarantee for Greedy. A very general result in this context was established by Jenkyns, generalizing several previous results. In order to state the result we set up some notation. Given an independence system $(N, \mathcal{I})$ we say that a set $A \in \mathcal{I}$ is a base if it is a maximal independent set. It is well-known that in a matroid $\mathcal{M}$ all bases have the same cardinality. However, this is not true in a general independence system.
Definition 4.5. An independence system $(N, \mathcal{I})$ is a $k$-system if for any two bases $A, B \in \mathcal{I}$, $|A| \leq k|B|$. That is, the ratio of the cardinality of a maximum base to the cardinality of a minimum base is at most $k$.
The following theorem is not too difficult but not so obvious either.
Theorem 4.6. Greedy gives a $1/k$-approximation for the maximum weight independent set problem in a $k$-system.
The above theorem generalizes and unifies several examples that we have seen so far, including MIS in bounded degree graphs, matchings, matroids, etc. How does one see that a given independence system is indeed a $k$-system for some parameter $k$? For instance, matchings in graphs form a $2$-system. The following simple lemma gives an easy way to argue that a given system is a $k$-system.
Lemma 4.1. Suppose $(N, \mathcal{I})$ is an independence system with the following property: for any $A \in \mathcal{I}$ and $e \in N \setminus A$ there is a set $Y \subseteq A$ such that $|Y| \leq k$ and $(A \setminus Y) \cup \{e\} \in \mathcal{I}$. Then $(N, \mathcal{I})$ is a $k$-system.
We leave the proof of the above as an exercise.
We refer the reader to [4], [5] for analysis of Greedy in $k$-systems and other special cases.

4.3 Randomized Rounding with Alteration for Packing Problems

The purpose of this section is to highlight a technique for rounding LP relaxations for packing problems. We will consider a simple example, namely the maximum weight independent set problem in interval graphs. Recall that we are given $n$ intervals $I_{1}, I_{2}, \ldots, I_{n}$ with non-negative weights $w_{1}, \ldots, w_{n}$, and the goal is to find a maximum weight subset of them which do not overlap. Let $I_{i}=[a_{i}, b_{i}]$ and let $p_{1}, p_{2}, \ldots, p_{m}$ be the collection of end points of the intervals. We can write a simple LP relaxation for this problem. For each interval $i$ we have a variable $x_{i} \in [0,1]$ to indicate whether $I_{i}$ is chosen or not. For each point $p_{j}$, among all intervals that contain it, at most one can be chosen. These are clique constraints in the underlying interval graph.
$$\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} w_{i} x_{i} \\
\text{subject to} \quad & \sum_{i: p_{j} \in I_{i}} x_{i} \leq 1 \qquad 1 \leq j \leq m \\
& x_{i} \in [0,1] \qquad 1 \leq i \leq n
\end{aligned}$$
Note that it is important to retain the constraint that $x_{i} \leq 1$. Interestingly, it is known that the LP relaxation defines an integral polytope and hence one can solve the integer program by solving the LP relaxation! This is because the incidence matrix defining the LP is totally unimodular (TUM). We refer the reader to books on combinatorial optimization for further background on this topic. Here we assume that we do not know the integrality properties of the LP. We will round it via a technique that is powerful and generalizes to NP-Hard variants of the interval scheduling problem, among many others.
Suppose we solve the LP and obtain an optimum fractional solution $x^{*}$. We have $\sum_{i} w_{i} x_{i}^{*} \geq \mathrm{OPT}$. How do we round to obtain an integer solution whose value is close to that of $\mathrm{OPT}$? Suppose we randomly choose $I_{i}$ with probability $c x_{i}^{*}$ for some $c \leq 1$. Let $R$ be the random set of chosen intervals. Then the expected weight of $R$, by linearity of expectation, is $c \sum_{i} w_{i} x_{i}^{*} \geq c \cdot \mathrm{OPT}$. However, it is highly likely that the random solution $R$ is not going to be feasible: some constraint will be violated. The question is how we can fix or alter $R$ to find a subset $R^{\prime} \subseteq R$ such that $R^{\prime}$ is a feasible solution and the expected value of $R^{\prime}$ is not too much smaller than that of $R$. This depends on the independence structure.
Here we illustrate this via the interval problem. Without loss of generality we assume that $I_{1}, \ldots, I_{n}$ are sorted by their right end points. In other words, the order is a perfect elimination order for the underlying interval graph.
Rounding-With-Alteration
  1. Let $x$ be an optimum fractional solution
  2. Round each $i$ to $1$ independently with probability $x_{i}/2$. Let $x^{\prime}$ be the rounded solution.
  3. $R \leftarrow \{i \mid x_{i}^{\prime}=1\}$
  4. $S \leftarrow \emptyset$
  5. For $i=n$ down to $1$ do
    • A. If ($i \in R$) and ($S \cup \{i\}$ is feasible) then $S \leftarrow S \cup \{i\}$
  6. Output feasible solution $S$
The algorithm consists of two phases. The first phase is a simple selection phase via independent randomized rounding. The second phase is deterministic and is a greedy pruning step in the reverse elimination order. To analyze the expected value of $S$ we consider two binary random variables for each $i$: $Y_{i}$, which is $1$ if $i \in R$ and $0$ otherwise, and $Z_{i}$, which is $1$ if $i \in S$ and $0$ otherwise.
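A compact Python sketch of this two-phase rounding is below (illustrative; the fractional solution `x` is assumed to come from an LP solver, and function names are our own).

```python
import random

def round_with_alteration(intervals, x, seed=None):
    """intervals: list of closed intervals (a_i, b_i); x: fractional LP solution in [0,1]^n.
    Returns the indices of a feasible (pairwise non-overlapping) subset of intervals."""
    rng = random.Random(seed)
    n = len(intervals)
    # Phase 1: independent randomized rounding with probabilities x_i / 2.
    R = {i for i in range(n) if rng.random() < x[i] / 2}
    # Phase 2: greedy pruning in reverse order of right endpoints.
    order = sorted(range(n), key=lambda i: intervals[i][1], reverse=True)
    def overlaps(i, j):
        (a1, b1), (a2, b2) = intervals[i], intervals[j]
        return max(a1, a2) <= min(b1, b2)
    S = []
    for i in order:
        if i in R and all(not overlaps(i, j) for j in S):
            S.append(i)
    return S
```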
By linearity of expectation,
Claim 4.3.1. $\mathbb{E}[w(S)]=\sum_{i} w_{i} \mathbb{E}[Z_{i}]=\sum_{i} w_{i} \mathbf{P}[Z_{i}=1]$.
The following is immediate from the independent randomized rounding in the algorithm.
Claim 4.3.2. $\mathbf{P}[Y_{i}=1]=x_{i}/2$.
How do we analyze $\mathbf{P}[Z_{i}=1]$? The random variables $Z_{1}, \ldots, Z_{n}$ are not independent and could be highly correlated even though $Y_{1}, \ldots, Y_{n}$ are independent. For this purpose we try to understand $\mathbf{P}[Z_{i}=0 \mid Y_{i}=1]$, which is the conditional probability that an interval $I_{i}$ chosen in the first step is rejected in the pruning phase. We often cannot compute this quantity exactly, but we can upper bound it as follows. Here the ordering plays a crucial role. Why would $I_{i}$ be rejected in the pruning phase? Note that when $I_{i}$ is considered in the pruning phase, the only intervals that have already been considered are those whose right end points come after the right end point of $I_{i}$. Let $A_{i}=\{j \mid j>i \text{ and } I_{j} \text{ intersects } I_{i} \text{ at } b_{i}\}$ be the set of intervals that can cause $i$ to be rejected. Recall that the LP has the following constraint at the point $b_{i}$:
$$x_{i}+\sum_{j \in A_{i}} x_{j} \leq 1.$$
Let $\mathcal{E}_{1}$ be the event that $I_{i}$ is rejected in the pruning phase. Let $\mathcal{E}_{2}$ be the event that at least one of the intervals in $A_{i}$ is selected in the first phase. Note that $\mathcal{E}_{1}$ can happen only if $\mathcal{E}_{2}$ happens. Thus $\mathbf{P}[\mathcal{E}_{1}] \leq \mathbf{P}[\mathcal{E}_{2}]$. In general we try to upper bound $\mathbf{P}[\mathcal{E}_{2}]$. In this simple case we have an exact formula for it:
$$\mathbf{P}[\mathcal{E}_{2}]=1-\prod_{j \in A_{i}} \mathbf{P}[Y_{j}=0]=1-\prod_{j \in A_{i}}\left(1-x_{j}/2\right).$$
We claim that $\mathbf{P}[\mathcal{E}_{2}] \leq \sum_{j \in A_{i}} x_{j}/2 \leq 1/2$; note that the LP constraint at $b_{i}$ gives $\sum_{j \in A_{i}} x_{j} \leq 1$. One can derive the claim by showing that $\prod_{j \in A_{i}}(1-x_{j}/2) \geq 1-\sum_{j \in A_{i}} x_{j}/2 \geq 1/2$. Another way is via Markov's inequality: let $T=\sum_{j \in A_{i}} Y_{j}$ be the number of intervals from $A_{i}$ selected in the first phase. Then $\mathbb{E}[T] \leq \sum_{j \in A_{i}} x_{j}/2 \leq 1/2$, and $\mathcal{E}_{2}$ is the event that $T \geq 1$, so by Markov's inequality $\mathbf{P}[\mathcal{E}_{2}]=\mathbf{P}[T \geq 1] \leq \mathbb{E}[T] \leq 1/2$.
Using the claim,
$$\mathbf{P}[Z_{i}=1 \mid Y_{i}=1]=1-\mathbf{P}[Z_{i}=0 \mid Y_{i}=1] \geq 1/2.$$
This allows us to lower bound the expected weight of the solution output by the algorithm, and yields a randomized $1/4$-approximation.
Claim 4.3.3. $\mathbb{E}[w(S)] \geq \sum_{i} w_{i} x_{i}/4$.
Proof. We have
$$\mathbb{E}[w(S)]=\sum_{i} w_{i} \mathbf{P}[Z_{i}=1]=\sum_{i} w_{i} \mathbf{P}[Y_{i}=1]\, \mathbf{P}[Z_{i}=1 \mid Y_{i}=1] \geq \sum_{i} w_{i}\left(\frac{x_{i}}{2} \cdot \frac{1}{2}\right)=\sum_{i} w_{i} x_{i}/4.$$
This type of rounding has applications in a variety of settings; see [CVZ] for applications and the general framework called contention resolution schemes.

4.4 Packing Integer Programs (PIPs)

We can express the Knapsack problem as the following integer program, where we have scaled the knapsack capacity to $1$ without loss of generality.
$$\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} p_{i} x_{i} \\
\text{subject to} \quad & \sum_{i} s_{i} x_{i} \leq 1 \\
& x_{i} \in \{0,1\} \qquad 1 \leq i \leq n
\end{aligned}$$
More generally, if we have multiple linear constraints on the "items" we obtain the following class of integer programs.
Definition 4.7. A packing integer program (PIP) is an integer program of the form $\max\{wx \mid Ax \leq 1,\ x \in\{0,1\}^{n}\}$ where $w$ is a $1 \times n$ non-negative vector and $A$ is an $m \times n$ matrix with entries in $[0,1]$. We call it a $\{0,1\}$-PIP if all entries of $A$ are in $\{0,1\}$.
In some cases it is useful/natural to define the problem as $\max\{wx \mid Ax \leq b,\ x \in\{0,1\}^{n}\}$ where the entries of $A$ and $b$ are required to be rational (or integer) valued. We can convert it into the above form by dividing each row of $A$ by $b_{i}$.
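The conversion is just a row-wise rescaling; a minimal numpy sketch (with made-up numbers, assuming nonnegative $A$ and strictly positive $b$):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 3.0]])
b = np.array([4.0, 6.0])

# Divide each row i of A by b_i so the right-hand side becomes the all-ones vector.
A_normalized = A / b[:, None]
print(A_normalized)   # entries are in [0, 1] whenever A_ij <= b_i
```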
When $m$, the number of rows of $A$ (equivalently, the number of constraints), is small the problem is tractable. It is sometimes called the $m$-dimensional knapsack problem and one can obtain a PTAS for any fixed constant $m$. However, when $m$ is large, we observe that MIS can be cast as a special case of $\{0,1\}$-PIP; it corresponds exactly to the simple integer/linear program that we saw in the previous section. Therefore the problem is at least as hard to approximate as MIS. Here we show via a clever LP-rounding idea that one can generalize the notion of bounded degree to column sparsity in PIPs and obtain a related approximation. We will then introduce the notion of the width of the constraints and show how it allows for improved bounds.
Definition 4.8. A PIP is $k$-column-sparse if the number of non-zero entries in each column of $A$ is at most $k$. A PIP has width $W$ if $\max_{i, j} A_{ij}/b_{i} \leq 1/W$.

4.4.1 Randomized Rounding with Alteration for PIPs

We saw that randomized rounding gives an $O(\log n)$-approximation algorithm for the Set Cover problem, which is a canonical covering problem. Here we will consider the use of randomized rounding for packing problems. Let $x$ be an optimum fractional solution to the natural LP relaxation of a PIP in which the constraint $x \in\{0,1\}^{n}$ is replaced by $x \in[0,1]^{n}$. Suppose we apply independent randomized rounding where we set $x_{i}^{\prime}$ to $1$ with probability $x_{i}$, and let $x^{\prime}$ be the resulting integer solution. The expected weight of this solution is exactly $\sum_{i} w_{i} x_{i}$, which is the LP solution value. However, $x^{\prime}$ may not satisfy the constraints given by $Ax \leq b$. A natural strategy to try to satisfy the constraints is to set $x_{i}^{\prime}$ to $1$ with probability $c x_{i}$ where $c<1$ is some scaling constant. This may help in satisfying the constraints because the scaling creates some room in them; we now have that the expected solution value is $c \sum_{i} w_{i} x_{i}$, a loss of a factor of $c$. Scaling by itself does not allow us to claim that all constraints are satisfied with good probability. A very useful technique in this context is alteration: we judiciously fix/alter the rounded solution $x^{\prime}$ to force it to satisfy the constraints by setting some of the variables that are $1$ in $x^{\prime}$ to $0$. The trick is to do this in such a way as to retain a handle on the final probability that a variable is set to $1$. We will illustrate this for the Knapsack problem and then generalize the idea to $k$-sparse PIPs. The algorithms we present are from [6]. See [CVZ] for further applications and related problems.
Rounding for Knapsack: Consider the Knapsack problem. It is convenient to think of it in the context of PIPs: we have a single constraint $ax \leq 1$ where $a_{i}$ now represents the size of item $i$, the knapsack capacity is $1$, and $w_{i}$ is the weight of item $i$. Suppose $x$ is a fractional solution. Call an item $i$ "big" if $a_{i}>1/2$, and otherwise call it "small". Let $S$ be the set of indices of the small items and $B$ the set of indices of the big items. Consider the following rounding algorithm.
Rounding-With-Alteration for Knapsack
  1. Let $x$ be an optimum fractional solution
  2. Round each $i$ to $1$ independently with probability $x_{i}/4$. Let $x^{\prime}$ be the rounded solution.
  3. $x^{\prime\prime} = x^{\prime}$
  4. If ($x_{i}^{\prime}=1$ for exactly one big item $i$)
    • A. For each $j \neq i$ set $x_{j}^{\prime\prime}=0$
  5. Else if ($\sum_{i \in S} a_{i} x_{i}^{\prime}>1$ or two or more big items are chosen in $x^{\prime}$)
    • A. For each $j$ set $x_{j}^{\prime\prime}=0$
  6. Output feasible solution $x^{\prime\prime}$
In words, the algorithm alters the rounded solution $x^{\prime}$ as follows. If exactly one big item is chosen in $x^{\prime}$ then the algorithm retains that item and rejects all the other (small) items. Otherwise, the algorithm rejects all items if two or more big items are chosen in $x^{\prime}$ or if the total size of the small items chosen in $x^{\prime}$ exceeds the capacity.
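A short Python sketch of this knapsack rounding (illustrative; `x` is assumed to be a feasible fractional solution and the function name is our own):

```python
import random

def round_knapsack(a, w, x, seed=None):
    """a: item sizes (capacity scaled to 1), w: weights, x: fractional solution with sum_i a_i x_i <= 1.
    Returns a feasible 0/1 vector produced by rounding with alteration."""
    rng = random.Random(seed)
    n = len(a)
    # Phase 1: independent rounding with probabilities x_i / 4.
    xp = [1 if rng.random() < x[i] / 4 else 0 for i in range(n)]
    chosen_big = [i for i in range(n) if a[i] > 0.5 and xp[i] == 1]
    small_load = sum(a[i] * xp[i] for i in range(n) if a[i] <= 0.5)
    # Phase 2: alteration.
    if len(chosen_big) == 1:
        keep = chosen_big[0]
        return [1 if i == keep else 0 for i in range(n)]
    if small_load > 1 or len(chosen_big) >= 2:
        return [0] * n
    return xp
```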
The following claim is easy to verify.
Claim 4.4.1. The integer solution $x^{\prime\prime}$ is feasible.
Now let us analyze the probability of an item $i$ being present in the final solution. Let $\mathcal{E}_{1}$ be the event that $\sum_{i \in S} a_{i} x_{i}^{\prime}>1$, that is, the total size of the small items chosen in $x^{\prime}$ exceeds the capacity. Let $\mathcal{E}_{2}$ be the event that at least one big item is chosen in $x^{\prime}$.
Claim 4.4.2. $\mathbf{P}[\mathcal{E}_{1}] \leq 1/4$.
Proof. Let $X_{s}=\sum_{i \in S} a_{i} x_{i}^{\prime}$ be the random variable that measures the total size of the small items chosen. We have, by linearity of expectation, that
$$\mathbb{E}[X_{s}]=\sum_{i \in S} a_{i} \mathbb{E}[x_{i}^{\prime}]=\sum_{i \in S} a_{i} x_{i}/4 \leq 1/4.$$
By Markov's inequality, $\mathbf{P}[X_{s}>1] \leq \mathbb{E}[X_{s}]/1 \leq 1/4$.
Claim 4.4.3. $\mathbf{P}[\mathcal{E}_{2}] \leq 1/2$.
Proof. Since the size of each big item in $B$ is at least $1/2$, we have $1 \geq \sum_{i \in B} a_{i} x_{i} \geq \sum_{i \in B} x_{i}/2$. Therefore $\sum_{i \in B} x_{i}/4 \leq 1/2$. Event $\mathcal{E}_{2}$ happens if some item $i \in B$ is chosen in the random selection. Since $i$ is chosen with probability $x_{i}/4$, by the union bound, $\mathbf{P}[\mathcal{E}_{2}] \leq \sum_{i \in B} x_{i}/4 \leq 1/2$.
Lemma 4.2. Let $Z_{i}$ be the indicator random variable that is $1$ if $x_{i}^{\prime\prime}=1$ and $0$ otherwise. Then $\mathbb{E}[Z_{i}]=\mathbf{P}[Z_{i}=1] \geq x_{i}/16$.
Proof. We consider the binary random variable $X_{i}$ which is $1$ if $x_{i}^{\prime}=1$. We have $\mathbb{E}[X_{i}]=\mathbf{P}[X_{i}=1]=x_{i}/4$. We write
$$\mathbf{P}[Z_{i}=1]=\mathbf{P}[X_{i}=1] \cdot \mathbf{P}[Z_{i}=1 \mid X_{i}=1]=\frac{x_{i}}{4} \mathbf{P}[Z_{i}=1 \mid X_{i}=1].$$
To lower bound $\mathbf{P}[Z_{i}=1 \mid X_{i}=1]$ we upper bound the probability $\mathbf{P}[Z_{i}=0 \mid X_{i}=1]$, that is, the probability that we reject $i$ conditioned on the fact that it is chosen in the random solution $x^{\prime}$.
First consider a big item $i$ that is chosen in $x^{\prime}$. Then $i$ is rejected iff another big item is also chosen in $x^{\prime}$, and the probability of this is at most $\mathbf{P}[\mathcal{E}_{2}]$. If item $i$ is small then it is rejected only if $\mathcal{E}_{1}$ happens or if a big item is chosen, which is the event $\mathcal{E}_{2}$. In either case
$$\mathbf{P}[Z_{i}=0 \mid X_{i}=1] \leq \mathbf{P}[\mathcal{E}_{1}]+\mathbf{P}[\mathcal{E}_{2}] \leq 1/4+1/2=3/4.$$
Thus,
$$\mathbf{P}[Z_{i}=1]=\mathbf{P}[X_{i}=1] \cdot \mathbf{P}[Z_{i}=1 \mid X_{i}=1]=\frac{x_{i}}{4}\left(1-\mathbf{P}[Z_{i}=0 \mid X_{i}=1]\right) \geq \frac{x_{i}}{16}.$$
One can improve the above analysis to show that $\mathbf{P}[Z_{i}=1] \geq x_{i}/8$.
Theorem 4.9. The randomized algorithm outputs a feasible solution of expected weight at least $\sum_{i=1}^{n} w_{i} x_{i}/16$.
Proof. The expected weight of the output is
$$\mathbb{E}\left[\sum_{i} w_{i} x_{i}^{\prime\prime}\right]=\sum_{i} w_{i} \mathbb{E}[Z_{i}] \geq \sum_{i} w_{i} x_{i}/16,$$
where we used the previous lemma to lower bound $\mathbb{E}[Z_{i}]$.
Rounding for $k$-sparse PIPs: We now extend the rounding algorithm and analysis above to $k$-sparse PIPs. Let $x$ be a feasible fractional solution to $\max\{wx \mid Ax \leq 1,\ x \in[0,1]^{n}\}$. For a column index $i$ we let $N(i)=\{j \mid A_{j,i}>0\}$ be the set of indices of the rows in which column $i$ has a non-zero entry. Since $A$ is $k$-column-sparse we have $|N(i)| \leq k$ for $1 \leq i \leq n$. When we have more than one constraint we cannot classify an item/index $i$ as big or small, since it may be big for some constraints and small for others. We say that $i$ is small for constraint $j \in N(i)$ if $A_{j,i} \leq 1/2$; otherwise $i$ is big for constraint $j$. Let $S_{j}=\{i \mid j \in N(i) \text{ and } i \text{ is small for } j\}$ be the set of all columns that are small for $j$, and let $B_{j}=\{i \mid j \in N(i) \text{ and } i \text{ is big for } j\}$ be the set of all columns that are big for $j$. Note that $S_{j} \cup B_{j}$ is the set of all $i$ with $A_{j,i}>0$.
Rounding-With-Alteration for $k$-sparse PIPs
  1. Let $x$ be an optimum fractional solution
  2. Round each $i$ to $1$ independently with probability $x_{i}/(4k)$. Let $x^{\prime}$ be the rounded solution.
  3. $x^{\prime\prime} = x^{\prime}$
  4. For $j=1$ to $m$ do
    • A. If ($x_{i}^{\prime}=1$ for exactly one $i \in B_{j}$) then for each $h \in S_{j} \cup B_{j}$ with $h \neq i$ set $x_{h}^{\prime\prime}=0$
    • B. Else if ($\sum_{i \in S_{j}} A_{j,i} x_{i}^{\prime}>1$ or two or more items from $B_{j}$ are chosen in $x^{\prime}$) then for each $h \in S_{j} \cup B_{j}$ set $x_{h}^{\prime\prime}=0$
  5. Output feasible solution $x^{\prime\prime}$
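A Python sketch of the $k$-sparse rounding, processing constraints one at a time as in the pseudocode above (illustrative; the fractional solution `x` is assumed given):

```python
import random

def round_ksparse_pip(A, x, seed=None):
    """A: m x n matrix (list of rows) with entries in [0,1]; x: fractional solution to Ax <= 1.
    Returns a 0/1 vector x'' with A x'' <= 1, obtained by rounding with alteration."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Column sparsity k (at least 1 to avoid division by zero in degenerate cases).
    k = max(1, max(sum(1 for j in range(m) if A[j][i] > 0) for i in range(n)))
    # Phase 1: independent rounding with probabilities x_i / (4k).
    xp = [1 if rng.random() < x[i] / (4 * k) else 0 for i in range(n)]
    xpp = list(xp)
    # Phase 2: alteration, one constraint at a time.
    for j in range(m):
        Sj = [i for i in range(n) if 0 < A[j][i] <= 0.5]
        Bj = [i for i in range(n) if A[j][i] > 0.5]
        chosen_big = [i for i in Bj if xp[i] == 1]
        small_load = sum(A[j][i] * xp[i] for i in Sj)
        if len(chosen_big) == 1:
            keep = chosen_big[0]
            for h in Sj + Bj:
                if h != keep:
                    xpp[h] = 0
        elif small_load > 1 or len(chosen_big) >= 2:
            for h in Sj + Bj:
                xpp[h] = 0
    return xpp
```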
The algorithm, after picking the random solution $x^{\prime}$, alters it as follows: it applies the previous algorithm's strategy to each constraint $j$ separately. Thus an element $i$ can be rejected at different constraints $j \in N(i)$. We need to bound the total probability of rejection. As before, the following claim is easy to verify.
Claim 4.4.4. The integer solution $x^{\prime\prime}$ is feasible.
Now let us analyze the probability of an item $i$ being present in the final solution. Let $\mathcal{E}_{1}(j)$ be the event that $\sum_{i \in S_{j}} A_{j,i} x_{i}^{\prime}>1$, that is, the total size of the items that are small for $j$ and chosen in $x^{\prime}$ exceeds the capacity of constraint $j$. Let $\mathcal{E}_{2}(j)$ be the event that at least one item that is big for $j$ is chosen in $x^{\prime}$. The following claims follow from the same reasoning as before, with the only change being the scaling factor.
Claim 4.4.5. $\mathbf{P}[\mathcal{E}_{1}(j)] \leq 1/(4k)$.
Claim 4.4.6. $\mathbf{P}[\mathcal{E}_{2}(j)] \leq 1/(2k)$.
Lemma 4.3. Let $Z_{i}$ be the indicator random variable that is $1$ if $x_{i}^{\prime\prime}=1$ and $0$ otherwise. Then $\mathbb{E}[Z_{i}]=\mathbf{P}[Z_{i}=1] \geq x_{i}/(16k)$.
Proof. We consider the binary random variable $X_{i}$ which is $1$ if $x_{i}^{\prime}=1$ after the randomized rounding. We have $\mathbb{E}[X_{i}]=\mathbf{P}[X_{i}=1]=x_{i}/(4k)$. We write
$$\mathbf{P}[Z_{i}=1]=\mathbf{P}[X_{i}=1] \cdot \mathbf{P}[Z_{i}=1 \mid X_{i}=1]=\frac{x_{i}}{4k} \mathbf{P}[Z_{i}=1 \mid X_{i}=1].$$
We upper bound the probability $\mathbf{P}[Z_{i}=0 \mid X_{i}=1]$, that is, the probability that we reject $i$ conditioned on the fact that it is chosen in the random solution $x^{\prime}$. We observe that
$$\mathbf{P}[Z_{i}=0 \mid X_{i}=1] \leq \sum_{j \in N(i)}\left(\mathbf{P}[\mathcal{E}_{1}(j)]+\mathbf{P}[\mathcal{E}_{2}(j)]\right) \leq k\left(1/(4k)+1/(2k)\right) \leq 3/4.$$
Here we used the fact that $|N(i)| \leq k$ and the claims above. Therefore,
$$\mathbf{P}[Z_{i}=1]=\mathbf{P}[X_{i}=1] \cdot \mathbf{P}[Z_{i}=1 \mid X_{i}=1]=\frac{x_{i}}{4k}\left(1-\mathbf{P}[Z_{i}=0 \mid X_{i}=1]\right) \geq \frac{x_{i}}{16k}.$$
The theorem below follows by using the above lemma and linearity of expectation to compare the expected weight of the output of the randomized algorithm with that of the fractional solution.
Theorem 4.10. The randomized algorithm outputs a feasible solution of expected weight at least $\sum_{i=1}^{n} w_{i} x_{i}/(16k)$. Thus there is a $1/(16k)$-approximation for $k$-sparse PIPs.
Larger width helps: We saw during the discussion of the Knapsack problem that if all items are small with respect to the capacity constraint then one can obtain better approximations. For PIPs we defined the width of a given instance to be $W$ if $\max_{i, j} A_{ij}/b_{i} \leq 1/W$; in other words, no single item takes up more than a $1/W$ fraction of the capacity of any constraint. One can show, using a very similar algorithm and analysis as above, that the approximation bound improves to $\Omega\left(1/k^{1/\lfloor W \rfloor}\right)$ for instances with width $W$. Thus if $W=2$ we get an $\Omega(1/\sqrt{k})$-approximation instead of an $\Omega(1/k)$-approximation. More generally, when $W \geq c \log k/\epsilon$ for some sufficiently large constant $c$, we can get a $(1-\epsilon)$-approximation. Thus, in the setting with multiple knapsack constraints, the right notion of being small with respect to the capacities is that in each constraint the size of the item is at most $\frac{\epsilon}{c \log k}$ times the capacity of that constraint.

  1. Johan Håstad. “Clique is hard to approximate within $n^{1-\epsilon}$”. In: Proceedings of the 37th Conference on Foundations of Computer Science. IEEE, 1996, pp. 627–636.
  2. Karhan Akcoglu, James Aspnes, Bhaskar DasGupta, and Ming-Yang Kao. “Opportunity cost algorithms for combinatorial auctions”. In: Computational Methods in Decision-Making, Economics and Finance. Springer, 2002, pp. 455–479.
  3. Yuli Ye and Allan Borodin. “Elimination graphs”. In: ACM Transactions on Algorithms (TALG) 8.2 (2012), pp. 1–23.
  4. Moran Feldman, Joseph Seffi Naor, Roy Schwartz, and Justin Ward. “Improved approximations for k-exchange systems”. In: European Symposium on Algorithms. Springer, 2011, pp. 784–798.
  5. Julián Mestre. “Greedy in approximation algorithms”. In: European Symposium on Algorithms. Springer, 2006, pp. 528–539.
  6. Nikhil Bansal, Nitish Korula, Viswanath Nagarajan, and Aravind Srinivasan. “Solving packing integer programs via randomized rounding with alterations”. In: Theory of Computing 8.1 (2012), pp. 533–565.
