Approximation Algorithms: Steiner Forest Problem

This is the lecture notes from Chandra Chekuri's CS583 course on Approximation Algorithms. Chapter 11: Steiner Forest Problem.
You can read Chapter 12: Primal Dual for Constrained Forest Problems, here. Chapter 10: Introduction to Network Design, here.
Chapter 11
We discuss a primal-dual based 2-approximation for the Steiner Forest problem[1]. In the Steiner Forest problem there is a graph where each edge has a cost . We are given pairs of vertices . The goal is to find the minimum cost set of edges such that in the graph the vertices and are in the same connected component for . We refer to the set as terminals. Notice that the graph can contain multiple connected components.
One can see that Steiner Forest generalizes the Steiner Tree problem. It is, however, much less easy to obtain a constant factor approximation for Steiner Forest. A natural greedy heuristic, similar to that for the online Steiner Tree problem is the following. We order the the pairs in some arbitrary fashion, say as to without loss of generality. We maintain a forest of edges that we have already bought. When considering pair we find a shortest path from to in the graph in which we reduce the cost of each edge in to (since we already paid for them). We add the new edges in to . Although simple to describe, the algorithm is not straight forward to analyze. In fact the best bound we have on the performance of this algorithm is (note that the corresponding bound for Steiner Tree tree is ) while the lower bound on its peformance is only . The analysis of the upper bound requires appealing to certain extremal results [2] and closing the gap between the upper and lower bound on the analysis of this simplest of greedy algorithms is a very interesting open problem.
The first constant factor approximation for Steiner Forest is due to the influential work of Agarwal, Klein and Ravi [3] who gave a primal-dual 2-approximation which is still the best known. Their primal-dual algorithm has since been generalized for a wide variety of network design problems via the work of Goemans and Williamson (see their survey [4]) and several others. Steiner Forest has the advantage that one can visualize the algorithm and analysis more easily when compared to the more abstract settings that we will see shortly.
We now describe an integer programming formulation for the problem. In the IP we will have a variable for each edge such that is if and only if is part of the solution. Let the set be the collection of all sets such that for some . For a set let denote the set of edges crossing the cut . The IP can be written as the following.
We can obtain an LP-relaxation by changing the constraint that to . The dual of the LP-relaxation can be written as the following.
Before we continue, some definitions will be stated which will help to define our algorithm for the problem.
Definition 11.1. Given a set of edges , a set is violated with respect to if . In other words is violated even with edge set included.
Definition 11.2. Given a set of edges , a set is minimally violated with respect to if is violated with respect to and there is no that is also violated with respect to .
Next we show that any two minimally violated sets are disjoint.
Claim 11.0.1. if and are minimally violated sets then , i.e. and are disjoint.
In fact we will prove the following claim which implies the preceding one.
Claim 11.0.2. Let . The minimially violated sets with respect to are the connected components of the graph that are violated with respect to .
Proof. Consider a minimal violated set . We may assume the set contains but not for some . If is not a connected component of then there must be some connected component of that contains . But , and hence is violated; this contradicts the fact that is minimal. Therefore, if a set is a minimal violated set then it must be connected in the graph .
Now suppose that is a connected component of ; it is easy to see that no proper subset of can be violated since some edge will cross any such set. Thus, if is a violated set then it is minimal violated set.
Thus, the minimal violated sets with respect to are the conected components of the graph that are themselves violated sets. It follows that any two distinct minimal violated sets are disjoint.
The primal-dual algorithm for Steiner Forest is described below.
SteinerForest: |
The first thing to notice about the algorithm above is that it is closely related to our solution to the Vertex Cover problem, however, there are two main differences. In the Vertex Cover we raised the dual variables for all uncovered edges uniformly, however, in this algorithm we were more careful on which dual variables are raised. In this algorithm, we chose to only raise the variables which correspond to the minimally violated sets. Unlike the case of Steiner Tree, in Steiner Forest, there can be non-trivial connected components that are not violated and hence become inactive. A temporarily inactive component may become part of an active component later if an active component merges with it. The other main difference is that when we finally output the solution, we prune to get . This is done for technical reasons, but the intuition is that we should include no edge in the solution which is not needed to obtain a feasible solution. To understand this algorithm, there is a non-trivial example in the textbook [5] that demonstrates the algorithm's finer points.
Lemma 11.1. At the end of the algorithm, and are primal and dual feasible solutions, respectively.
Proof. In each iteration of the while loop, only the dual variables corresponding to connected components were raised. Therefore, no edge that is contained within the same component can become tight, and, therefore, is acyclic. To see that none of the dual constraints are violated, observe that when a constraint becomes tight (that is, it holds with equality), the corresponding edge is added to . Subsequently, since is contained in some connected component of , no set with ever has raised. Therefore, the constraint for cannot be violated, and so is dual feasible.
As long as is not feasible, the while loop will not terminate, and there are some minimal violated sets that can have their dual variables raised. Therefore, at the end of the algorithm is feasible. Moreover, since is acyclic (it is a forest), there is a unique path in for each . Thus, each edge on a path is not redundant and is not deleted when pruning to get .
Theorem 11.3. The primal-dual algorithm for Steiner Forest gives a -approximation.
Proof. Let be the output from our algorithm. To prove this theorem, we want to show that where is the feasible dual constructed by the algorithm. It follows from this that the algorithm is in fact a 2-approximation. First, we know that because every edge picked is tight. Let denote the number of edges of that cross the cut . It can be seen that .
Let contain the minimally violated sets in iteration and let denote the amount of dual growth in the th iteration. Say that our algorithm runs for iterations. We can then rewrite as the double summation . In the next lemma it will be shown for any iteration that . Knowing this we can prove the theorem:
Now we show the lemma used in the previous theorem. It is in this lemma that we use the fact that we prune to get .
We need a simple claim.
Lemma 11.2. Let be a tree/forest and let be a subset of the nodes such that every leaf of is in . Then where is the degree of in .
Proof. We will prove it for a tree. We have since a tree has edges. Every node has degree at least 2 since it is not a leaf. Thus . Thus,
Lemma 11.3. For any iteration of our algorithm, .
Proof. Consider the graph , and fix an iteration . In this graph, contract each set active in iteration to a single node (call such a node an active node), and each inactive set to a single node. Let the resulting graph be denoted by . We know that is a forest and we have contracted connected subsets of vertices in ; as , we conclude that is also a forest.
Claim 11.0.3. Every leaf of is an active node.
Proof. If not, consider leaf node of which is an inactive node and let be the edge incident to it. We claim that is feasible which would contradict the minimality of . To see this, if are two nodes in where then and are connected also in since is a leaf. Thus the utility of is to connect to other nodes in but if this is the case would be an active node at the start of the iteration which is not the case.
The degree in of an active node corresponding to violated set is . Now we apply Lemma 11.2.
- Parts of this chapter are based on past scribed lecture notes by Ben Moseley from 2009. ↩︎
- Baruch Awerbuch, Yossi Azar, and Yair Bartal. “On-line generalized Steiner problem”. In: Theoretical Computer Science 324.2-3 (2004), pp. 313–324. ↩︎
- Ajit Agrawal, Philip Klein, and Ramamoorthi Ravi. “When trees collide: An approximation algorithm for the generalized Steiner problem on networks”. In: SIAM journal on Computing 24.3 (1995), pp. 440–456. ↩︎
- Michel X Goemans and David P Williamson. “The primal-dual method for approximation algorithms and its application to network design problems”. In: Approximation algorithms for NP-hard problems (1997), pp. 144–191. ↩︎
- Vijay V Vazirani. Approximation algorithms. Springer Science & Business Media, 2013. ↩︎
Recommended for you
Co-Tuning: An easy but effective trick to improve transfer learning
Co-Tuning: An easy but effective trick to improve transfer learning
Transfer learning is a popular method in the deep learning community, but it is usually implemented naively (eg. copying weights as initialization). Co-Tuning is a recently proposed technique to improve transfer learning that is easy to implement, and effective to a wide variety of tasks.
