Approximation Algorithms: Introduction

These are the lecture notes from Chandra Chekuri's CS583 course on Approximation Algorithms. This is Chapter 1: Introduction; Chapter 2 covers covering problems.
Course Objectives
  1. To appreciate that not all intractable problems are the same. $\mathbf{NP}$ optimization problems, identical in terms of exact solvability, can appear very different from the approximation point of view. This sheds light on why, in practice, some optimization problems (such as Knapsack) are easy, while others (like Clique) are extremely difficult.
  2. To learn techniques for design and analysis of approximation algorithms, via some fundamental problems.
  3. To build a toolkit of broadly applicable algorithms/heuristics that can be used to solve a variety of problems.
  4. To understand reductions between optimization problems, and to develop the ability to relate new problems to known ones.
The complexity class $\mathbf{P}$ contains the set of problems that can be solved in polynomial time. From a theoretical viewpoint, this describes the class of tractable problems, that is, problems that can be solved efficiently. The class $\mathbf{NP}$ is the set of problems that can be solved in non-deterministic polynomial time, or equivalently, problems for which a solution can be verified in polynomial time. $\mathbf{NP}$ contains many interesting problems that often arise in practice, but there is good reason to believe $\mathbf{P} \neq \mathbf{NP}$. That is, it is unlikely that there exist algorithms to solve $\mathbf{NP}$ optimization problems efficiently, and so we often resort to heuristic methods to solve these problems.
Heuristic approaches include backtrack search and its variants, mathematical programming methods, local search, genetic algorithms, tabu search, simulated annealing, etc. Some methods are guaranteed to find an optimal solution, though they may take exponential time; others are guaranteed to run in polynomial time, though they may not return an optimal solution. Approximation algorithms are (typically) polynomial-time heuristics that do not always find an optimal solution, but they are distinguished from general heuristics by providing guarantees on the quality of the solution they output.
Approximation Ratio: To give a guarantee on solution quality, one must first define what we mean by the quality of a solution. We discuss this more carefully later. For now, note that each instance $I$ of an optimization problem has a set of feasible solutions. The optimization problems we consider have an objective function which assigns a (real/rational) value to each feasible solution of each instance $I$. The goal is to find a feasible solution with minimum objective function value or maximum objective function value. The former problems are minimization problems and the latter are maximization problems.
For each instance $I$ of a problem, let $\mathrm{OPT}(I)$ denote the value of an optimal solution to instance $I$. We say that an algorithm $\mathcal{A}$ is an $\alpha$-approximation algorithm for a problem if, for every instance $I$, the value of the feasible solution returned by $\mathcal{A}$ is within a (multiplicative) factor of $\alpha$ of $\mathrm{OPT}(I)$. Equivalently, we say that $\mathcal{A}$ is an approximation algorithm with approximation ratio $\alpha$. For a minimization problem we would have $\alpha \geq 1$ and for a maximization problem we would have $\alpha \leq 1$. However, it is not uncommon to find in the literature a different convention for maximization problems, where one says that $\mathcal{A}$ is an $\alpha$-approximation algorithm if the value of the feasible solution returned by $\mathcal{A}$ is at least $\frac{1}{\alpha} \cdot \mathrm{OPT}(I)$; the reason for this convention is that approximation ratios for both minimization and maximization problems are then $\geq 1$. In this course we will for the most part use the convention that $\alpha \geq 1$ for minimization problems and $\alpha \leq 1$ for maximization problems.
Remarks:
  1. The approximation ratio of an algorithm for a minimization problem is the maximum (or supremum), over all instances of the problem, of the ratio between the value of the solution returned by the algorithm and the value of an optimal solution. Thus, it is a bound on the worst-case performance of the algorithm.
  2. The approximation ratio $\alpha$ can depend on the size of the instance $I$, so one should technically write $\alpha(|I|)$.
  3. A natural question is whether the approximation ratio should be defined in an additive sense. For example, an algorithm is an $\alpha$-approximation for a minimization problem if it outputs a feasible solution of value at most $\mathrm{OPT}(I)+\alpha$ for all $I$. This is a valid definition and is the more relevant one in some settings. However, for many $\mathbf{NP}$ problems it is easy to show that one cannot obtain any interesting additive approximation (unless of course $P=NP$) due to scaling issues. We will illustrate this via an example later.
Pros and cons of the approximation approach: Some advantages to the approximation approach include:
  1. It explains why problems can vary considerably in difficulty.
  2. The analysis of problems and problem instances distinguishes easy cases from difficult ones.
  3. The worst-case ratio is robust in many ways. It allows reductions between problems.
  4. Ideas, tools, and relaxations from approximation algorithms are valuable in developing heuristics, including many that are practical and effective.
  5. Quantification of performance via a concrete metric such as the approximation ratio allows for innovation in algorithm design and has led to many new ideas.
As a bonus, many of the ideas are beautiful and sophisticated, and involve connections to other areas of mathematics and computer science.
    Disadvantages include:
  1. The focus on worst-case measures risks ignoring algorithms or heuristics that are practical or perform well on average.
  2. Unlike, for example, integer programming, there is often no incremental/continuous tradeoff between the running time and quality of solution.
  3. Approximation algorithms are often limited to cleanly stated problems.
  4. The framework does not (at least directly) apply to decision problems or those that are inapproximable.
Approximation as a broad lens
The use of approximation algorithms is not restricted solely to $\mathbf{NP}$-Hard optimization problems. In general, ideas from approximation can be used to solve many problems where finding an exact solution would require too much of any resource.
A resource we are often concerned with is time. Solving $\mathbf{NP}$-Hard problems exactly would (to the best of our knowledge) require exponential time, and so we may want to use approximation algorithms. However, for large data sets, even polynomial running time is sometimes unacceptable. As an example, the best exact algorithm known for the Matching problem in general graphs requires $O(m \sqrt{n})$ time; on large graphs, this may not be practical. In contrast, a simple greedy algorithm runs in near-linear time and outputs a matching of cardinality at least $1/2$ that of a maximum matching; moreover, there are randomized sub-linear time algorithms as well.
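The greedy $1/2$-approximation mentioned above is simple enough to sketch (in Python; the function name and edge-list interface here are ours, not from the notes):

```python
def greedy_matching(n, edges):
    """Scan the edges in arbitrary order, adding an edge whenever both of
    its endpoints are still unmatched. The result is a maximal matching,
    whose size is at least half the size of a maximum matching."""
    matched = [False] * n       # matched[v]: is vertex v already covered?
    matching = []
    for u, v in edges:
        if not matched[u] and not matched[v]:
            matching.append((u, v))
            matched[u] = matched[v] = True
    return matching
```

The $1/2$ guarantee follows because each edge of the greedy matching can "block" at most two edges of a maximum matching, and every optimal edge must share an endpoint with some greedy edge (otherwise greedy would have taken it).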
Another often limited resource is space. In the area of data streams/streaming algorithms, we are often allowed to read the input in only a single pass and are given a small amount of additional storage space. Consider a network switch that wishes to compute statistics about the packets that pass through it. It is easy to exactly compute the average packet length, but one cannot compute the median length exactly in small space. Surprisingly, though, many statistics can be approximately computed.
Other resources include programmer time (as for the Matching problem, the exact algorithm may be significantly more complex than one that returns an approximate solution), or communication requirements (for instance, if the computation is occurring across multiple locations).

1.1 Formal Aspects

1.1.1 NP Optimization Problems

In this section, we cover some formal definitions related to approximation algorithms. We start with the definition of optimization problems. A problem is simply an infinite collection of instances. Let $\Pi$ be an optimization problem. $\Pi$ can be either a minimization or maximization problem. Instances $I$ of $\Pi$ are a subset of $\Sigma^{*}$ where $\Sigma$ is a finite encoding alphabet. For each instance $I$ there is a set of feasible solutions $\mathcal{S}(I)$. We restrict our attention to real/rational-valued optimization problems; in these problems each feasible solution $S \in \mathcal{S}(I)$ has a value $\mathrm{val}(S, I)$. For a minimization problem $\Pi$ the goal is, given $I$, to find $\mathrm{OPT}(I)=\min_{S \in \mathcal{S}(I)} \mathrm{val}(S, I)$.
Now let us formally define $\mathbf{NP}$ optimization ($\mathbf{NPO}$), which is the class of optimization problems corresponding to $\mathbf{NP}$.
Definition 1.1. $\Pi$ is in $\mathbf{NPO}$ if
  • Given $x \in \Sigma^{*}$, there is a polynomial-time algorithm that decides if $x$ is a valid instance of $\Pi$. That is, we can efficiently check if the input string is well-formed. This is a basic requirement that is often not spelled out.
  • For each $I$ and each $S \in \mathcal{S}(I)$, $|S| \leq \mathrm{poly}(|I|)$. That is, solutions are of size polynomial in the input size.
  • There exists a polynomial-time decision procedure that, for each $I$ and $S \in \Sigma^{*}$, decides if $S \in \mathcal{S}(I)$. This is the key property of $\mathbf{NP}$: we should be able to verify solutions efficiently.
  • $\mathrm{val}(S, I)$ is a polynomial-time computable function.
We observe that for a minimization $\mathbf{NPO}$ problem $\Pi$, there is an associated natural decision problem $L(\Pi)=\{(I, B): \mathrm{OPT}(I) \leq B\}$, which asks: given an instance $I$ of $\Pi$ and a number $B$, is the optimal value on $I$ at most $B$? For a maximization problem $\Pi$ we reverse the inequality in the definition.
Lemma 1.1. $L(\Pi)$ is in $\mathbf{NP}$ if $\Pi$ is in $\mathbf{NPO}$.

1.1.2 Relative Approximation

When $\Pi$ is a minimization problem, recall that an approximation algorithm $\mathcal{A}$ is said to have approximation ratio $\alpha$ iff
  • $\mathcal{A}$ is a polynomial-time algorithm
  • for every instance $I$ of $\Pi$, $\mathcal{A}$ produces a feasible solution $\mathcal{A}(I)$ such that $\mathrm{val}(\mathcal{A}(I), I) \leq \alpha \cdot \mathrm{OPT}(I)$. (Note that $\alpha \geq 1$.)
Approximation algorithms for maximization problems are defined similarly. An approximation algorithm $\mathcal{A}$ is said to have approximation ratio $\alpha$ iff
  • $\mathcal{A}$ is a polynomial-time algorithm
  • for every instance $I$ of $\Pi$, $\mathcal{A}$ produces a feasible solution $\mathcal{A}(I)$ such that $\mathrm{val}(\mathcal{A}(I), I) \geq \alpha \cdot \mathrm{OPT}(I)$. (Note that $\alpha \leq 1$.)
For maximization problems, it is also common to use $1/\alpha$ (which must be $\geq 1$) as the approximation ratio.

1.1.3 Additive Approximation

Note that all the definitions above are about relative approximations; one could also define additive approximations. $\mathcal{A}$ is said to be an $\alpha$-additive approximation algorithm if, for all $I$, $\mathrm{val}(\mathcal{A}(I), I) \leq \mathrm{OPT}(I)+\alpha$. Most $\mathbf{NPO}$ problems, however, do not allow any additive approximation ratio because $\mathrm{OPT}(I)$ has a scaling property.
To illustrate the scaling property, let us consider Metric-TSP. Given an instance $I$, let $I_{\beta}$ denote the instance obtained by increasing all edge costs by a factor of $\beta$. It is easy to observe that for each $S \in \mathcal{S}(I)=\mathcal{S}(I_{\beta})$, $\mathrm{val}(S, I_{\beta})=\beta \, \mathrm{val}(S, I)$ and $\mathrm{OPT}(I_{\beta})=\beta \, \mathrm{OPT}(I)$. Intuitively, scaling the edge costs by a factor of $\beta$ scales the value of every solution by the same factor $\beta$. Thus by choosing $\beta$ sufficiently large, we can essentially make the additive approximation (or error) negligible.
Lemma 1.2. Metric-TSP does not admit an $\alpha$-additive approximation algorithm for any polynomial-time computable $\alpha$ unless $P=NP$.
Proof. For simplicity, suppose every edge has integer cost. For the sake of contradiction, suppose there exists an $\alpha$-additive approximation $\mathcal{A}$ for Metric-TSP. Given $I$, we run the algorithm on $I_{\beta}$ with $\beta=2\alpha$, and let $S$ be the returned solution. We claim that $S$ is an optimal solution for $I$. We have $\mathrm{val}(S, I)=\mathrm{val}(S, I_{\beta})/\beta \leq \mathrm{OPT}(I_{\beta})/\beta+\alpha/\beta=\mathrm{OPT}(I)+1/2$, as $\mathcal{A}$ is an $\alpha$-additive approximation. Since $\mathrm{OPT}(I) \leq \mathrm{val}(S, I)$ and both $\mathrm{OPT}(I)$ and $\mathrm{val}(S, I)$ are integers, we conclude that $\mathrm{OPT}(I)=\mathrm{val}(S, I)$. Thus $\mathcal{A}$ solves Metric-TSP exactly in polynomial time, which is impossible unless $P=NP$.
Now let us consider two problems which allow additive approximations. In Planar Graph Coloring, we are given a planar graph $G=(V, E)$. We are asked to color all vertices of $G$ such that for every edge $vw \in E$, $v$ and $w$ have different colors. The goal is to minimize the number of colors used. It is known that deciding whether a planar graph admits a $3$-coloring is $\mathbf{NP}$-complete [1], while one can always color any planar graph $G$ using $4$ colors (the famous $4$-color theorem) [2][3]. Further, one can efficiently check whether a graph is $2$-colorable (that is, bipartite). Thus, the following algorithm is a $1$-additive approximation for Planar Graph Coloring: if the graph is bipartite, color it with $2$ colors; otherwise, color it with $4$ colors.
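The algorithm above reduces to a bipartiteness check via BFS $2$-coloring. A minimal sketch (the function name and adjacency-list interface are ours); the $4$-coloring branch only reports the color count, since actually constructing a $4$-coloring is far more involved:

```python
from collections import deque

def planar_color_count(n, adj):
    """Return 2 if the graph on vertices 0..n-1 is bipartite (and hence
    2-colorable), else 4 (a planar graph is always 4-colorable by the
    four-color theorem)."""
    color = [None] * n
    for s in range(n):
        if color[s] is not None:
            continue
        color[s] = 0                      # start a BFS 2-coloring here
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return 4              # odd cycle found: not bipartite
    return 2
```

Since the optimum is at least $3$ whenever the graph is not bipartite, the answer is always within an additive $1$ of the optimum.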
As a second example, consider the Edge Coloring problem, in which we are asked to color the edges of a given graph $G$ with the minimum number of colors so that no two adjacent edges receive the same color. By Vizing's theorem [4], we know that one can color the edges with either $\Delta(G)$ or $\Delta(G)+1$ colors, where $\Delta(G)$ is the maximum degree of $G$. Since $\Delta(G)$ is a trivial lower bound on the minimum number of colors, the Edge Coloring problem allows a $1$-additive approximation. Note that deciding whether a given graph can be edge-colored with $\Delta(G)$ colors is $\mathbf{NP}$-complete [5].

1.1.4 Hardness of Approximation

Now we move to hardness of approximation.
Definition 1.2 (Approximability Threshold). Given a minimization problem $\Pi$, we say that $\Pi$ has approximation threshold $\alpha^{*}(\Pi)$ if, for any $\epsilon>0$, $\Pi$ admits an $(\alpha^{*}(\Pi)+\epsilon)$-approximation, but if it admits an $(\alpha^{*}(\Pi)-\epsilon)$-approximation then $P=NP$.
If $\alpha^{*}(\Pi)=1$, then $\Pi$ can be approximated to within any factor $(1+\epsilon)$ in polynomial time. Many $\mathbf{NPO}$ problems $\Pi$ are known to have $\alpha^{*}(\Pi)>1$ assuming $P \neq NP$. One can say that approximation algorithms try to decrease the upper bound on $\alpha^{*}(\Pi)$, while hardness of approximation attempts to increase the lower bound on $\alpha^{*}(\Pi)$.
To prove hardness of approximation results for $\mathbf{NPO}$ problems, there are largely two approaches: a direct way, by reduction from $\mathbf{NP}$-complete problems, and an indirect way, via gap reductions. Here let us take a quick look at an example using a reduction from an $\mathbf{NP}$-complete problem.
In the (metric) $k$-center problem, we are given an undirected graph $G=(V, E)$ and an integer $k$. We are asked to choose a subset of $k$ vertices from $V$, called centers. The goal is to minimize the maximum distance to a center, i.e., $\min_{S \subseteq V, |S|=k} \max_{v \in V} \mathrm{dist}_{G}(v, S)$, where $\mathrm{dist}_{G}(v, S)=\min_{u \in S} \mathrm{dist}_{G}(u, v)$.
The $k$-center problem has approximation threshold $2$: there are several $2$-approximation algorithms for $k$-center, and there is no $(2-\epsilon)$-approximation algorithm for any $\epsilon>0$ unless $P=NP$. We can prove the inapproximability using a reduction from the decision version of Dominating Set: given an undirected graph $G=(V, E)$ and an integer $k$, does $G$ have a dominating set of size at most $k$? A set $S \subseteq V$ is said to be a dominating set in $G$ if every $v \in V$ is in $S$ or is adjacent to some $u \in S$. Dominating Set is known to be $\mathbf{NP}$-complete.
Theorem 1.3 ([6]). Unless $P=NP$, there is no $(2-\epsilon)$-approximation for $k$-center for any fixed $\epsilon>0$.
Proof. Let $I$ be an instance of Dominating Set consisting of a graph $G=(V, E)$ and an integer $k$. We create an instance $I'$ of $k$-center, keeping the graph $G$ and $k$ the same. If $I$ has a dominating set of size $k$ then $\mathrm{OPT}(I')=1$, since every vertex is within one hop of the dominating set. Otherwise, we claim that $\mathrm{OPT}(I') \geq 2$: if $\mathrm{OPT}(I')<2$, then every vertex is within distance $1$ of a center, which implies that the set of centers witnessing $\mathrm{OPT}(I')$ is a dominating set of size $k$ in $I$. Therefore, a $(2-\epsilon)$-approximation for $k$-center could distinguish the two cases and hence solve Dominating Set. This is impossible unless $P=NP$.
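One of the known $2$-approximations for $k$-center is the farthest-point greedy (due to Gonzalez): start with an arbitrary center and repeatedly add the point farthest from the current centers. A sketch over an explicit distance matrix (the interface is ours):

```python
def gonzalez_k_center(dist, k):
    """dist: symmetric matrix of pairwise distances satisfying the
    triangle inequality. Returns k center indices; the covering radius
    of the result is at most twice the optimal k-center radius."""
    n = len(dist)
    centers = [0]                 # arbitrary first center
    d = list(dist[0])             # d[v] = distance from v to nearest center
    while len(centers) < k:
        far = max(range(n), key=lambda v: d[v])   # farthest remaining point
        centers.append(far)
        d = [min(d[v], dist[far][v]) for v in range(n)]
    return centers
```

The analysis: if the output radius were more than $2 \cdot \mathrm{OPT}$, the $k$ chosen centers plus the farthest point would be $k+1$ points pairwise more than $2 \cdot \mathrm{OPT}$ apart, so two of them would have to share an optimal center, contradicting the triangle inequality.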

1.2 Designing Approximation Algorithms

How does one design and more importantly analyze the performance of approximation algorithms? This is a non-trivial task and the main goal of the course is to expose you to basic and advanced techniques as well as central problems. The purpose of this section is to give some high-level insights. We start with how we design polynomial-time algorithms. Note that approximation makes sense mainly in the setting where one can find a feasible solution relatively easily but finding an optimum solution is hard. In some cases finding a feasible solution itself may involve some non-trivial algorithm, in which case it is useful to properly understand the structural properties that guarantee feasibility, and then build upon it.
Some of the standard techniques we learn in basic and advanced undergraduate algorithms courses are recursion based methods such as divide and conquer, dynamic programming, greedy, local search, combinatorial optimization via duality, and reductions to existing problems. How do we adapt these to the approximation setting? Note that intractability implies that there are no efficient characterizations of the optimum solution value.
Greedy and related techniques are often fairly natural for many problems and simple heuristic algorithms often suggest themselves for many problems. (Note that the algorithms may depend on being able to solve some existing problem efficiently. Thus, knowing a good collection of general poly-time solvable problems is often important.) The main difficulty is in analyzing their performance. The key challenge here is to identify appropriate lower bounds on the optimal value (assuming that the problem is a minimization problem) or upper bounds on the optimal value (assuming that the problem is a maximization problem). These bounds allow one to compare the output of the algorithm and prove an approximation bound. In designing poly-time algorithms we often prove that greedy algorithms do not work. We typically do this via examples. This skill is also useful in proving that some candidate algorithm does not give a good approximation. Often the bad examples lead one to a new algorithm.
How does one come up with lower or upper bounds on the optimum value? This depends on the problem at hand and on knowing some background and related problems. However, one would like to find some automatic ways of obtaining bounds. These are often provided via linear programming relaxations and more advanced convex programming methods, including semi-definite programming, lift-and-project hierarchies, etc. The basic idea is quite simple. Since integer linear programming is $\mathbf{NP}$-complete, one can formulate most discrete optimization problems easily and "naturally" as an integer program. Note that there may be many different ways of expressing a given problem as an integer program. Of course we cannot solve the integer program, but we can solve the linear-programming relaxation, which is obtained by removing the integrality constraints on the variables. Thus, for each instance $I$ of a given problem we can obtain an LP relaxation $LP(I)$ which we can typically solve in polynomial time. This automatically gives a bound on the optimum value since it is a relaxation. How good is this bound? It depends on the problem, of course, and also on the specific LP relaxation. How do we obtain a feasible solution that is close to the bound given by the LP relaxation? The main technique here is to round the fractional solution $x$ to an integer feasible solution $x'$ such that $x'$'s value is close to that of $x$. There are several non-trivial rounding techniques that have been developed over the years that we will explore in the course. We should note that in several cases one can analyze combinatorial algorithms via LP relaxations even though the LP relaxation does not play any direct role in the algorithm itself.
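As a concrete illustration of rounding, consider the standard Vertex Cover LP, with a variable $x_v \in [0,1]$ per vertex and a constraint $x_u + x_v \geq 1$ per edge. Given any feasible fractional solution (produced, say, by an LP solver, which we omit here), picking every vertex with $x_v \geq 1/2$ yields a valid cover of size at most $2 \sum_v x_v$, hence a $2$-approximation. A sketch (the function name is ours):

```python
def round_vertex_cover(x, edges):
    """Threshold-round a feasible fractional solution x of the Vertex
    Cover LP. Since x[u] + x[v] >= 1 on every edge, at least one endpoint
    has x-value >= 1/2, so the rounded set covers every edge; its size is
    at most 2 * sum(x), i.e., at most twice the LP (and hence integer)
    optimum."""
    cover = {v for v, xv in enumerate(x) if xv >= 0.5}
    assert all(u in cover or v in cover for u, v in edges)  # sanity check
    return cover
```

On a triangle, the LP optimum assigns $x_v = 1/2$ everywhere (value $3/2$), and rounding picks all three vertices: within a factor $2$ of the integral optimum, which is $2$.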
Finally, there is the question of which LP LP LP\operatorname{LP} relaxation to use. Often it is required to "strengthen" an LP LP LP\operatorname{LP} relaxation via addition of constraints to provide better bounds. There are some automatic ways to strengthen any LP LP LP\operatorname{LP} and often one also needs problem specific ideas.
Local search is another powerful technique and the analysis here is not obvious. One needs to relate the value of a local optimum to the value of a global optimum via various exchange properties which define the local search heuristic. For a formal analysis it is necessary to have a good understanding of the problem structure.
Finally, dynamic programming plays a key role in the following way. Its main use is in solving to optimality a restricted version of the given problem, or as a subroutine that is useful as a building block. How does one obtain a restricted version? This is often done by some clever preprocessing of a given instance.
Reductions play a very important role in both designing approximation algorithms and in proving inapproximability results. Often reductions serve as a starting point in developing a simple and crude heuristic that allows one to understand the structure of a problem which then can lead to further improvements.
Discrete optimization problems are brittle: changing the problem a little can lead to substantial changes in the complexity and approximability. Nevertheless, it is useful to understand problems and their structure in broad categories so that existing results can be leveraged quickly and robustly. Thus, some of the emphasis in the course will be on classifying problems and on how various parameters influence the approximability.

  1. Larry Stockmeyer. "Planar 3-colorability is Polynomial Complete". In: SIGACT News 5.3 (July 1973), pp. 19–25. doi: 10.1145/1008293.1008294.
  2. Kenneth Appel, Wolfgang Haken, et al. "Every planar map is four colorable. Part I: Discharging". In: Illinois Journal of Mathematics 21.3 (1977), pp. 429–490.
  3. Robin Thomas. "An update on the four-color theorem". In: Notices of the AMS 45.7 (1998), pp. 848–859.
  4. Douglas Brent West et al. Introduction to Graph Theory. Vol. 2. Prentice Hall, Upper Saddle River, 2001.
  5. Ian Holyer. "The NP-completeness of edge-coloring". In: SIAM Journal on Computing 10.4 (1981), pp. 718–720.
  6. Wen-Lian Hsu and George L. Nemhauser. "Easy and hard bottleneck location problems". In: Discrete Applied Mathematics 1.3 (1979), pp. 209–215.
