Lecture 21. The greedy method. Minimum spanning trees. 11/14/97.

================================================================

NOTE: The figures referred to in this lecture and the following lectures will be provided as handouts in class.

===============================================================

21.1. Optimization problems. The greedy strategy. An optimization problem is one which requires us to minimize or maximize the value of some function. For example, we may want to find the shortest path between two vertices in a graph. Or we may be given a function relating profits to such variables as manufacturing costs, transportation costs, and advertising costs, and be asked to find values for the different costs to maximize the profits.

In calculus we learn that candidate maxima and minima for a differentiable function on an interval occur at endpoints of the interval and where the derivative is zero. For discrete problems it is often the case that an optimization problem can be solved by the simple strategy of always making the choice that looks best at a given time. This simple strategy is called, for obvious reasons, the greedy strategy.

Sometimes the greedy strategy of making locally optimal choices does not lead to a globally optimal solution. For example, we can define a greedy procedure for finding the maximum or minimum value of a function f(x) on an interval A <= x <= B as follows: choose any x in the interval. Choose a small step size delta. Compute f(x), f(x+delta), and f(x-delta). Use these values to determine whether to accept f(x) as the minimum or to try a new x' either to the right or left of x. It is easy to draw a function and to choose an initial value for x for which this "greedy" method gets stuck at a local minimum which is not the absolute minimum on the interval A <= x <= B. (Example: consider the fourth degree polynomial f(x) = 3x^4 - 4x^3 - 12x^2 on the interval -3 <= x <= 3. The minimum occurs at x = 2, where f(2) = -32, but if we start our greedy search at x = -0.5 and use delta = .01, we will find the local minimum at x = -1, where f(-1) = -5, instead.) In Algorithms II we will look at some other methods for trying to solve optimization problems. In this and the next lecture, however, we will look at a problem which can be solved by the greedy method, finding the minimum spanning tree of a graph.
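The polynomial example above can be checked with a short sketch. The function names (f, greedy_minimize) and the second starting point x = 0.5 are illustrative assumptions, not part of the lecture; the step rule is one reasonable reading of the procedure described.

```python
def f(x):
    """The example polynomial 3x^4 - 4x^3 - 12x^2."""
    return 3 * x**4 - 4 * x**3 - 12 * x**2


def greedy_minimize(f, x, delta, lo, hi):
    """Step left or right by delta while that lowers f; stop otherwise."""
    while True:
        # Neighbors of x one step away, kept inside the interval [lo, hi].
        neighbors = [y for y in (x - delta, x + delta) if lo <= y <= hi]
        best = min(neighbors, key=f)
        if f(best) >= f(x):
            return x  # no neighbor is lower: accept x as the minimum
        x = best


# Starting at -0.5 the search walks left and stops near x = -1.
x_stuck = greedy_minimize(f, -0.5, 0.01, -3.0, 3.0)
# Starting at 0.5 (an assumed second start) it walks right and stops near x = 2.
x_good = greedy_minimize(f, 0.5, 0.01, -3.0, 3.0)
print(round(x_stuck, 2), round(f(x_stuck), 2))
print(round(x_good, 2), round(f(x_good), 2))
```

The two runs differ only in the starting point, which is the sense in which the locally best choice need not be globally best.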

21.2. The minimum spanning tree problem. Suppose we are given a weighted graph G. That is, G = (V,E) and for each edge e in E there is given a weight W(e) >= 0. A spanning tree for G is a tree which is a subgraph of G and contains all the vertices of G. A minimum spanning tree (MST) is one for which the sum of the weights on its edges is <= the sum of the weights of the edges of any other spanning tree for G (Figure 1). Finding a MST is a problem that needs to be solved in many different situations. For example, we might want to connect several terminals in a circuit to form what is called a net; the weights of the connecting edges would be the lengths of the wires used to form the connections. We want to use the least amount of wire possible, i.e., we are looking for a MST.

Now clearly if G is not connected then G will not have a MST.

Note also that a graph G can have more than one MST (Figure 2).

Also note that if a graph G is connected then G must have a minimum spanning tree. This is because G will have at least one spanning tree (just remove edges of G one at a time until all cycles have been removed but the vertices are still connected). In fact, we could list all possible spanning trees of G. Then a minimum spanning tree is any one of these trees which has minimal weight.
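The "list all possible spanning trees" argument above can be tried directly on a small graph. The sketch below is a brute-force illustration only. Figure 2 is not reproduced in these notes, so the edge weights here are assumptions chosen to agree with the edge order used in the examples later in the lecture; the names WEIGHTS, is_spanning_tree, and all_spanning_trees are hypothetical.

```python
from itertools import combinations

# Assumed weights for a 5-vertex graph (not taken from Figure 2).
WEIGHTS = {(1, 5): 1, (2, 4): 2, (2, 3): 2, (3, 4): 2, (1, 2): 3, (4, 5): 4}
VERTICES = {1, 2, 3, 4, 5}


def is_spanning_tree(vertices, edges):
    """True if `edges` has |V| - 1 edges and connects every vertex."""
    if len(edges) != len(vertices) - 1:
        return False
    reached = {next(iter(vertices))}
    changed = True
    while changed:
        changed = False
        for u, w in edges:
            # An edge with exactly one reached endpoint extends the reached set.
            if (u in reached) != (w in reached):
                reached |= {u, w}
                changed = True
    return reached == vertices


def all_spanning_trees(vertices, weights):
    """Every subset of |V| - 1 edges that forms a spanning tree."""
    n = len(vertices)
    return [t for t in combinations(weights, n - 1)
            if is_spanning_tree(vertices, t)]


trees = all_spanning_trees(VERTICES, WEIGHTS)
best = min(sum(WEIGHTS[e] for e in t) for t in trees)
msts = [t for t in trees if sum(WEIGHTS[e] for e in t) == best]
print(len(trees), best, len(msts))
```

Enumeration like this grows very quickly with the number of edges, which is why the greedy algorithms that follow are of interest.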

We give two algorithms for finding a MST for a graph G. Both use a greedy strategy.

21.3. Prim's algorithm.

Suppose we are given G = (V,E).  We assume G is connected.  (If not, then
the algorithm will find a minimal spanning tree for the component we
happen to start in.)

Let B be the set of tree vertices, initially empty.

Let T be the set of tree edges, initially empty.

Choose any v in V.

Set B = B "UNION" {v}.

While B <> V do

    select the minimum weight edge (u,w) with u in V - B, w in B

    set T = T "UNION" {(u,w)}

    set B = B "UNION" {u}
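The steps above can be sketched in Python as follows. This is a minimal, unoptimized version that rescans all edges on every iteration; the edge weights are assumptions (Figure 2 is not reproduced here), and the names prim, WEIGHTS, and VERTICES are illustrative.

```python
# Assumed weights for a 5-vertex graph (not taken from Figure 2).
WEIGHTS = {(1, 5): 1, (2, 4): 2, (2, 3): 2, (3, 4): 2, (1, 2): 3, (4, 5): 4}
VERTICES = {1, 2, 3, 4, 5}


def prim(vertices, weights, start):
    """Return the tree edges chosen by Prim's algorithm, in order.

    `weights` maps an undirected edge (a, b) to its weight W >= 0.
    """
    all_vertices = set(vertices)
    B = {start}   # tree vertices
    T = []        # tree edges
    while B != all_vertices:
        # Edges with exactly one endpoint in B.
        crossing = [e for e in weights if (e[0] in B) != (e[1] in B)]
        if not crossing:
            break  # G is not connected: only start's component is spanned
        e = min(crossing, key=lambda edge: weights[edge])
        T.append(e)
        B |= set(e)  # adds the endpoint that was outside B
    return T


T = prim(VERTICES, WEIGHTS, 1)
print(T, sum(WEIGHTS[e] for e in T))
```

When several crossing edges tie for minimum weight, this sketch takes whichever comes first in the dictionary, so the tree it returns may differ from a hand trace while having the same total weight.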

Example. If G is the graph in Figure 2, then initially choose v = vertex 1. We get the following steps:

Iteration   B           V - B        edge chosen   T
1           {1}         {2,3,4,5}    (1,5)         {(1,5)}
2           {1,5}       {2,3,4}      (1,2)         {(1,5),(1,2)}
3           {1,2,5}     {3,4}        (2,3)         {(1,5),(1,2),(2,3)}
4           {1,2,3,5}   {4}          (3,4)         {(1,5),(1,2),(2,3),(3,4)}

Note that at step 3 we could have chosen to add edge (2,4) instead of (2,3).

At step 4 we could have chosen to add edge (2,4) instead of (3,4).

Now it is fairly easy to see that the algorithm finds a spanning tree for G. But is it a minimal spanning tree? To see that it is, we need the following lemma.

Lemma. Suppose E1 is a subset of E with the property that E1 is a subset of the edges in a minimal spanning tree T for G. Let V1 be the set of vertices incident with edges in E1. Let (u,v) be an edge of minimal weight with the property that u is in V - V1 and v is in V1. Then E1 union {(u,v)} is also a subset of a minimal spanning tree (Figure 3).

Proof. If the edge (u,v) is in the minimal spanning tree T, then we are done. If (u,v) is not in T, on the other hand, then there is a path from u to v in T. Since u is not in V1 and v is in V1, some edge (x,y) on this path has exactly one vertex in V1. Call this vertex x (Figure 4). The edge (x,y) is not in E1, because both endpoints of every edge in E1 lie in V1. Let T1 be T with edge (x,y) removed and edge (u,v) added. Then E1 union {(u,v)} is contained in T1, and T1 is a spanning tree: removing (x,y) splits T into two pieces with u in one and v in the other, and (u,v) joins them again. Now (x,y) also has exactly one vertex in V1, so by the choice of (u,v) we know that the weight of (u,v) is less than or equal to the weight of (x,y). Therefore weight(T1) = weight(T) - W(x,y) + W(u,v) <= weight(T), and since T is a minimal spanning tree, T1 is a minimal spanning tree for G as well.

21.4. Kruskal's Algorithm. Another algorithm for finding a minimum spanning tree uses the set data structure. Let G be a connected graph with n vertices and nonnegative edge weights.

Initialize n components, each one containing one vertex of G.

Now sort the edges in increasing order by weight and set T = the empty set.

Now examine each edge in turn. If an edge joins two components, add it to T and merge the two components into one. If not, discard the edge.

Stop when only one component remains.
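The procedure above can be sketched with a simple union-find structure standing in for the components. The edge weights are assumptions (Figure 2 is not reproduced here), and the names kruskal, find, and parent are illustrative rather than taken from the lecture.

```python
# Assumed weights for a 5-vertex graph (not taken from Figure 2).
WEIGHTS = {(1, 5): 1, (2, 4): 2, (2, 3): 2, (3, 4): 2, (1, 2): 3, (4, 5): 4}
VERTICES = {1, 2, 3, 4, 5}


def kruskal(vertices, weights):
    """Return the tree edges chosen by Kruskal's algorithm, in order."""
    # Each vertex starts as the representative of its own component.
    parent = {v: v for v in vertices}

    def find(v):
        """Representative of v's component (with path halving)."""
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    T = []
    components = len(parent)
    # Examine edges in increasing order of weight.
    for e in sorted(weights, key=weights.get):
        ru, rw = find(e[0]), find(e[1])
        if ru != rw:            # e joins two different components
            parent[ru] = rw     # merge them
            T.append(e)
            components -= 1
            if components == 1:
                break           # only one component remains
        # otherwise both endpoints are already connected: discard e
    return T


print(kruskal(VERTICES, WEIGHTS))
```

Python's sort is stable, so edges of equal weight are examined in the order they appear in the dictionary; a different tie order could produce a different tree of the same total weight.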

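Example. Consider the graph in Figure 2.

Sorted edges: (1,5), (2,4), (2,3), (3,4), (1,2), (4,5)

Step   Components              add     T
1      {1},{2},{3},{4},{5}     (1,5)   (1,5)
2      {1,5},{2},{3},{4}       (2,4)   (1,5),(2,4)
3      {1,5},{2,4},{3}         (2,3)   (1,5),(2,4),(2,3)
4      {1,5},{2,3,4}           (1,2)   (1,5),(2,4),(2,3),(1,2)

Between steps 3 and 4 the edge (3,4) is examined and discarded, since both of its vertices are already in the component {2,3,4}. After step 4 only one component remains, so the algorithm stops.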