Lecture 22--MST. Time and space usage. 11/17/97.

================================================================

22.1. Preliminaries. In this lecture we will estimate the time and space usage for each of the MST algorithms described in Lecture 21. It turns out that the most efficient implementations of both algorithms are based on the adjacency list representation of a graph. Estimating the time and space needed when the graph is represented by an adjacency matrix is left to the homework (Exercise 22.7).

We assume that we are given the standard adjacency list representation of a weighted graph G = (V, E, W), where V = the vertex set of G, |V| = n, E = the edge set of G, |E| = m, and W is a weight function defined on the edge set E, i.e., for each e in E, W(e) represents the weight of e. Thus we assume that each edge structure e contains a field to hold W(e). In addition, we will assume in the following that several values are contained in each vertex structure v. These extra fields increase the memory needed only by small constant multiples.

22.2. Prim's algorithm.

Let us assume the vertices are labeled 0 to n-1. We also define a boolean value stop which becomes true when no more tree edges can be added. If this happens before the number of tree edges is n-1, then we know that G is not connected.

We keep the edges which are candidates for tree inclusion in a priority queue Q, ordered by the weight of the edges. This allows us to select the minimal edge to be added next without searching an edge list.

Detailed pseudocode:

(a) for v = 0 to n-1
        set parent(v) = -1          \\ the tree end of the edge connecting v to the tree
        set weight(v) = maxint      \\ a large initial weight which will be replaced by an actual edge weight
        set can_reach(v) = false    \\ becomes true when there is an edge from a tree vertex to v
        set in_tree(v) = false      \\ becomes true when v is added to the tree

(b) set stop = false; set edge_count = 0;
    set tree = the empty set; set Q = the empty queue;
    set weight(0) = 0; set v = 0;   \\ start with vertex 0, an arbitrary choice
    set can_reach(0) = true; set in_tree(0) = true;

(c) while stop = false and edge_count < n-1 do
        for y in the adjacency list of v
            if in_tree(y) = false then
                if can_reach(y) = false then
                    set can_reach(y) = true
                    set weight(y) = W((v,y))
                    set parent(y) = v
                    insert y in Q
                else
                    if W((v,y)) < weight(y) then
                        set weight(y) = W((v,y))
                        set parent(y) = v
                        insert y in Q (again)

(d)     while Q is not empty and in_tree(the first member of Q) = true do
            remove the first member of Q    \\ an old copy of a vertex already in the tree
        if Q is empty then stop = true
        else
            set v = the first member of Q
            remove v from Q
            set in_tree(v) = true
            add (parent(v),v) to the tree
            increment edge_count

Space: clearly the space used is Θ(n + m).

Time: clearly step (a) takes time Θ(n) and step (b) takes time Θ(1). Each edge is examined at most twice in (c), once from each endpoint, so at most 2m insertions into Q are made. Q therefore never holds more than 2m elements, and the time to insert or remove an element from Q will be O(log_2 m). So step (d), taken over all iterations of the while loop, will take no more time than O(m log_2 m). Now in (c) the body of the for loop is executed once for each edge incident with the current vertex v, and each vertex becomes the current vertex at most once, so over the whole run the body is executed at most 2m times. So (c) takes time O(n + m log_2 m), which includes the insertion times into Q. Adding up the contribution from each of steps (a)--(d) gives a total time of O(n + m log_2 m).
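
The steps of Prim's algorithm can be sketched in Python roughly as follows. This is only an illustration under some assumptions that are not in the lecture: the graph is given as a list adj where adj[v] holds (y, w) pairs, the standard heapq module plays the role of Q, and the names prim_mst and in_tree are made up for this sketch. Old copies of a vertex are left in the heap and skipped when they reach the front.

```python
import heapq

def prim_mst(adj, start=0):
    """adj[v] is a list of (y, w) pairs. Returns (tree_edges, connected)."""
    n = len(adj)
    parent = [-1] * n
    weight = [float("inf")] * n      # plays the role of maxint
    in_tree = [False] * n            # assumption: explicit tree-membership flag
    tree = []
    weight[start] = 0
    in_tree[start] = True
    queue = []                       # heap of (weight, vertex); may hold old copies
    v = start
    while len(tree) < n - 1:
        # step (c): examine every edge leaving the current vertex
        for y, w in adj[v]:
            if not in_tree[y] and w < weight[y]:
                weight[y] = w
                parent[y] = v
                heapq.heappush(queue, (w, y))
        # step (d): skip old copies of vertices that are already in the tree
        while queue and in_tree[queue[0][1]]:
            heapq.heappop(queue)
        if not queue:
            return tree, False       # no more edges can be added: G is not connected
        w, v = heapq.heappop(queue)
        in_tree[v] = True
        tree.append((parent[v], v, w))
    return tree, True
```

For example, prim_mst([[(1, 1), (2, 4)], [(0, 1), (2, 2), (3, 5)], [(0, 4), (1, 2), (3, 3)], [(1, 5), (2, 3)]]) builds a tree from the edges of weight 1, 2 and 3. Each edge causes at most two pushes, which matches the O(m log_2 m) bound on the queue operations.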

22.3. Kruskal's algorithm. The space usage for Kruskal's algorithm is similar to the space usage for Prim's algorithm, i.e., Θ(n + m).

To implement Kruskal's algorithm with an efficient running time, we need to be able to do two set operations: FIND(v), which finds which component v is in, and UNION(x,y), which combines two components into one. If we represent a component of vertices by a tree (NOT a binary tree) labeled by its root, then FIND(v) returns the root of the tree v is in. This can be done in time O(n), since the tree may have height n-1. Similarly, UNION(x,y) can be accomplished by making, for example, the root x point to the root y, thus merging two trees into one. This can be done in constant time. So the total amount of time it takes to do any combination of n unions and finds is O(n^2). If we initially sort the edges in G, which takes time Θ(m log_2 m), and then always take the next smallest edge and combine components by FIND and UNION operations, Kruskal's algorithm does up to 2m FIND operations, each taking time O(n). It will therefore take time O(mn + m log_2 m), which is worse than Prim's algorithm.
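
To see why FIND can cost time proportional to n in this simple representation, here is a small Python sketch. It assumes the sets are stored in a list parent in which a root points to itself; the names find, union and depth are chosen for this illustration and are not from the lecture.

```python
def find(parent, v):
    """Follow parent pointers from v up to the root of its tree."""
    while parent[v] != v:
        v = parent[v]
    return v

def union(parent, x, y):
    """Make root x point to root y. Both x and y must be roots."""
    parent[x] = y

def depth(parent, v):
    """Number of pointers followed by FIND(v)."""
    steps = 0
    while parent[v] != v:
        v = parent[v]
        steps += 1
    return steps

# A bad sequence of unions: the current root is always made to point to a
# new single-node tree, which produces the chain 0 -> 1 -> ... -> n-1.
n = 8
parent = list(range(n))
for i in range(n - 1):
    union(parent, i, i + 1)
```

After these n-1 unions the tree has height n-1, so depth(parent, 0) is n-1 and a single FIND(0) already follows n-1 pointers.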

So to make this algorithm competitive with Prim's we need to reduce the time to implement the UNION and FIND operations. We can do this by being more careful about how the trees representing the sets are constructed. If we can control the tree heights better, then we can get a faster running time for the algorithm. To see that this is possible, we prove the following theorem.

Theorem. Suppose we have any program which implements a sequence of n UNION / FIND operations (i.e., any combination of these two operations). If the tree structure is used to represent the sets, and if the union is implemented by always making the tree with fewer nodes point to the tree with more nodes, then this program can be made to run in time O(n log_2 n).

The proof of this theorem depends on the following lemma.

Lemma. Suppose that in each tree we keep track, at the root, of the number of nodes in the tree. Suppose also that whenever we do a UNION we make the root of the tree with fewer nodes point to the root of the tree containing more nodes. Then any tree constructed this way and containing k nodes will have height <= |_ log_2 k _|.

Proof. We prove this by induction.

Step 1. If k = 1, the tree will have height 0, so the claim is true.

Step 2. Assume that if k' < k then any tree with k' nodes constructed as above has height <= |_ log_2 k' _|. A tree with k > 1 nodes is formed by a UNION of two smaller trees. Suppose T1 has k1 nodes and height h1 and T2 has k2 nodes and height h2, where k2 <= k1 and k1 + k2 = k, so that the root of T2 is made to point to the root of T1. Then by the induction hypothesis we have

h1 <= |_ log_2 k1 _| , h2 <= |_ log_2 k2 _|.

The new tree has height max(h1, h2 + 1), so there are two cases: either h1 > h2 or h1 <= h2.

If h1 > h2, then the new tree will have height h1.

But h1 <= |_ log_2 k1 _| <= |_ log_2 (k1 + k2) _| = |_ log_2 k _|.

If h1 <= h2, then the new tree will have height h2 + 1. Since k2 <= k1, we have 2k2 <= k1 + k2, and so

h2 + 1 <= |_ log_2 k2 _| + 1 = |_ log_2 2k2 _| <= |_ log_2 (k1 + k2) _| = |_ log_2 k _|.

Step 3. Since Steps 1 and 2 have been verified, by induction, we have proved the lemma as desired.
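
The lemma can also be checked experimentally. The following Python sketch is an assumption-laden illustration, not part of the lecture: it stores the node count in a list size indexed by root, performs random unions by size, and tests after every union that each node's depth is at most |_ log_2 k _| for the size k of its tree. The names union_by_size and check_height_bound are hypothetical.

```python
import random

def find(parent, v):
    """Follow parent pointers from v up to the root of its tree."""
    while parent[v] != v:
        v = parent[v]
    return v

def union_by_size(parent, size, x, y):
    """x and y are distinct roots. The smaller tree points to the larger."""
    if size[x] < size[y]:
        x, y = y, x
    parent[y] = x
    size[x] += size[y]
    return x

def depth(parent, v):
    """Number of pointers between v and its root."""
    d = 0
    while parent[v] != v:
        v = parent[v]
        d += 1
    return d

def check_height_bound(n, seed=0):
    """Do random unions on n single-node sets and test the lemma's bound."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n
    for _ in range(4 * n):
        x = find(parent, rng.randrange(n))
        y = find(parent, rng.randrange(n))
        if x != y:
            union_by_size(parent, size, x, y)
        for v in range(n):
            k = size[find(parent, v)]
            # k.bit_length() - 1 equals floor(log2 k) for k >= 1
            if depth(parent, v) > k.bit_length() - 1:
                return False
    return True
```

A call such as check_height_bound(64) should return True for any seed, since the bound is exactly what the lemma proves. A passing run is of course only a sanity check, not a substitute for the induction.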

Proof of theorem. The program will do at most n UNION operations, each of which takes a constant amount of time. It will also do at most n FIND operations. Let us assume that initially each set contains a maximum of d elements, where d is a constant. Since n UNION operations can combine at most n+1 of the initial sets, during the running of the program no set will contain more than d(n+1) elements. By the lemma, no tree will have height greater than log_2 (d(n+1)), so no FIND operation will take more time than O(log_2 (d(n+1))) = O(log_2 d + log_2 (n+1)) = O(log_2 n). So the time to do the FIND operations will be O(n log_2 n) and the total running time of the program will thus be O(n + n log_2 n) = O(n log_2 n).

Theorem. Kruskal's algorithm, as implemented above, will take time O(n + m log_2 m), where n = |V| and m = |E|.

Proof. Initialization takes time O(n). Sorting the edges takes time O(m log_2 m). Then the loop which builds the tree will be executed at most m times. Each execution does a constant number of UNION / FIND operations, so the loop does at most a constant times m such operations. The size of each set is at most m+1, since each set contains only vertices connected by the edges examined so far, so by the lemma each FIND takes time O(log_2 m). So this loop will take time O(m log_2 m), and the total time is O(n + m log_2 m).
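
Putting the pieces together, Kruskal's algorithm with union by size might look like the Python sketch below. The input format (a list of (w, x, y) triples) and the name kruskal_mst are assumptions made for this example only.

```python
def kruskal_mst(n, edges):
    """edges is a list of (w, x, y) triples. Returns the tree edges as (x, y, w)."""
    parent = list(range(n))          # initialization: n single-node sets, O(n)
    size = [1] * n

    def find(v):
        # FIND: walk up to the root; O(log n) by the lemma
        while parent[v] != v:
            v = parent[v]
        return v

    tree = []
    for w, x, y in sorted(edges):    # sorting takes O(m log m)
        rx, ry = find(x), find(y)
        if rx != ry:                 # the edge joins two different components
            if size[rx] < size[ry]:
                rx, ry = ry, rx
            parent[ry] = rx          # UNION: smaller tree points to larger
            size[rx] += size[ry]
            tree.append((x, y, w))
            if len(tree) == n - 1:
                break
    return tree
```

For example, kruskal_mst(4, [(1, 0, 1), (2, 1, 2), (4, 0, 2), (3, 2, 3), (5, 1, 3)]) selects the three edges of weight 1, 2 and 3. If fewer than n-1 edges are returned, the graph is not connected.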

Exercise 22.1. Show how the graph G, in Figure 5, is represented:

a. by adjacency matrix

b. by adjacency lists

Exercise 22.2. Show the steps in executing Prim's algorithm for G, starting from vertex 5.

Exercise 22.3. Show the steps in executing Kruskal's algorithm for G. Show the tree structure for the sets constructed at each step.

Exercise 22.4. It is possible for the weighting function on the graph edges to have negative values. Would this affect the correctness of either Prim's algorithm or Kruskal's algorithm?

Kruskal's algorithm. An improvement. By applying a technique known as path compression, the FIND operation can be implemented so that an n-step UNION-FIND program finishes in time O(nG(n)), where G is the "inverse" of a very rapidly growing function H (a simplified relative of Ackermann's function), defined as

H(0) = 1

H(i) = 2^H(i-1) , i > 0.

(G(j) is then defined to be the smallest i such that H(i) > j).
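
Path compression itself is short to state. The Python sketch below shows one common two-pass form, given here as an assumption since the lecture does not spell out the details: the first pass locates the root, and the second pass makes every node on the path point directly to it. The name find_compress is hypothetical.

```python
def find_compress(parent, v):
    """FIND with path compression on a parent list (a root points to itself)."""
    # First pass: locate the root.
    root = v
    while parent[root] != root:
        root = parent[root]
    # Second pass: make every node on the path point straight at the root.
    while v != root:
        next_v = parent[v]
        parent[v] = root
        v = next_v
    return root

# A chain 0 -> 1 -> 2 -> 3 -> 4 is flattened by a single FIND(0).
parent = [1, 2, 3, 4, 4]
root = find_compress(parent, 0)
```

After the call, root is 4 and parent is [4, 4, 4, 4, 4], so later FIND operations on these nodes follow at most one pointer.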

Exercise 22.5. Compute H(i) for i <= 5. Compute G(j) for j <= 65536.

Exercise 22.6. Under what circumstances would each of the MST algorithms given here be preferred?

Exercise 22.7. What would be the estimated space usage and running time for Kruskal's algorithm and Prim's algorithm if an adjacency matrix is used to represent G?