The Algorithms Of Kruskal And Prim

The two minimum-spanning-tree algorithms described in this section are elaborations of the generic algorithm. Each uses a specific rule to determine a safe edge in line 3 of GENERIC-MST. In Kruskal's algorithm, the set A is a forest. The safe edge added to A is always a least-weight edge in the graph that connects two distinct components. In Prim's algorithm, the set A forms a single tree. The safe edge added to A is always a least-weight edge connecting the tree to a vertex not in the tree.

Kruskal's algorithm

Kruskal's algorithm is based directly on the generic minimum-spanning-tree algorithm given in the section "Growing a minimum spanning tree". It finds a safe edge to add to the growing forest by finding, of all the edges that connect any two trees in the forest, an edge (u, v) of least weight. Let C₁ and C₂ denote the two trees that are connected by (u, v). Since (u, v) must be a light edge connecting C₁ to some other tree, it is a safe edge for C₁. Kruskal's algorithm is a greedy algorithm, because at each step it adds to the forest an edge of least possible weight.

Our implementation of Kruskal's algorithm resembles the algorithm for computing connected components from the section "Disjoint-set operations". It uses a disjoint-set data structure to maintain several disjoint sets of elements. Each set contains the vertices in a tree of the current forest. The operation FIND-SET(u) returns a representative element from the set that contains u. Thus, we can determine whether two vertices u and v belong to the same tree by testing whether FIND-SET(u) equals FIND-SET(v). The UNION procedure combines two trees.
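The three operations named above can be sketched in Python as follows. This is a minimal illustration, assuming a disjoint-set forest with union by rank and path compression; the class name DisjointSet and its dict-based layout are choices made for this example and are not taken from the text.

```python
class DisjointSet:
    """Illustrative disjoint-set forest (class name and layout are assumptions)."""

    def __init__(self):
        self.parent = {}  # parent pointer of each element
        self.rank = {}    # upper bound on the height of each root's tree

    def make_set(self, x):
        # MAKE-SET: x starts as the only member (and root) of its own set.
        self.parent[x] = x
        self.rank[x] = 0

    def find_set(self, x):
        # FIND-SET with path compression: every node on the path to the
        # root is re-pointed directly at the root.
        if self.parent[x] != x:
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        # UNION by rank: hang the root of lower rank under the other root.
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1


# Usage: four singleton sets, then merge {a, b} and {c, d}.
ds = DisjointSet()
for v in "abcd":
    ds.make_set(v)
ds.union("a", "b")
ds.union("c", "d")
same_ab = ds.find_set("a") == ds.find_set("b")  # a and b share a tree
same_ac = ds.find_set("a") == ds.find_set("c")  # a and c do not
```

Comparing representatives, as in the last two lines, is exactly the same-tree test that Kruskal's algorithm relies on.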

	MST-KRUSKAL(G, w)
1  A ← Ø
2  for each vertex v ∈ V[G]
3       do MAKE-SET(v)
4  sort the edges of E into nondecreasing order by weight w
5  for each edge (u, v) ∈ E, taken in nondecreasing order by weight
6       do if FIND-SET(u) ≠ FIND-SET(v)
7             then A ← A ∪ {(u, v)}
8                  UNION(u, v)
9  return A

Kruskal's algorithm works as shown in Figure 23.4. Lines 1-3 initialize the set A to the empty set and create |V| trees, one containing each vertex. The edges in E are sorted into nondecreasing order by weight in line 4. The for loop in lines 5-8 checks, for each edge (u, v), whether the endpoints u and v belong to the same tree. If they do, then the edge (u, v) cannot be added to the forest without creating a cycle, and the edge is discarded. Otherwise, the two vertices belong to different trees. In this case, the edge (u, v) is added to A in line 7, and the vertices in the two trees are merged in line 8.
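The steps just described can be sketched in Python. This is one possible rendering of MST-KRUSKAL, not a definitive implementation: the function name mst_kruskal, the (weight, u, v) edge tuples, and the four-vertex example graph are all assumptions made for illustration (the example is not the graph of Figure 23.1). The disjoint-set helpers are inlined so that the block stands alone.

```python
def mst_kruskal(vertices, edges):
    """Return the MST edges as (u, v, weight) tuples.

    `edges` is an iterable of (weight, u, v) tuples (an assumed format).
    """
    # Lines 1-3: A is empty, and each vertex is its own tree (MAKE-SET).
    a = []
    parent = {v: v for v in vertices}
    rank = {v: 0 for v in vertices}

    def find_set(x):
        # First pass: locate the root. Second pass: path compression.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(x, y):
        # Union by rank on the two roots.
        rx, ry = find_set(x), find_set(y)
        if rx == ry:
            return
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    # Lines 4-8: scan the edges in nondecreasing order by weight.
    for w, u, v in sorted(edges, key=lambda e: e[0]):
        if find_set(u) != find_set(v):  # endpoints lie in different trees
            a.append((u, v, w))         # line 7: add the edge to A
            union(u, v)                 # line 8: merge the two trees
        # otherwise the edge would close a cycle and is discarded
    return a


# Hypothetical example graph, chosen only for this sketch.
example_vertices = ["a", "b", "c", "d"]
example_edges = [
    (1, "a", "b"),
    (4, "a", "c"),
    (3, "b", "c"),
    (2, "c", "d"),
    (5, "b", "d"),
]
mst = mst_kruskal(example_vertices, example_edges)
total_weight = sum(w for _, _, w in mst)
```

On this example the edges (a, b), (c, d), and (b, c) are accepted in that order, while (a, c) and (b, d) are discarded because their endpoints already lie in the same tree.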

Figure 23.4: The execution of Kruskal's algorithm on the graph from Figure 23.1. Shaded edges belong to the forest A being grown. The edges are considered by the algorithm in sorted order by weight. An arrow points to the edge under consideration at each step of the algorithm. If the edge joins two distinct trees in the forest, it is added to the forest, thereby merging the two trees.

The running time of Kruskal's algorithm for a graph G = (V, E) depends on the implementation of the disjoint-set data structure. We shall assume the disjoint-set-forest implementation with the union-by-rank and path-compression heuristics, since it is the asymptotically fastest implementation known. Initializing the set A in line 1 takes O(1) time, and the time to sort the edges in line 4 is O(E lg E). (We will account for the cost of the |V| MAKE-SET operations in the for loop of lines 2-3 in a moment.) The for loop of lines 5-8 performs O(E) FIND-SET and UNION operations on the disjoint-set forest. Along with the |V| MAKE-SET operations, these take a total of O((V + E) α(V)) time, where α is the very slowly growing function defined in the section "Analysis of union by rank with path compression". Because G is assumed to be connected, we have |E| ≥ |V| - 1, and so the disjoint-set operations take O(E α(V)) time. Moreover, since α(|V|) = O(lg V) = O(lg E), the total running time of Kruskal's algorithm is O(E lg E). Observing that |E| < |V|², we have lg |E| = O(lg V), and so we can restate the running time of Kruskal's algorithm as O(E lg V).
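The chain of bounds in this paragraph can be collected into a single derivation. The name T(V, E) for the total running time is introduced here only for illustration; following the text, V and E stand for |V| and |E| inside asymptotic notation.

```latex
\begin{align*}
T(V, E) &= \underbrace{O(1)}_{\text{line 1}}
         + \underbrace{O(E \lg E)}_{\text{line 4}}
         + \underbrace{O((V + E)\,\alpha(V))}_{\text{lines 2--3 and 5--8}} \\
        &= O(E \lg E) + O(E\,\alpha(V))
           && \text{since } |E| \ge |V| - 1 \\
        &= O(E \lg E)
           && \text{since } \alpha(|V|) = O(\lg V) = O(\lg E) \\
        &= O(E \lg V)
           && \text{since } |E| < |V|^2 \text{ gives } \lg |E| < 2 \lg |V|
\end{align*}
```

The sorting step in line 4 dominates: the disjoint-set operations contribute only O(E α(V)), which is absorbed by the O(E lg E) sorting cost.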