Elements Of Dynamic Programming

Subtleties

One should be careful not to assume that optimal substructure applies when it does not. Consider the following two problems in which we are given a directed graph G = (V, E) and vertices u, v ∈ V.

Unweighted shortest path: Find a path from u to v consisting of the fewest edges. Such a path must be simple, since removing a cycle from a path produces a path with fewer edges.
Unweighted longest simple path: Find a simple path from u to v consisting of the most edges. We need to include the requirement of simplicity because otherwise we can traverse a cycle as many times as we like to create paths with an arbitrarily large number of edges.

The unweighted shortest-path problem exhibits optimal substructure, as follows. Suppose that u ≠ v, so that the problem is nontrivial. Then any path p from u to v must contain an intermediate vertex, say w. (Note that w may be u or v.) Thus we can decompose the path into subpaths . Clearly, the number of edges in p is equal to the sum of the number of edges in p₁ and the number of edges in p₂. We claim that if p is an optimal (i.e., shortest) path from u to v, then p₁ must be a shortest path from u to w. Why? We use a "cut-and-paste" argument: if there were another path, say p'₁, from u to w with fewer edges than p₁, then we could cut out p₁ and paste in p'₁ to produce a path with fewer edges than p, thus contradicting p's optimality. Symmetrically, p₂ must be a shortest path from w to v. Thus, we can find a shortest path from u to v by considering all intermediate vertices w, finding a shortest path from u to w and a shortest path from w to v, and choosing an intermediate vertex w that yields the overall shortest path. In Section 25.2, we use a variant of this observation of optimal substructure to find a shortest path between every pair of vertices on a weighted, directed graph.

It is tempting to assume that the problem of finding an unweighted longest simple path exhibits optimal substructure as well. After all, if we decompose a longest simple path into subpaths , then mustn't p₁ be a longest simple path from u to w, and mustn't p₂ be a longest simple path from w to v? The answer is no! Figure 15.4 gives an example. Consider the path q → r → t, which is a longest simple path from q to t. Is q → r a longest simple path from q to r? No, for the path q → s → t → r is a simple path that is longer. Is r → t a longest simple path from r to t? No again, for the path r → q → s → t is a simple path that is longer.

Figure 15.4: A directed graph showing that the problem of finding a longest simple path in an unweighted directed graph does not have optimal substructure. The path q → r → t is a longest simple path from q to t, but the subpath q → r is not a longest simple path from q to r, nor is the subpath r → t a longest simple path from r to t.

This example shows that for longest simple paths, not only is optimal substructure lacking, but we cannot necessarily assemble a "legal" solution to the problem from solutions to subproblems. If we combine the longest simple paths q → s → t → r and r → q → s → t, we get the path q → s → t → r → q → s → t, which is not simple. Indeed, the problem of finding an unweighted longest simple path does not appear to have any sort of optimal substructure. No efficient dynamic-programming algorithm for this problem has ever been found. In fact, this problem is NP-complete.

What is it about the substructure of a longest simple path that is so different from that of a shortest path? Although two subproblems are used in a solution to a problem for both longest and shortest paths, the subproblems in finding the longest simple path are not independent, whereas for shortest paths they are. What do we mean by subproblems being independent? We mean that the solution to one subproblem does not affect the solution to another subproblem of the same problem. For the example of Figure 15.4, we have the problem of finding a longest simple path from q to t with two subproblems: finding longest simple paths from q to r and from r to t. For the first of these subproblems, we choose the path q → s → t → r, and so we have also used the vertices s and t. We can no longer use these vertices in the second subproblem, since the combination of the two solutions to subproblems would yield a path that is not simple. If we cannot use vertex t in the second problem, then we cannot solve it all, since t is required to be on the path that we find, and it is not the vertex at which we are "splicing" together the subproblem solutions (that vertex being r). Our use of vertices s and t in one subproblem solution prevents them from being used in the other subproblem solution. We must use at least one of them to solve the other subproblem, however, and we must use both of them to solve it optimally. Thus we say that these subproblems are not independent. Looked at another way, our use of resources in solving one subproblem (those resources being vertices) has rendered them unavailable for the other subproblem.

Why, then, are the subproblems independent for finding a shortest path? The answer is that by nature, the subproblems do not share resources. We claim that if a vertex w is on a shortest path p from u to v, then we can splice together any shortest path and any shortest path to produce a shortest path from u to v. We are assured that, other than w, no vertex can appear in both paths p₁ and p₂. Why? Suppose that some vertex x ≠ w appears in both p₁ and p₂, so that we can decompose p₁ as and p₂ as . By the optimal substructure of this problem, path p has as many edges as p₁ and p₂ together; let's say that p has e edges. Now let us construct a path from u to v. This path has at most e - 2 edges, which contradicts the assumption that p is a shortest path. Thus, we are assured that the subproblems for the shortest-path problem are independent.

In matrix-chain multiplication, the subproblems are multiplying subchains A_iA_{i 1} A_k and A_{k 1}A_{k 2} A_j. These subchains are disjoint, so that no matrix could possibly be included in both of them. In assembly-line scheduling, to determine the fastest way through station S_i,j, we look at the fastest ways through stations S_1,j-1 and S_2,j-1. Because our solution to the fastest way through station Si, j will include just one of these subproblem solutions, that subproblem is automatically independent of all others used in the solution.