Optimal Binary Search Trees

node	depth	probability	contribution

k₁	1	0.15	0.30
k₂	0	0.10	0.10
k₃	2	0.05	0.15
k₄	1	0.10	0.20
k₅	2	0.20	0.60
d₀	2	0.05	0.15
d₁	2	0.10	0.30
d₂	3	0.05	0.20
d₃	3	0.05	0.20
d₄	3	0.05	0.20
d₅	3	0.10	0.40

Total			2.80
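
As a quick check of the table, the total can be recomputed from the depth and probability columns. The sketch below assumes that a node at depth d costs d + 1 to reach, which is what the contribution column reflects (for example, 0.15 × 2 = 0.30 for k₁). The names `rows` and `expected_cost` are illustrative, not from the text.

```python
# Rows of the table: (node, depth, probability).
rows = [
    ("k1", 1, 0.15), ("k2", 0, 0.10), ("k3", 2, 0.05),
    ("k4", 1, 0.10), ("k5", 2, 0.20),
    ("d0", 2, 0.05), ("d1", 2, 0.10), ("d2", 3, 0.05),
    ("d3", 3, 0.05), ("d4", 3, 0.05), ("d5", 3, 0.10),
]

def expected_cost(rows):
    # A search ending at depth d examines d + 1 nodes, so each node
    # contributes (depth + 1) * probability to the expected cost.
    return sum((depth + 1) * prob for _, depth, prob in rows)

total = expected_cost(rows)
print(f"{total:.2f}")  # prints 2.80
```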

For a given set of probabilities, our goal is to construct a binary search tree whose expected search cost is smallest. We call such a tree an optimal binary search tree. Figure 15.7(b) shows an optimal binary search tree for the probabilities given in the figure caption; its expected cost is 2.75. This example shows that an optimal binary search tree is not necessarily a tree whose overall height is smallest. Nor can we necessarily construct an optimal binary search tree by always putting the key with the greatest probability at the root. Here, key k₅ has the greatest search probability of any key, yet the root of the optimal binary search tree shown is k₂. (The lowest expected cost of any binary search tree with k₅ at the root is 2.85.)

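As with matrix-chain multiplication, exhaustive checking of all possibilities fails to yield an efficient algorithm. We can label the nodes of any n-node binary tree with the keys k₁, k₂, ..., k_n to construct a binary search tree, and then add in the dummy keys as leaves. Every n-node binary tree therefore yields a candidate, and the number of such trees grows exponentially with n, so an exhaustive search would have to examine exponentially many binary search trees.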
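
To get a feel for how many candidates an exhaustive search would face, one can count binary tree shapes directly. The sketch below relies on a standard fact that the text above does not state: the number of distinct n-node binary trees is the n-th Catalan number, C(2n, n)/(n + 1). The function name `num_binary_trees` is illustrative.

```python
from math import comb

def num_binary_trees(n):
    # n-th Catalan number: the count of distinct binary tree shapes
    # with n nodes. The division is always exact.
    return comb(2 * n, n) // (n + 1)

for n in (5, 10, 20):
    print(n, num_binary_trees(n))
# prints:
# 5 42
# 10 16796
# 20 6564120420
```

Even at n = 20 there are already billions of shapes to label and evaluate.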

Step 1: The structure of an optimal binary search tree

To characterize the optimal substructure of optimal binary search trees, we start with an observation about subtrees. Consider any subtree of a binary search tree. It must contain keys in a contiguous range k_i, ..., k_j, for some 1 ≤ i ≤ j ≤ n. In addition, a subtree that contains keys k_i, ..., k_j must also have as its leaves the dummy keys d_{i-1}, ..., d_j.
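
This observation can be checked mechanically on a small example. The sketch below encodes the tree described by the depth table at the top of this section (k₂ at the root, with k₁ and k₄ as its children; this is the only binary search tree shape consistent with those depths) and verifies, for every subtree, that its keys form a contiguous range and that its leaves are exactly the matching dummy keys. The tuple encoding and all helper names are illustrative assumptions, not notation from the text.

```python
# A node is ("d", i) for dummy key d_i, or ("k", i, left, right) for key k_i.
def d(i):
    return ("d", i)

def k(i, left, right):
    return ("k", i, left, right)

# Tree reconstructed from the depth table: k2 at depth 0, k1 and k4 at depth 1.
tree = k(2,
         k(1, d(0), d(1)),
         k(4, k(3, d(2), d(3)), k(5, d(4), d(5))))

def contents(node):
    """Return (key indices, dummy indices) of a subtree, in in-order sequence."""
    if node[0] == "d":
        return [], [node[1]]
    _, i, left, right = node
    left_keys, left_dummies = contents(left)
    right_keys, right_dummies = contents(right)
    return left_keys + [i] + right_keys, left_dummies + right_dummies

def subtrees(node):
    """Yield every subtree, including the dummy leaves."""
    yield node
    if node[0] == "k":
        yield from subtrees(node[2])
        yield from subtrees(node[3])

def check(node):
    keys, dummies = contents(node)
    if not keys:
        return True  # a lone dummy leaf holds no actual keys
    i, j = keys[0], keys[-1]
    # Keys must be k_i..k_j and leaves must be d_{i-1}..d_j.
    return keys == list(range(i, j + 1)) and dummies == list(range(i - 1, j + 1))

print(all(check(s) for s in subtrees(tree)))
```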

Now we can state the optimal substructure: if an optimal binary search tree T has a subtree T′ containing keys k_i, ..., k_j, then this subtree T′ must be optimal as well for the subproblem with keys k_i, ..., k_j and dummy keys d_{i-1}, ..., d_j. The usual cut-and-paste argument applies. If there were a subtree T″ whose expected cost is lower than that of T′, then we could cut T′ out of T and paste in T″, resulting in a binary search tree of lower expected cost than T, thus contradicting the optimality of T.

We need to use the optimal substructure to show that we can construct an optimal solution to the problem from optimal solutions to subproblems. Given keys k_i, ..., k_j, one of these keys, say k_r (i ≤ r ≤ j), will be the root of an optimal subtree containing these keys. The left subtree of the root k_r will contain the keys k_i, ..., k_{r-1} (and dummy keys d_{i-1}, ..., d_{r-1}), and the right subtree will contain the keys k_{r+1}, ..., k_j (and dummy keys d_r, ..., d_j). As long as we examine all candidate roots k_r, where i ≤ r ≤ j, and we determine all optimal binary search trees containing k_i, ..., k_{r-1} and those containing k_{r+1}, ..., k_j, we are guaranteed that we will find an optimal binary search tree.

There is one detail worth noting about "empty" subtrees. Suppose that in a subtree with keys k_i, ..., k_j, we select k_i as the root. By the above argument, k_i's left subtree contains the keys k_i, ..., k_{i-1}. It is natural to interpret this sequence as containing no keys. Bear in mind, however, that subtrees also contain dummy keys. We adopt the convention that a subtree containing keys k_i, ..., k_{i-1} has no actual keys but does contain the single dummy key d_{i-1}. Symmetrically, if we select k_j as the root, then k_j's right subtree contains the keys k_{j+1}, ..., k_j; this right subtree contains no actual keys, but it does contain the dummy key d_j.
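
The root-selection argument and the empty-subtree convention can be sketched as a small memoized recursion. This is an illustration under stated assumptions rather than the algorithm developed in the text: it uses the probabilities from the table at the top of this section, and it assumes the standard fact that hanging two subtrees under a root k_r raises the expected cost by the total probability of all keys and dummy keys in the range (every node in the subtrees moves one level deeper, and k_r itself is examined). The names `weight`, `cost_with_root`, and `best_cost` are hypothetical.

```python
from functools import lru_cache

# p[i] is the probability of key k_i (p[0] is unused padding);
# q[i] is the probability of dummy key d_i. Values come from the table.
p = [0.0, 0.15, 0.10, 0.05, 0.10, 0.20]
q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]

def weight(i, j):
    # Total probability of keys k_i..k_j and dummy keys d_{i-1}..d_j.
    return sum(p[i:j + 1]) + sum(q[i - 1:j + 1])

def cost_with_root(i, j, r):
    # Left subtree holds k_i..k_{r-1}, right subtree holds k_{r+1}..k_j.
    # Placing both under k_r adds weight(i, j) to the expected cost.
    return best_cost(i, r - 1) + best_cost(r + 1, j) + weight(i, j)

@lru_cache(maxsize=None)
def best_cost(i, j):
    if j == i - 1:
        # Empty key range: the subtree is the single dummy key d_{i-1}.
        return q[i - 1]
    # Examine every candidate root k_r with i <= r <= j.
    return min(cost_with_root(i, j, r) for r in range(i, j + 1))

n = len(p) - 1
print(f"{best_cost(1, n):.2f}")          # prints 2.75
print(f"{cost_with_root(1, n, 5):.2f}")  # prints 2.85 (k_5 forced to the root)
```

Both printed values agree with the costs quoted earlier for the optimal tree and for the best tree with k₅ at the root.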