Dynamic Tables
Table expansion and contraction: To implement a TABLE-DELETE operation, it is simple enough to remove the specified item from the table. It is often desirable, however, to contract the table when the load factor of the table becomes too small, so that the wasted space is not exorbitant. Table contraction is analogous to table expansion: when the number of items in the table drops too low, we allocate a new, smaller table and then copy the items from the old table into the new one. The storage for the old table can then be freed by returning it to the memory-management system. Ideally, we would like to preserve two properties:
-
the load factor of the dynamic table is bounded below by a constant, and
-
the amortized cost of a table operation is bounded above by a constant.
We assume that cost can be measured in terms of elementary insertions and deletions.
A natural strategy for expansion and contraction is to double the table size when an item is inserted into a full table and halve the size when a deletion would cause the table to become less than half full. This strategy guarantees that the load factor of the table never drops below 1/2, but unfortunately, it can cause the amortized cost of an operation to be quite large. Consider the following scenario. We perform n operations on a table T , where n is an exact power of 2. The first n/2 operations are insertions, which by our previous analysis cost a total of Φ(n). At the end of this sequence of insertions, num[T] = size[T] = n/2. For the second n/2 operations, we perform the following sequence:
I, D, D, I, I, D, D, I, I, ... ,
where I stands for an insertion and D stands for a deletion. The first insertion causes an expansion of the table to size n. The two following deletions cause a contraction of the table back to size n/2. Two further insertions cause another expansion, and so forth. The cost of each expansion and contraction is Θ(n), and there are Θ(n) of them. Thus, the total cost of the n operations is Θ(n2), and the amortized cost of an operation is Θ(n).
- The difficulty with this strategy is obvious: after an expansion, we do not perform enough deletions to pay for a contraction. Likewise, after a contraction, we do not perform enough insertions to pay for an expansion.
- We can improve upon this strategy by allowing the load factor of the table to drop below 1/2. Specifically, we continue to double the table size when an item is inserted into a full table, but we halve the table size when a deletion causes the table to become less than 1/4 full, rather than 1/2 full as before. The load factor of the table is therefore bounded below by the constant 1/4. The idea is that after an expansion, the load factor of the table is 1/2. Thus, half the items in the table must be deleted before a contraction can occur, since contraction does not occur unless the load factor would fall below 1/4. Likewise, after a contraction, the load factor of the table is also 1/2. Thus, the number of items in the table must be doubled by insertions before an expansion can occur, since expansion occurs only when the load factor would exceed 1.
- We omit the code for TABLE-DELETE, since it is analogous to TABLE-INSERT. It is convenient to assume for analysis, however, that if the number of items in the table drops to 0, the storage for the table is freed. That is, if num[T] = 0, then size[T] = 0.
- We can now use the potential method to analyze the cost of a sequence of n TABLE-INSERT and TABLE-DELETE operations. We start by defining a potential function Φ that is 0 immediately after an expansion or contraction and builds as the load factor increases to 1 or decreases to 1/4. Let us denote the load factor of a nonempty table T by α(T) = num[T]/ size[T]. Since for an empty table, num[T] = size[T] = 0 and α[T] = 1, we always have num[T] = α(T) · size[T], whether the table is empty or not. We shall use as our potential function
Observe that the potential of an empty table is 0 and that the potential is never negative. Thus, the total amortized cost of a sequence of operations with respect to Φ is an upper bound on the actual cost of the sequence.
Before proceeding with a precise analysis, we pause to observe some properties of the potential function. Notice that when the load factor is 1/2, the potential is 0. When the load factor is 1, we have size[T] = num[T], which implies Φ(T) = num[T], and thus the potential can pay for an expansion if an item is inserted. When the load factor is 1/4, we have size[T] = 4 · num[T], which implies Φ(T) = num[T], and thus the potential can pay for a contraction if an item is deleted. Figure 17.4 illustrates how the potential behaves for a sequence of operations.
To analyze a sequence of n TABLE-INSERT and TABLE-DELETE operations, we let ci denote the actual cost of the ith operation, denote its amortized cost with respect to Φ, numi denote the number of items stored in the table after the ith operation, sizei denote the total size of the table after the ith operation, αi denote the load factor of the table after the ith operation, and Φi denote the potential after the ith operation. Initially, num0 = 0, size0 = 0, α0 = 1, and Φ0 = 0.
We start with the case in which the ith operation is TABLE-INSERT. The analysis is identical to that for table expansion in Section 17.4.1 if αi-1 ≥ 1/2. Whether the table expands or not, the amortized cost of the operation is at most 3. If αi-1 < 1/2, the table cannot expand as a result of the operation, since expansion occurs only when αi-1 = 1. If αi < 1/2 as well, then the amortized cost of the ith operation is