Class 46: Heaps and Heap Sort
Held Tuesday, April 27
- Implementing trees with arrays
- Heaps, an implementation of priority queues
- Heapsort, sorting using heaps
- Code templates due.
- I've received code files from many (but not all) of you. I'll have
them printed and online tomorrow.
- A number of you neglected to include parameters to your methods.
- The use of some of your classes is unclear. Some general documentation
would have helped.
- I'll do my best to fix some problems before distributing the code.
- As you've noted, there are a number of terms we use when talking
about trees. Here are some of them.
- The root is the top or beginning of the tree.
- A node is a part of the tree. (While this has the same
name as the nodes we often use to implement trees, you should think
of it as independent of implementation.)
- Most nodes have one or more children.
- Each node other than the root has a parent.
- Nodes without children are called leaves.
- Nodes with children are called interior nodes.
- The level of a node is the number of steps from the root to that node.
- The root is at level 0.
- The direct children of the root are at level 1.
- The children of those nodes are at level 2.
- The depth of a tree is the largest level of any node in the tree.
- The size of a tree is the number of nodes in the tree.
- In a binary tree, no node has more than two children.
- The children are typically designated as left and right.
- In a complete tree, every level is full (all the interior
nodes have the maximum number of children).
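- To make these terms concrete, here is a minimal sketch of a node-based
  binary tree (the names BinaryTreeNode, size, and depth are made up for
  illustration; this is not code from the class), along with methods that
  compute the size and depth defined above.

      public class BinaryTreeNode {
        int value;
        BinaryTreeNode left;    // left child (null if absent)
        BinaryTreeNode right;   // right child (null if absent)

        BinaryTreeNode(int value, BinaryTreeNode left, BinaryTreeNode right) {
          this.value = value;
          this.left = left;
          this.right = right;
        }

        // The size of a tree is the number of nodes in the tree.
        static int size(BinaryTreeNode t) {
          return (t == null) ? 0 : 1 + size(t.left) + size(t.right);
        }

        // The depth of a tree is the largest level of any node (the root is at level 0).
        static int depth(BinaryTreeNode t) {
          if (t == null || (t.left == null && t.right == null)) return 0;
          return 1 + Math.max(depth(t.left), depth(t.right));
        }

        public static void main(String[] args) {
          // A complete binary tree: the root is at level 0, its two leaves at level 1.
          BinaryTreeNode root =
            new BinaryTreeNode(1, new BinaryTreeNode(2, null, null),
                                  new BinaryTreeNode(3, null, null));
          System.out.println(size(root));    // 3
          System.out.println(depth(root));   // 1
        }
      }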
- When considering lists and other structures, we found ways to implement
the structures with both arrays and nodes.
- We've seen how to implement trees with nodes (in fact, that implementation
is so natural that we use the term ``node'' in describing trees).
- Can we also implement trees with arrays?
- It turns out to be relatively easy to implement binary trees (particularly
complete binary trees) with arrays.
- This provides a very convenient way of figuring out where children belong.
- The root of the tree is in location 0.
- The left child of an element stored at location i can
be found in location 2*i+1.
- The right child of an element stored at location i can
  be found in location 2*i+2 (also representable as 2*(i+1)).
- The parent of an element stored at location i can be
  found at location (i-1)/2 (rounding down).
- Can we prove all this? Yes, but that's an exercise for another day.
- These properties make it simple to move the cursor around the tree
  and to get values. However, they do make it more difficult to
  do some operations. For example, setting a new left or right subtree may
  require modifying a large number of cells (since we've decided that
  such an operation deletes the old subtree). A sketch of this index
  arithmetic appears below.
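- Here is a small sketch of that index arithmetic (the class and method
  names are made up for illustration).

      public class ArrayTreeIndices {
        public static int leftChild(int i)  { return 2 * i + 1; }
        public static int rightChild(int i) { return 2 * i + 2; }
        public static int parent(int i)     { return (i - 1) / 2; }  // integer division rounds down

        public static void main(String[] args) {
          // The root lives at location 0; its children are at locations 1 and 2.
          System.out.println(leftChild(0) + " " + rightChild(0));   // 1 2
          // The elements at locations 3 and 4 are both children of location 1.
          System.out.println(parent(3) + " " + parent(4));          // 1 1
        }
      }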
- How do we modify heaps? Through insertion and deletion.
- How do we do insertion while maintaining the two key properties?
- We'll maintain one property (near-completeness) and then look
  to restore the other (heap order: each element is no larger than
  its children).
- When we add an element to the heap, we just need to put the element at
the end of the lowest level.
- If that level is full, we put it at the left end of the next level.
- After removing the smallest element from the heap, we have a ``hole''
at the top of the heap. We put the last element in the last level in
that hole (which may sound like an odd idea, but at least it maintains
the near-completeness property).
- Both of these operations may violate the heap order (in fact, the second
  one is almost guaranteed to violate heap order), so we need to rearrange
  the elements to restore it.
- This rearrangement is often called percolating. When we put
an element at the bottom, we percolate up. When we put an
element at the top, we percolate down.
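- Here is a sketch of the second case, percolating down after a
  ``remove smallest'' (the names PercolateDownDemo, removeSmallest, and
  percolateDown are made up; this is not the class heap code). The
  constructor is handed an array that is already in heap order.

      import java.util.Arrays;

      public class PercolateDownDemo {
        int[] data;   // a nearly complete binary tree, stored in an array
        int size;

        PercolateDownDemo(int[] heapOrdered) {
          data = Arrays.copyOf(heapOrdered, heapOrdered.length);
          size = heapOrdered.length;
        }

        int removeSmallest() {
          int smallest = data[0];
          data[0] = data[size - 1];   // fill the hole at the root with the last element
          size--;
          percolateDown(0);           // then restore heap order
          return smallest;
        }

        void percolateDown(int i) {
          while (2 * i + 1 < size) {                 // while the node at i has a child
            int child = 2 * i + 1;                   // left child
            if (child + 1 < size && data[child + 1] < data[child]) {
              child = child + 1;                     // the right child is smaller; use it
            }
            if (data[child] < data[i]) {             // heap order violated: swap and continue
              int tmp = data[i]; data[i] = data[child]; data[child] = tmp;
              i = child;
            } else {
              break;                                 // heap order restored
            }
          }
        }

        public static void main(String[] args) {
          PercolateDownDemo heap = new PercolateDownDemo(new int[] {2, 4, 5, 6, 7, 9});
          System.out.println(heap.removeSmallest());   // 2
          System.out.println(heap.removeSmallest());   // 4
          System.out.println(heap.removeSmallest());   // 5
        }
      }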
- Recall that to add an element to a heap we,
- put the element at the end of the lowest level and
- percolate up.
- Percolating up is fairly simple
- At each step, we compare the percolated node to its parent.
- If the percolated node is smaller (violating the heap property), we swap
the two and continue up the tree.
- Otherwise, we're done, since the heap property is no longer violated.
- When we do the swap,
- the subtree that contains the old parent is clearly in heap order
(the old parent was an ancestor to all the nodes in that
subtree, and therefore smaller) and
- the present node is clearly smaller
than both of its new subtrees (it's smaller than the old parent, and
the old parent was smaller than everything else below it).
- Eventually, we stop (either because we no longer violate heap property
or because we reach the root).
- Here's an example, based on inserting the values 5, 6, 4, 4, 7, 2.
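- Here is a sketch of insertion with percolation up, run on those values
  (the names SmallHeapDemo, add, and percolateUp are made up). Each line
  of output shows the array after one add.

      import java.util.Arrays;

      public class SmallHeapDemo {
        int[] data = new int[16];   // the nearly complete tree, stored in an array
        int size = 0;

        void add(int value) {
          data[size] = value;       // put the element at the end of the lowest level
          size++;
          percolateUp(size - 1);    // then restore heap order
        }

        void percolateUp(int i) {
          // While we're not at the root and we're smaller than our parent, swap up.
          while (i > 0 && data[i] < data[(i - 1) / 2]) {
            int parent = (i - 1) / 2;
            int tmp = data[i]; data[i] = data[parent]; data[parent] = tmp;
            i = parent;
          }
        }

        public static void main(String[] args) {
          SmallHeapDemo heap = new SmallHeapDemo();
          for (int value : new int[] {5, 6, 4, 4, 7, 2}) {
            heap.add(value);
            System.out.println(Arrays.toString(Arrays.copyOf(heap.data, heap.size)));
          }
          // The final line is [2, 4, 4, 6, 7, 5]: the 2 has percolated up to the root.
        }
      }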
- How much time does this take? Well, the depth of a complete binary
  tree with n nodes is O(log2(n)), and the algorithm may
  require swapping up from leaf to root, so the running time is also
  O(log2(n)).
- We can use the heap structure to provide a fairly simple and quick
sorting algorithm. To sort a set of n elements,
- insert them into a heap, one-by-one.
- remove them from the heap in order.
- What's the running time?
- There are n insertions. Each takes O(log2(n)) time by our
  prior analysis.
- There are n ``delete least'' operations.
Each takes O(log2(n)) by our prior analysis.
- Hence, we've developed yet another O(n*log2(n)) sorting algorithm.
- It is not an in-place sorting algorithm, and does require O(n)
extra space for the heap.
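- Here is a short sketch of this version of heap sort. As a stand-in for
  the heap, it uses java.util.PriorityQueue, a library class that keeps
  its elements in a heap; it is not the heap class from the course.

      import java.util.Arrays;
      import java.util.PriorityQueue;

      public class HeapSortSketch {
        public static void heapSort(int[] values) {
          PriorityQueue<Integer> heap = new PriorityQueue<Integer>();
          for (int value : values) {
            heap.add(value);           // n insertions, each O(log2(n))
          }
          for (int i = 0; i < values.length; i++) {
            values[i] = heap.poll();   // n ``delete least'' operations, each O(log2(n))
          }
        }

        public static void main(String[] args) {
          int[] values = {5, 6, 4, 4, 7, 2};
          heapSort(values);
          System.out.println(Arrays.toString(values));   // [2, 4, 4, 5, 6, 7]
        }
      }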
- Most people implement heap sort with array-based trees. Some even
define heap sort completely in terms of the array operations, and
forget the origins.
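- For completeness, here is a sketch of that array-only version (made-up
  names again). Unlike the heaps above, it builds a max-heap, with the
  largest element at the root, so that repeatedly swapping the root to the
  end of the array leaves the values in increasing order, in place.

      import java.util.Arrays;

      public class InPlaceHeapSort {
        public static void heapSort(int[] a) {
          // Build the heap: percolate down from the last interior node back to the root.
          for (int i = a.length / 2 - 1; i >= 0; i--) {
            percolateDown(a, i, a.length);
          }
          // Repeatedly move the largest element to the end and shrink the heap.
          for (int end = a.length - 1; end > 0; end--) {
            int tmp = a[0]; a[0] = a[end]; a[end] = tmp;
            percolateDown(a, 0, end);
          }
        }

        static void percolateDown(int[] a, int i, int size) {
          while (2 * i + 1 < size) {
            int child = 2 * i + 1;                              // left child
            if (child + 1 < size && a[child + 1] > a[child]) {
              child = child + 1;                                // right child is larger; use it
            }
            if (a[child] > a[i]) {                              // heap order violated: swap and continue
              int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
              i = child;
            } else {
              break;
            }
          }
        }

        public static void main(String[] args) {
          int[] values = {5, 6, 4, 4, 7, 2};
          heapSort(values);
          System.out.println(Arrays.toString(values));   // [2, 4, 4, 5, 6, 7]
        }
      }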
- Created Monday, January 11, 1999.
- Added short summary on Friday, January 22, 1999.
- Filled in the details on Tuesday, April 27, 1999.
Much of the outline was based on
outline 45 of