Class 32: Heaps and Heap Sort

Back to Priority Queues. On to Algorithm Analysis (1).

Held: Friday, March 17, 2006

Summary: Today we consider the heap, an important implementation of priority queues. We also note how heaps can be used to derive a useful sorting algorithm.

Related Pages:

EBoard.

Notes:

No homework for break. No readings. No coding. No nothing.
Would you say "There is a quarter and two dimes on the desk" or "There are a quarter and two dimes on the desk"?

Overview:

Review: Priority Queues.
Trees.
Tree Terminology.
Heaps.
Implementing Heaps.

Priority Queues, Revisited

You may recall that we've been working with priority queues, linear structures which provide the highest-priority-first rubric.
We've come up with a number of implementations:
- One implementation stores the values in an unsorted array.
What are the running times for put and get?
Can we do better?
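
To make the questions above concrete, here is a minimal sketch of the unsorted-array approach. The class name UnsortedPriorityQueue and the use of an ArrayList in place of a raw array are assumptions for illustration, not the code we wrote in class. The comments note where the work happens: put does no searching, while get looks at every element.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.NoSuchElementException;

/**
 * A priority queue backed by an unsorted list (hypothetical name).
 * Values that the comparator ranks lower have higher priority.
 */
class UnsortedPriorityQueue<T> {
  private final ArrayList<T> values = new ArrayList<T>();
  private final Comparator<T> prioritize;

  UnsortedPriorityQueue(Comparator<T> prioritize) {
    this.prioritize = prioritize;
  }

  /** Add a value at the end: no searching and no shifting. */
  void put(T value) {
    values.add(value);
  }

  /** Scan every element for the highest-priority one, then remove it. */
  T get() {
    if (values.isEmpty()) {
      throw new NoSuchElementException("empty priority queue");
    }
    int best = 0;
    for (int i = 1; i < values.size(); i++) {
      if (prioritize.compare(values.get(i), values.get(best)) < 0) {
        best = i;
      }
    }
    T result = values.get(best);
    // Fill the hole with the last element so that nothing has to shift.
    T last = values.remove(values.size() - 1);
    if (best < values.size()) {
      values.set(best, last);
    }
    return result;
  }

  boolean isEmpty() {
    return values.isEmpty();
  }
}
```

In this sketch, get takes time proportional to the number of stored values, which is the cost the heap is meant to improve on.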

A Binary Implementation

We can use divide and conquer to design this data structure.
- In algorithms, we divide the problem up into parts.
- In data structures, we can build structures that naturally have two halves (plus something that holds together the two halves).
- Each half will also have its own two halves, and so on and so forth.
This divided structure looks something like a tree (okay, an upside-down tree or a unisex family tree). Because each node has two subtrees, we call it a binary tree.

We might express that technique with the following Java class:

import java.util.Comparator;

public class BinaryTreeNode<T>
{

  // +--------+--------------------------------------------------
  // | Fields |
  // +--------+

  /** The value associated with this node. */
  private T value;

  /** 
   * Half of the remaining elements.  Set to null if there are
   * no other elements.
   */
  private BinaryTreeNode<T> left;

  /**
   * The other half of the remaining elements.  Set to null
   * if there are no other elements.
   */
  private BinaryTreeNode<T> right;

  /**
   * The comparator used to determine priorities.
   */
  private Comparator<T> prioritize;

  // ...
} // BinaryTreeNode<T>

Trees are common data structures. We'll revisit them a few times in the coming weeks.

Detour: Tree Terminology

Heaps are a kind of tree. Hence, it is important that we consider some basic tree terminology.
The root is the top or beginning of the tree.
A node is a part of the tree. (While this has the same name as the nodes we often use to implement trees and lists, you should think of it as independent of implementation.)
Most nodes have one or more children.
Each node other than the root has a parent.
Nodes without children are called leaves.
Nodes with children are called interior nodes.
The level of a node is the number of steps from root to that node.
- The root is at level 0.
- The direct children of the root are at level 1.
- The children of those nodes are at level 2.
- ...
The depth of a tree is the largest level of any node in the tree.
The size of a tree is the number of nodes in the tree.
In a binary tree, no node has more than two children.
- The children are typically designated as left and right.
In a complete tree, every level is full (all the interior nodes have the maximum number of children).
We'll return to trees in the weeks to come. For today, we'll stick with the simple terminology we've just defined.
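
A few of these terms translate directly into short recursive methods. The sketch below is only an illustration: the TreeNode and TreeStats names are hypothetical, the nodes hold plain int values, and giving an empty tree a depth of -1 is a convention I am assuming so that a one-node tree has depth 0.

```java
/** A bare-bones binary tree node (hypothetical, for illustration only). */
class TreeNode {
  int value;
  TreeNode left;   // null if there is no left child
  TreeNode right;  // null if there is no right child

  TreeNode(int value, TreeNode left, TreeNode right) {
    this.value = value;
    this.left = left;
    this.right = right;
  }
}

class TreeStats {
  /** The size of a tree is the number of nodes in the tree. */
  static int size(TreeNode node) {
    if (node == null) {
      return 0;
    }
    return 1 + size(node.left) + size(node.right);
  }

  /**
   * The depth of a tree is the largest level of any node, with the
   * root at level 0.  An empty tree gets -1 (an assumed convention).
   */
  static int depth(TreeNode node) {
    if (node == null) {
      return -1;
    }
    return 1 + Math.max(depth(node.left), depth(node.right));
  }

  /** A leaf is a node without children. */
  static boolean isLeaf(TreeNode node) {
    return node.left == null && node.right == null;
  }
}
```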

An Introduction to Heaps

Heaps are a particular form of binary tree designed to provide quick access to the highest-priority element in the tree.
Heaps must be balanced (it's part of the definition).
In particular, a heap is
- a binary tree,
- that is nearly complete in that
  - at most one node has one child (the rest have zero or two)
  - the nodes on the last level are at the left-hand-side of the level (that is, if a node has one or two children, all of its left siblings have two children)
- and that has the heap property: the value stored in each node has priority at least as high as (that is, a value no larger than) every value stored below it.

The Heap Property

An essential aspect of heaps is the heap property.
We can talk about a global heap property (that the value stored at one node in the tree is of at least as high a priority as anything stored below it).
We can also speak about a local heap property (that the value stored at one node is of at least as high a priority as the values stored directly below it).
If the local heap property holds everywhere in the tree, then the global heap property holds everywhere in the tree.
- Consider the path from a node down to anything below it; at each step, the values can only stay the same or get larger.
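
Because the local property implies the global one, checking the ordering only requires comparing each node with its direct children. Here is a hedged sketch of that check. The HeapNode and HeapPropertyCheck names are hypothetical, values are plain ints with smaller meaning higher priority, and equal values are accepted, which matches the examples in this outline. Note that it checks only the ordering, not the nearly-complete shape.

```java
/** A bare-bones node holding an int (hypothetical, for illustration). */
class HeapNode {
  int value;
  HeapNode left;
  HeapNode right;

  HeapNode(int value, HeapNode left, HeapNode right) {
    this.value = value;
    this.left = left;
    this.right = right;
  }
}

class HeapPropertyCheck {
  /** Local property: the value is no larger than each direct child's. */
  static boolean localHolds(HeapNode node) {
    return (node.left == null || node.value <= node.left.value)
        && (node.right == null || node.value <= node.right.value);
  }

  /** Does the local property hold at every node of this tree? */
  static boolean holdsEverywhere(HeapNode node) {
    if (node == null) {
      return true;
    }
    return localHolds(node)
        && holdsEverywhere(node.left)
        && holdsEverywhere(node.right);
  }
}
```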

Examples

Here are some heaps of varying sizes:

    2     2   2    2      2       2      3
   / \       /    / \    / \     / \    / \
  3   7     3    3   7  3   7   3   7  3   3
 / \  |               /        / \
9   7 8               9       9   7

Here are some non-heaps. Can you tell why?

    2          2      2        2      2
   / \        / \    / \      /|\     |
  3   7      9   7  7   3    3 3 3    3
 /   / \    / \        / \           / \
9   8   8  9   7      9   7         4   7

Implementing Heaps with Arrays

When considering lists and other structures, we found ways to implement the structures with both arrays and nodes.
- It seems likely that we can implement heaps with special nodes (as we did at the beginning of class).
- Can we also implement heaps with arrays?
It turns out to be relatively easy to implement binary heaps and other binary trees, particularly complete binary trees, with arrays.
How?
- Assume we have a complete binary tree in that every interior (nonleaf) node has exactly two children.
- Number the nodes, starting at the top and working across each level. (The root is node 0, its left child is node 1, the root's right child is node 2, node 1's left child is node 3, node 1's right child is node 4, ...).
```
       0
     /   \
    1     2
   / \   / \
  3   4 5   6
 / \
7   8
```
- This numbering gives you the positions in the array for each element.
- (If you don't want to build complete trees and are willing to waste space, you can store a special value to represent nothing at this position.)
This provides a very convenient way of figuring out where children belong.
- The root of the tree is in location 0.
- The left child of an element stored at location i can be found in location 2*i+1.
- The right child of an element stored at location i can be found in location 2*i+2 (also representable as 2*(i+1)).
The parent of an element stored at location i (for i > 0; the root has no parent) can be found at location floor((i-1)/2).
Can we prove all this? Yes, but that's an exercise for another day.
These properties make it simple to move the cursor around the tree and to get values.
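
The index formulas above are short enough to write down directly. In this sketch the HeapIndex class and its method names are hypothetical. It relies on Java's integer division, which gives the floor for the nonnegative values that arise when i > 0.

```java
/** Index arithmetic for a binary tree stored in an array (hypothetical name). */
class HeapIndex {
  /** Location of the left child of the element at location i. */
  static int left(int i) {
    return 2 * i + 1;
  }

  /** Location of the right child of the element at location i. */
  static int right(int i) {
    return 2 * i + 2;
  }

  /**
   * Location of the parent of the element at location i.
   * Only meaningful for i > 0, since the root has no parent.
   */
  static int parent(int i) {
    return (i - 1) / 2;
  }
}
```

For the numbered tree above, left(3) is 7, right(3) is 8, and parent(7) and parent(8) are both 3.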
Note that we have an interesting double indirection here.
- We've decided to implement priority queues with this divide-and-conquer structure.
- We've decided to implement this divide-and-conquer structure with arrays.
That is, we've given an implementation of an implementation of a data structure.
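
This outline does not spell out how put and get work on the array, so the following is only a sketch of one common approach, not necessarily the one we will write together. It assumes that put adds the new value in the next free slot and swaps it up while it beats its parent, and that get removes the root, moves the last value to the top, and swaps it down. The ArrayHeap name is hypothetical, and for brevity it uses Comparable values and an ArrayList rather than a Comparator and a raw array.

```java
import java.util.ArrayList;
import java.util.NoSuchElementException;

/** An array-based heap; smaller values have higher priority (hypothetical name). */
class ArrayHeap<T extends Comparable<T>> {
  private final ArrayList<T> values = new ArrayList<T>();

  /** Add a value in the next free slot, then swap it up toward the root. */
  void put(T value) {
    values.add(value);
    int i = values.size() - 1;
    while (i > 0) {
      int parent = (i - 1) / 2;
      if (values.get(i).compareTo(values.get(parent)) >= 0) {
        break;  // local heap property holds here, so we can stop
      }
      swap(i, parent);
      i = parent;
    }
  }

  /** Remove the root, move the last value to the top, and swap it down. */
  T get() {
    if (values.isEmpty()) {
      throw new NoSuchElementException("empty heap");
    }
    T result = values.get(0);
    T last = values.remove(values.size() - 1);
    if (!values.isEmpty()) {
      values.set(0, last);
      int i = 0;
      while (true) {
        int left = 2 * i + 1;
        int right = 2 * i + 2;
        int smallest = i;
        if (left < values.size()
            && values.get(left).compareTo(values.get(smallest)) < 0) {
          smallest = left;
        }
        if (right < values.size()
            && values.get(right).compareTo(values.get(smallest)) < 0) {
          smallest = right;
        }
        if (smallest == i) {
          break;  // neither child has higher priority
        }
        swap(i, smallest);
        i = smallest;
      }
    }
    return result;
  }

  boolean isEmpty() {
    return values.isEmpty();
  }

  private void swap(int i, int j) {
    T tmp = values.get(i);
    values.set(i, values.get(j));
    values.set(j, tmp);
  }
}
```

Each loop follows a single path between the root and a leaf, so the number of swaps is bounded by the depth of the tree.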

Heap Sort

We can use the heap structure to provide a fairly simple and quick sorting algorithm. To sort a set of n elements,
- insert them into a heap, one by one, and then
- remove them from the heap, one by one; they come out in priority order.
Can we do this in place (provided, of course, that our original information was in an array)? You'll need to think about it.
Most people implement heap sort with array-based heaps. Some even define heap sort completely in terms of the array operations, and forget the origins.
Either way, you still need to extract the values from the heap to produce the sorted result.
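
The two-step description above can be sketched with the standard library's java.util.PriorityQueue, which the Java API documents as being based on a priority heap. This is an illustration under stated assumptions rather than the in-place version asked about above: the HeapSortSketch name is hypothetical, it sorts ints in ascending order, and it uses a separate heap instead of reusing the original array.

```java
import java.util.PriorityQueue;

class HeapSortSketch {
  /** Sort values into ascending order using a separate heap (not in place). */
  static void heapSort(int[] values) {
    PriorityQueue<Integer> heap = new PriorityQueue<Integer>();
    // Step 1: insert the elements into the heap, one by one.
    for (int v : values) {
      heap.add(v);
    }
    // Step 2: remove them in priority order, smallest first.
    for (int i = 0; i < values.length; i++) {
      values[i] = heap.remove();
    }
  }
}
```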

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Tue May 9 08:31:44 2006.
The source to the document was last modified on Thu Jan 12 14:58:06 2006.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS152/2006S/Outlines/outline.32.html.

Samuel A. Rebelsky, rebelsky@grinnell.edu