CSC153 2004S, Class 50: Priority Queues
Admin
* Lots to cover today; avoid distractions
* HW7 due.
* HW8 (final homework) assigned
* Cite stuff you find on the Web.
* No exam 4!
* Cool convo today: DNA Evidence
* Cool protest today: Light polution ("We want heavy polution!")
* A travesty of a play tonight.
Overview:
* Priority queue basics
* Simple implementations
* Detour: Binary trees
* Heaps
* Implementing heaps
* Heapsort
Priority Queues:
* Not queues (which have a FIFO policy)
* Are linear structures
* The value returned by get and peek is determined by a "priority" assigned to each value in the queue
* Techniques for assigning values:
* Attach a "priority" number to each value in the queue
* Use a Comparator (like an Order) to compare things (slightly more general)
Observation:
* Priority queues can be a nice technique for sorting.
* A good implementation of priority queues may lead to a good sorting algorithm.
How do we implement priority queues?
* Use a dynamic array or a linked list and keep the values in sorted order.
add - O(n)
get - O(1) if done well
[Effectively insertion sort]
* Use a dynamic array or linked list and store the values in no particular order.
add - O(1)
get - O(n) to find the smallest value
How do we normally improve/design algorithms?
* Brute force
* Greediness
* Dynamic programming (store intermediate values in a table)
* Divide and conquer
* Binary search
* Quicksort
* Mergesort
* We can also apply divide and conquer to data structure design
Typical d-a-c data structure: The tree (much like a family tree)
* A tree is built from node-like things
* Each has an associated value
* Each has zero or more "children", which are also nodes/trees
* In the most general case, there is no relationship between the nodes (or their values) other than the "child" relationship.
* However, we often add other relationships.
Some terminology:
* root: The starting point of the tree.
* leaf: Nodes with no children
* interior node: something neither a root nor a leaf
* parent: inverse of child relationship
* [forest: a collection of trees (should not share nodes)]
* [graph: like a tree, but nodes can be shared and we use different terms for the relationships]
* descendant: child, child of child, child of child of child, etc.
* ancestor: inverse of descendant relationship
* sibling: a node with the same parent
* depth: the number of nodes in the longest path from root to leaf
* level: all nodes the same distance from the root
* binary tree: a tree in which all the nodes have <= 2 children
* complete binary tree: a tree of n levels in which the nodes at levels 0 ... n-1 all have two children
* nearly-complete binary tree: a tree of n levels in which the nodes at levels 0 .. n-2 all have two children and, at level n-1, if a node has one or two children than all of its left siblings have two children
* Informally: Everything but the last level is full; the last level is "shoved to the left"
* nodes are ungendered
Heap: An implementation of priority queues with binary trees
* The tree must be nearly complete (depth of the tree is as small as possible; O(logn) a tree of k levels has 2^k-1 nodes
* The tree must have the heap property
* The priority of each node is at least as high as the priority of its children
How long does it take to find the highest priority element in a heap (peek)?
* O(1) because the root is the highest priority element
How do we add something to a heap?
* Maintain near completeness - Add to the leftmost part of the bottom level
* Restore the heap property by repeatedly swapping the value "up" the tree if it has a higher priority than its parent
Theorem (unproven): If it was a heap to start with, it will be a heap when we're done
* O(logn)
How do we remove the highest-priority element from a heap?
* Maintain near completeness by putting the rightmost element at the lowest level at the root
* Restore the heap property by repeatedly "swapping down" with the larger of the two children
* O(logn)
Problem in our analysis: How do you find the leftmost free position or the rightmost element?
* Do math to figure it out
* Keep a current pointer to the next free position in the tree (or the last filled position, or both):
Upon addition: If you are the left child of your parent
An interesting variant: Shove the values in an array/vector using a breadth-first left-to-right numbering of the positions
* Left child of node at position k is at position: 2k+1
* Right child of node at position k is at position: 2k+2
* Parent of a node at position k is at position (k-1) div 2
* This makes it much easier to find the next free space (just keep track of the number of values in the array/Vector)
* This saves all the child/parent pointer crap.
Note that you can take an existing array and turn it into a heap fairly quickly
* Grow the heap from left to right
* Build a forest and repeatedly join