CSC152, Class 36: Heaps and Heap Sort Admin: * Missing: BillupsM * Syllabus rearranged * No homework for Friday * Yesterday's eboard attempts to summarize the whiteboard * New project stuff assigned * Questions on the exam? * Will we have at least five points of "Sam makes spelling mistakes"? Almost certainly. * Should TestNABQ.java compile? Yes. * Will Sam be in his office tomorrow? He hopes so. Overview: * Implementing trees * Tree terminology * Heap basics * Implementing heap operations * Storing heaps in arrays * Heap sort /Trees/ * A "two-dimensional" version of lists * A binary tree is either * Empty * Combination of value, left subtree, right subtree * A general tree is either * Empty * Combination of value and 0 or more subtrees * One possible implementation: public class TreeNode { Object value; TreeNode left; TreeNode right; ... } * Terminology: * "Binary" no more than two subtrees * "Root" the starting point of the tree (like car) * "Leaf" no subtrees * "Complete Binary Tree": Every non-leaf node has exactly two subtrees * "Depth" - Distance between the root and a value * Depth of the root is 0 * Depth of the roots of each subtree of the root is 1 * Depth of the roots of their subtrees is 2 * Etc. * "Level" - The set of all values at a particular depth * Claveria's Claim: If there are N values in a tree and the tree is complete (or nearly complete), then the maximum depth is approximately log_2(N) * Proof? Take 218 * Proof: * 1 values, 0 depth * 3 values: 1 depth * 7 values: 2 depth * 15 values: 3 depth * 31 values: 4 depth * When you see a function increase by one each time N doubles, it is logarithmic (by defn) * Moral: You can put a lot of values in a fairly shallow tree. * Moral: If you can make your operations depend on the depth of the tree, you win. Heap: A tree-based implementation of priority queues Rules: * (1) A heap is a nearly-complete binary tree. Every level but the last is complete. Everything at the last level is "shoved to the left" * (2) The heap property states that every value is of higher priority than the values in the subtrees * Interesting fact: To check the global heap property, we only need to check local heap property (value vs. values at two subtrees) What are the methods we need to implement for priority queues? * peek(): Find the highest-priority (smallest) value Easy: Look at the root Time: O(1) * get(): Find and remove the highest-priority (smallest) value * put(Object value): Add something Failed idea for put(): Start at the top and work your way down New idea for put(): Put it at the right end of the lowest level and clean up after yourself * Warning! How do we find "the right end of the lowest level" * To "clean up after yourself" (that is, restore the heap property), repeatedly swap a value with the larger value above it. * Time: O(logn) Idea for get()? * Failed: Repeatedly pick the smallest of the subtree values and swap it up * Problem: Leaves a "hole" at an unexpected place; no longer nearly-complete * Idea from before: FIRST worry about near-complete; then worry about heap * Take the rightmost element on the last level and shove it at the root * Repeatedly swap down with the smaller of the subtrees * Time O(logn) How do we find "the righmost value on the lower level?" * Idea: Store our heaps differently! * Instead: Store our heaps in an array * Count the values in the heap (starting at 0) across each level. The "count" of a value is its index in the array * Add a new node at index "number of things" * How do you get the index of the subtree (for swapping down)? left subtree: 2*myindex + 1 right subtree: 2*myindex + 2 * How do you get the index of the supertree (for swapping up)? roundDown((myindex-1)/2) * Sorting algorithm: * Shove everything into a heap * Read it off in order * Cost O(nlogn) Clever hack: It is possible to turn an array into a heap "in place" Friday: Dictionaries (like arrays, but indexed by strings)