CSC152 2005F, Class 44: Priority Queues, Heaps (and maybe Heapsort) Admin: * Still thinking about what to do with those of you with significant problems. Take advantage of Eryn's office hours! (Visit me, too.) * Donuts! * Please fill out Eryn's survey! Overview: * Priority queues, revisited * Basic implementations * About trees * Heaps * Heap operations Priority queue: * Linear structure (put and get) * Get returns "smallest remaining element" * In a priority queue of numbers, numerically smallest * Generalize: Use a comparator to determine order A Vector Based Implementation: * How values are stored in the vector * No particular order to values. * How put works * Searches for a null (see get for why it might finds one) * If no null, adds at end * O(N) * Just putting at the end is (we can pretend) O(1) * How get works * Search for smallest * Put a null there * But too many nulls may slow things down * Sam's improved strategy: Instead of putting a null, move the *last* value into the place of the smallest * O(N) A Node-Based Implementation * How values are stored * Just an ordinary bunch of nodes, keep track of front and back * Each node contains one value. * Values are stored from smallest to largest. * How put work * Search for the correct place - binary search is O(log_2(n)) * Unfortunately, you can't binary search a sequence of nodes * Searching is therefore O(N) * Insert between nodes O(1) * How get works * Remove and return the first (see the NodeBasedQueue code for details) * O(1) Both Tree-Based Implementations * How values are stored * Each Node has Value, left, and right * Smaller values go to the left, Larger values go to the right * How put works * Look at the value in the current node. If the value you're putting is smaller, put into the left subtree (recursively). If the value you're putting is larger, put it into the right subtree (recursively). * O(# of levels in the tree) * Probably O(N) * If the tree is "balanced", perhaps better * How get works * Search to the left until there is no left subtree, return what's there (and update the tree somehow) * Get takes about the same time as put Trees are probably a good idea because they support "divide and conquer". * Good algorithm and DS design strategy Problem: How do we make sure that the tree is balanced? Step one: Define what we mean * A tree is "complete" if all the nodes at every level but the last have two subtrees, and all the nodes at the last level have 0 What can we say about the number of "levels" in a complete tree vs. the number of nodes. Number of nodes = 2^#levels #levels = log_2(# of nodes) In "nearly complete" trees, every node has either 2 or 0 subtrees, except for the penultimate level, in which things are shoved to the left. pretty much the same relationship between levels and #nodes #levels is in O(log_2(# of nodes)) The "heap property": The value stored in a node is less than or equal to the values stored in either subtree. The "Heap": A nearly-complete tree with the heap property. Implementing put and get (get returns smallest) Get: Grab the smallest value from the top. * To maintain near-completeness, must also get rid of the last thing on the last level * Put that thing at the "hole" in the top * Repeatedly "swap down" with the smaller of the children (provided the smaller of the childern is smaller) * O(log_2(n)) Put * To maintain near completeness, must add as the last node on the last level * To restore the heap property, "swap up", (swap with parent as long as smaller) * O(log_2(n)) We need to keep track of how many things are in the heap to figure out where we add or remove something How do we really figure out where the next value goes? use an array * Number nodes across levels * Node number i goes in position i of the array The left child of node i is at position (2*i + 1) The right child of node i is at position (2*i + 2) The parent of node i is at position floor((i-1)/2)