CSC152 2004F, Class 38: Dictionaries (2): Binary Search Trees Admin: * Cool talk Wednesday! * Project discussion tomorrow. * Exam discussion delayed until Friday. * Cool convo (and other seminars): Incarceration Overview: * Detour: Binary Search Revisited (with an ADT) * Implementing the BinarySearchable ADT * An alternative: Search trees "Get something that's in the middle" "Look at the left or the right" * "Try to avoid looking at everything to the left and then everything to the right" * Look in the middle. If it's bigger than what we're looking for, look to the left. If it's smaller, look to the right. If it's the same, we're done. * To express this "smaller, greater, equal", we use a Comparator. * Possible return values: * The matching object [Today] * The index of the matching object [Exam] * True or false [Outline] * Two strategies for "look in left and right" * Recursively * Loop (for or while loop) * Concern when you recurse: How do you build "the new array that only goes up to the middle"? (And should you?) n/2 + n/4 + n/8 + .... = n* (1/2 + 1/4 + 1/8 + .... + 1/2^n) = O(n) * But binary search should be logn public Object binarySearch(Object[] values, Object findMe, Comparator c) { return binarySearch(values, findMe, c, 0, values.length-1); } private Object binarySearch(Object[] values, Object findMe, Comparator c, int lb, int ub) { int m = (lb + ub) / 2; Object mid = values[m]; if (c.compare(findMe, mid) < 0) return binarySearch(values, findMe, c, lb, m-1); else if (c.compare(findMe, mid) > 0) return binarySearch(values, findMe, c, m+1, ub); else return mid; } // binarySearch(Object[], Object) * How would we check whether or not this works correctly? * Give it some values and see whether it gives the right result. * Also check that it has the correct running time. * For each array size between 0 and 32 * Fill the array with the values 0, 2, .... 2*(size-1) * Search for the values -1 to 2*size-1 // Every value in the array and every value that falls between two neighbors * Verify the return value * This much testing gives you more assurance Observation: When you are using a structure (in this case, an array) in a certain way, generalize to an ADT. Challenge: Let's design an ADT for "BinarySearchable" * Philosophy: Things which we can binary search. * Methods * getMiddle() * getSmallerThanMiddle() * getLargerThanMiddle() * Applications: Binary search private Object binarySearch(BinarySearchable bs, Object findMe, Comparator c) { Object mid = bs.getMiddle(); if (c.compare(findMe, mid) < 0) return binarySearch(bs.getSmallerThanMiddle(), findMe, c) else if (c.compare(findMe, mid) > 0) return binarySearch(bs.getLargerThanMiddle(), findMe, c) else return mid; } // binarySearch(Object[], Object) * Implementations * As a sorted array with a lb and ub * Why choose this design of binary search rather than the earlier design? * We can choose a different implementation of the things we search * We get rid of lb and ub in the binary search code, which makes it clearer /Implementing Dictionaries with BinarySearchable/ /adfasddf asdfasdfa/ * Need to think about find, add, and remove * find is straightforward: * Store key/value pairs in the BinarySearchable * Search by key * Return the value part of the pair * If we use sorted arrays, add and remove are probably O(n) /Alternate Implementation/ * Use trees! Divide and conquer the data structure. * The root is the middle element * The left subtree contains the smaller elements * The right subtree contains the larger elements * The subtrees are organized in the same way * Addition: Follow the tree down and add when you run off the tree * Removal: Shove the rightmost thing in the left subtree (or the leftmost thing in the right subtree) in the place of the deleted element * Problem: Keeping it balanced: * Wait until 301 * When you notice severe imbalance, spend the O(n) time to rebalance * Running time * find: O(logn) * add: O(logn) * remove: O(logn) * Significantly better than a sorted array for add and remove; comparable for find What's the difference between a binary search tree and a heap? * Root value * Heaps put the "smallest" value at the root * Binary search trees put the "middle" value at the root * Structure * Heaps are guaranteed to be nearly-complete * Binary search trees are "hopefully balanced" Can we do better than balanced binary search trees? * Yes, if we rely on luck in different ways * Wednesday's topic * find: O(1) * add: O(1)