Lab: Binary search

Held: Wednesday, 24 April 2019
Writeup due: Friday, 26 April 2019
Summary: In this laboratory, we explore different issues related to searching.

Exercise 0: Preparation

Make a copy of binary-search-lab.rkt, the code for this lab.

Exercises

Exercise 1: Counting recursive calls

It is often useful when exploring a recursive algorithm to observe the steps the algorithm performs. In Scheme, we can sometimes observe steps in recursive calls by using the debugger.

a. Add a breakpoint at the beginning of the kernel of binary search (the if at the start of search-portion).

b. Redo steps b-h of Check 3 from the reading and report on the number of calls to the kernel completed for each search.

c. Revisit your answer to question i of Check 3 with your partner.
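
The debugger is one way to count kernel calls; another is to have the kernel carry its own counter. The following is a minimal sketch of that idea, not the code from binary-search-lab.rkt. The names counted-binary-search and sample are hypothetical, the parameter order (vec get-key may-precede? key) is inferred from the calls shown later in this lab, and returning -1 for a failed search is an assumption.

```racket
#lang racket

;; A sketch of a binary search that reports (list result kernel-calls).
;; Assumptions: parameters are (vec get-key may-precede? key), and a
;; failed search is reported as -1.
(define counted-binary-search
  (lambda (vec get-key may-precede? key)
    (let search-portion ([lower-bound 0]
                         [upper-bound (- (vector-length vec) 1)]
                         [calls 1])
      (if (> lower-bound upper-bound)
          ; Empty portion: the key is not in the vector.
          (list -1 calls)
          (let* ([midpoint (quotient (+ lower-bound upper-bound) 2)]
                 [middle-key (get-key (vector-ref vec midpoint))]
                 [left? (may-precede? key middle-key)]
                 [right? (may-precede? middle-key key)])
            (cond
              ; The keys are equal: report the position and the count.
              [(and left? right?)
               (list midpoint calls)]
              ; The key belongs in the left half.
              [left?
               (search-portion lower-bound (- midpoint 1) (+ calls 1))]
              ; The key belongs in the right half.
              [else
               (search-portion (+ midpoint 1) upper-bound (+ calls 1))]))))))

;; Hypothetical data: a sorted vector of numbers, each serving as its own key.
(define sample (vector 10 20 30 40 50 60 70))

(counted-binary-search sample identity <= 40) ; '(3 1), found on the first call
(counted-binary-search sample identity <= 10) ; '(0 3)
(counted-binary-search sample identity <= 45) ; '(-1 4), absent
```

Note that the count for a failed search includes the final call on the empty portion, which is also what a breakpoint on the if would show.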

Exercise 2: Duplicate keys

a. What do you expect binary search to do if there are entries with duplicate keys?

b. Create a copy of simulated-students, using a name like simulated-students-with-duplicates.

c. Add two more entries with a key of "Otto" and two more entries with a key of "Amy" to the new vector. These new entries should be slightly different, so that you can tell them apart. Make sure you keep your vector in sorted order.

d. Which of the three entries do you expect binary search to return if you search for "Otto"?

e. Check your answer experimentally.

f. Which of the three entries do you expect binary search to return if you search for "Amy"?

g. Check your answer experimentally.

h. What does your experience in this exercise suggest about what binary search will do when several entries share the same key?

Exercise 3: Searching by other values

a. As you may have observed, simulated-students-by-id contains the same entries as simulated-students, but with the students organized by their id, rather than by name.

Write an expression to find a student with an id of 1658200.

> (binary-search simulated-students-by-id ___ ___ 1658200)

b. As you may have observed, simulated-students-by-year contains the same entries as simulated-students, but with the students organized by their year, rather than by name.

Write an expression to find a student with a graduation year of 2020.

> (binary-search simulated-students-by-year ___ ___ 2020)

c. Write an expression to find a student with a graduation year of 2019.

Exercise 4: Binary search, revisited

It is sometimes useful to learn not just that something is not in the vector, but where it would fall if it were in the vector. Write documentation and code for new-binary-search that mostly behaves like binary-search, except that it returns a “half value” if the value being searched for belongs between two neighboring values. For example, if the key being searched for is larger than the key at position 5 and smaller than the key at position 6, you should return 5.5. Similarly, if the key being searched for is smaller than the key at position 0, you should return -0.5. If the key being searched for is bigger than the largest key, return (- (vector-length vec) 0.5).

For example,

> (new-binary-search simulated-students car string<=? "Andy")
0.5
> (new-binary-search simulated-students car string<=? "Greg")
7
> (new-binary-search simulated-students car string<=? "Heather")
8
> (new-binary-search simulated-students car string<=? "Hanna")
7.5

You should use the code for binary-search as your starting point.

Note: If you find it more natural to represent the indices as exact numbers, rather than inexact numbers, you should feel free to do so.

For example,

> (new-binary-search simulated-students car string<=? "Aa")
-1/2
> (new-binary-search simulated-students car string<=? "Andy")
1/2
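
One way to think about the half value: when the search fails, the bounds have just crossed, so lower-bound is exactly one more than upper-bound, and the key belongs between those two positions. The following sketch builds on that observation. It is one possible approach rather than the intended solution, it assumes the same parameter order as the calls above, and it uses a hypothetical vector of numbers (sample) rather than simulated-students.

```racket
#lang racket

;; A sketch of new-binary-search: like binary search, but a failed
;; search yields the "half value" between the two neighboring positions.
;; Assumption: parameters are (vec get-key may-precede? key).
(define new-binary-search
  (lambda (vec get-key may-precede? key)
    (let search-portion ([lower-bound 0]
                         [upper-bound (- (vector-length vec) 1)])
      (if (> lower-bound upper-bound)
          ; The bounds have crossed, so lower-bound = upper-bound + 1
          ; and the key belongs halfway between them.
          (+ upper-bound 0.5)
          (let* ([midpoint (quotient (+ lower-bound upper-bound) 2)]
                 [middle-key (get-key (vector-ref vec midpoint))]
                 [left? (may-precede? key middle-key)]
                 [right? (may-precede? middle-key key)])
            (cond
              [(and left? right?) midpoint]
              [left? (search-portion lower-bound (- midpoint 1))]
              [else (search-portion (+ midpoint 1) upper-bound)]))))))

;; Hypothetical data: a sorted vector of numbers, each serving as its own key.
(define sample (vector 10 20 30 40))

(new-binary-search sample identity <= 30) ; 2, found
(new-binary-search sample identity <= 25) ; 1.5, between positions 1 and 2
(new-binary-search sample identity <= 5)  ; -0.5, before the first element
(new-binary-search sample identity <= 50) ; 3.5, after the last element
```

Replacing 0.5 with 1/2 gives the exact variant mentioned in the note.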

Exercise 5: Searching alternate vectors

Here are commands to search simulated-students for "Paula" and simulated-students-by-id for the student with id 1658200.

> (binary-search simulated-students car string<=? "Paula")
> (binary-search simulated-students-by-id caddr <= 1658200)

a. What do you expect to have happen if we swap the vectors in these commands, as in the following?

> (binary-search simulated-students-by-id car string<=? "Paula")
> (binary-search simulated-students caddr <= 1658200)

b. Check your answer experimentally.

c. Try searching simulated-students-by-id for the names "Otto", "Erin", "Fred", "Charlotte", "Janet", and "Xerxes".

d. What do the previous steps of this experiment tell you about binary search?

For those with extra time

Extra 1: Counting year

As you may recall from Exercise 3, when searching for students by year, we got only one of possibly many students with a particular graduation year. In some cases, it’s useful to find not just one student with that graduation year, but how many students have that graduation year, or how many have that graduation year or an earlier one.

a. One technique for finding out how many students have a certain graduation year or an earlier one is to step through the vector until we find the first student with a graduation year greater than the desired year. Using this technique, write a procedure, (graduates-up-to students-by-year year), that, given a vector of students sorted by graduation year, finds the number of students who will graduate in or before the given year.

b. A more efficient way to find the position is to use a variant of binary search. Once again, our goal is to find the first object with a year greater than the desired year. However, this time we use binary search to find that object. That is, you look in the middle. If the middle object is a later year than the desired year, recurse on the left half, but also include the middle element since that may be the one we’re looking for. If the middle object is not later than the desired year, recurse on the right half. Write a new version of graduates-up-to that uses this technique.

Note: For part b, you cannot call the binary-search procedure directly. Rather, you will need to use it as a template for your code. That is, copy the procedure and then make modifications as appropriate.
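
The two techniques can be compared side by side. The sketch below is one possible reading of parts a and b, not a reference solution. The record layout (name year), the helper get-year, the vector sample-by-year, and the suffixed names graduates-up-to/linear and graduates-up-to/binary are all hypothetical; the entries in the lab's vectors are laid out differently.

```racket
#lang racket

;; Hypothetical record layout for this sketch: (name year).
(define get-year cadr)

;; Part a: step through the vector until a year exceeds the target.
;; The position where we stop is the number of earlier-or-equal students.
(define graduates-up-to/linear
  (lambda (students-by-year year)
    (let kernel ([pos 0])
      (if (or (>= pos (vector-length students-by-year))
              (> (get-year (vector-ref students-by-year pos)) year))
          pos
          (kernel (+ pos 1))))))

;; Part b: binary search for the first position whose year exceeds the
;; target. Candidate positions run from lower to upper, where upper may
;; equal the vector length (meaning "no student graduates later").
(define graduates-up-to/binary
  (lambda (students-by-year year)
    (let search-portion ([lower 0]
                         [upper (vector-length students-by-year)])
      (if (= lower upper)
          lower
          (let ([midpoint (quotient (+ lower upper) 2)])
            (if (> (get-year (vector-ref students-by-year midpoint)) year)
                ; Middle is later: keep it as a candidate, drop the right half.
                (search-portion lower midpoint)
                ; Middle is not later: the answer lies strictly to the right.
                (search-portion (+ midpoint 1) upper)))))))

;; Hypothetical data, sorted by year.
(define sample-by-year
  (vector (list "A" 2019) (list "B" 2019) (list "C" 2020) (list "D" 2020)
          (list "E" 2020) (list "F" 2021) (list "G" 2022)))

(graduates-up-to/linear sample-by-year 2020) ; 5
(graduates-up-to/binary sample-by-year 2020) ; 5
(graduates-up-to/binary sample-by-year 2018) ; 0
(graduates-up-to/binary sample-by-year 2022) ; 7
```

Because the binary version keeps the middle element when it is later than the desired year, the portion shrinks on every call and the search ends when only one candidate position remains.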

Extra 2: Finding the first element with a particular key

As you’ve observed, when a key is repeated, binary search picks one value with that key, but not necessarily the first value with that key. We might want to write a variant, newer-binary-search, that uses the ideas of binary search to find the first element in the vector that contains the given key.

a. One strategy for implementing newer-binary-search is to find some value with the given key (which binary search already does) and then to step left in the vector until you find the first value with that key. Implement newer-binary-search using that strategy.

b. Of course, we use the binary search technique so that we don’t have to step through elements one by one. Rewrite newer-binary-search so that it continues to “divide and conquer” in its attempt to find the first element with that key.
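
For part b, the idea is to keep searching to the left even after a matching key turns up, since an earlier match may still exist. Here is a minimal sketch of that strategy, offered as one possibility rather than the expected answer. It assumes the same parameter order as binary-search, assumes that a failed search should return -1, and uses a hypothetical vector (sample-with-duplicates) whose entries have the form (name tag).

```racket
#lang racket

;; A sketch of newer-binary-search: find the first position whose key
;; equals the given key, or -1 if there is none (the -1 is an assumption).
(define newer-binary-search
  (lambda (vec get-key may-precede? key)
    (let search-portion ([lower 0]
                         [upper (vector-length vec)])
      (if (= lower upper)
          ; lower is now the first position whose key is not smaller
          ; than key (or the vector length). Check for an actual match.
          (if (and (< lower (vector-length vec))
                   (may-precede? (get-key (vector-ref vec lower)) key))
              lower
              -1)
          (let* ([midpoint (quotient (+ lower upper) 2)]
                 [middle-key (get-key (vector-ref vec midpoint))])
            (if (may-precede? key middle-key)
                ; key <= middle-key: the first match is at midpoint or left of it.
                (search-portion lower midpoint)
                ; key > middle-key: the first match, if any, is to the right.
                (search-portion (+ midpoint 1) upper)))))))

;; Hypothetical data, sorted by name, with repeated keys.
(define sample-with-duplicates
  (vector (list "Amy" 1) (list "Amy" 2) (list "Amy" 3) (list "Bob" 1)
          (list "Otto" 1) (list "Otto" 2) (list "Otto" 3)))

(newer-binary-search sample-with-duplicates car string<=? "Otto") ; 4
(newer-binary-search sample-with-duplicates car string<=? "Amy")  ; 0
(newer-binary-search sample-with-duplicates car string<=? "Carl") ; -1
```

Unlike the strategy in part a, this version never steps left one element at a time, so a long run of equal keys does not slow it down.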