- Held: Wednesday, 24 April 2019
- Writeup due: Friday, 26 April 2019
- Summary: In this laboratory, we explore different issues related to searching.

Make a copy of **binary-search-lab.rkt**,
the code for this lab.

It is often useful when exploring a recursive algorithm to observe the steps the algorithm performs. In Scheme, we can sometimes observe steps in recursive calls by using the debugger.

a. Add a breakpoint at the beginning of the kernel of binary
search (the `if` at the start of `search-portion`).

b. Redo steps b-h of Check 3 from the reading and report on the number of calls to the kernel completed for each search.

c. Revisit your answer to question i of Check 3 with your partner.

a. What do you expect binary search to do if there are entries with duplicate keys?

b. Create a copy of `simulated-students`, using a name like
`simulated-students-with-duplicates`.

c. Add two more entries with a key of `"Otto"` and two
more entries with a key of `"Amy"` to the new vector. These new
entries should be slightly different, so that you can tell them
apart. Make sure you keep your vector in sorted order.

d. Which of the three entries do you expect binary search to return if
you search for `"Otto"`?

e. Check your answer experimentally.

f. Which of the three entries do you expect binary search to return if
you search for `"Amy"`?

g. Check your answer experimentally.

h. What does your experience in this exercise suggest about what binary search will do when several entries share the same key?

a. As you may have observed, `simulated-students-by-id` contains the same
entries as `simulated-students`, but with the students organized by
their id, rather than by name.

Write an expression to find a student with an id of 1658200.

```
> (binary-search simulated-students-by-id ___ ___ 1658200)
```

b. As you may have observed, `simulated-students-by-year` contains the same
entries as `simulated-students`, but with the students organized by
their year, rather than by name.

Write an expression to find a student with a graduation year of 2020.

```
> (binary-search simulated-students-by-year ___ ___ 2020)
```

c. Write an expression to find a student with a graduation year of 2019.

It is sometimes useful to learn not just that something is not in
the vector, but where it would fall if it were in the vector. Write
documentation and code for `new-binary-search` that mostly behaves like
`binary-search`, except that it returns a “half value” if the value
being searched for belongs between two neighboring values. For example,
if the key being searched for is larger than the key at position 5 and
smaller than the key at position 6, you should return 5.5. Similarly,
if the key being searched for is smaller than the key at position 0,
you should return -0.5. If the key being searched for is bigger than
the largest key, return `(- (vector-length vec) 0.5)`.

For example,

```
> (new-binary-search simulated-students car string<=? "Andy")
0.5
> (new-binary-search simulated-students car string<=? "Greg")
7
> (new-binary-search simulated-students car string<=? "Heather")
8
> (new-binary-search simulated-students car string<=? "Hanna")
7.5
```

You should use the code for `binary-search` as your starting point.

*Note:* If you find it more natural to represent the indices as exact
numbers, rather than inexact numbers, you should feel free to do so.

For example,

```
> (new-binary-search simulated-students car string<=? "Aa")
-1/2
> (new-binary-search simulated-students car string<=? "Andy")
1/2
```
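The half-value rule can be sketched as follows. This is only one possible shape, not the lab's `binary-search` or a reference solution: the name `half-value-search`, the sample vector `sample-entries`, and its two-element entries are hypothetical, and the parameter order simply mirrors the calls shown above. The sketch returns exact half values, as the note permits.

```racket
#lang racket

;;; Hypothetical sample data: entries sorted by name (the car of each entry).
(define sample-entries
  (vector (list "Amy" 2020) (list "Greg" 2019) (list "Otto" 2021)))

;;; (half-value-search vec get-key may-precede? key) -> exact rational
;;; Assumes vec is sorted by get-key under may-precede?.
;;; Returns the index of an entry with a matching key, or a half value
;;; marking the gap where the key would belong.
(define (half-value-search vec get-key may-precede? key)
  ;; Search positions lower..upper, inclusive.
  (let search-portion ([lower 0]
                       [upper (- (vector-length vec) 1)])
    (if (> lower upper)
        ;; Empty portion: everything before lower has a smaller key and
        ;; everything from lower on has a larger key.
        (- lower 1/2)
        (let* ([mid (quotient (+ lower upper) 2)]
               [mid-key (get-key (vector-ref vec mid))]
               [key-first? (may-precede? key mid-key)]
               [mid-first? (may-precede? mid-key key)])
          (cond
            ;; Each may precede the other, so the keys are equal.
            [(and key-first? mid-first?) mid]
            ;; The key belongs strictly to the left of mid.
            [key-first? (search-portion lower (- mid 1))]
            ;; The key belongs strictly to the right of mid.
            [else (search-portion (+ mid 1) upper)])))))
```

With this hypothetical data, `(half-value-search sample-entries car string<=? "Bob")` yields `1/2`, and a key larger than every name yields `5/2`, which is `(- (vector-length vec) 1/2)`.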

Here are commands to search `simulated-students` for `"Paula"` and
`simulated-students-by-id` for the entry with id 1658200:

```
> (binary-search simulated-students car string<=? "Paula")
> (binary-search simulated-students-by-id caddr <= 1658200)
```

a. What do you expect to happen if we swap the vectors in these commands, as in the following?

```
> (binary-search simulated-students-by-id car string<=? "Paula")
> (binary-search simulated-students caddr <= 1658200)
```

b. Check your answer experimentally.

c. Try searching `simulated-students-by-id` for the names
`"Otto"`, `"Erin"`, `"Fred"`, `"Charlotte"`, `"Janet"`, and `"Xerxes"`.

d. What do the previous steps of this experiment tell you about binary search?

As you may recall from Exercise 3, when searching for students by year, binary search returned only one of the several entries with a particular graduation year. In some cases, it’s useful to find not just one student with that graduation year, but how many students have that graduation year, or how many have that graduation year or an earlier one.

a. One technique for finding out how many students have a certain
graduation year or less is to step through the vector until we find the
first student with a graduation year greater than the desired year. Using
this technique, write a procedure,
`(graduates-up-to students-by-year year)`,
that, given a vector of students sorted by graduation year, finds
the number of students who will graduate before or in the given year.

b. A more efficient way to find the position is to use a variant of
binary search. Once again, our goal is to find the first object with a
year greater than the desired year. However, this time we use binary
search to find that object. That is, you look in the middle. If the
middle object has a later year than the desired year, recurse on the left
half, but also include the middle element, since that may be the one we’re
looking for. If the middle object does not have a later year than the
desired year, recurse on the right half. Write a new version of
`graduates-up-to` that uses this technique.

*Note:* For part b, you cannot call the `binary-search` procedure
directly. Rather, you will need to use it as
a *template* for your code. That is, copy the procedure and then make
modifications as appropriate.
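The technique in part b can be sketched as follows. This is a minimal illustration under stated assumptions rather than the lab's intended solution: the name `graduates-up-to-sketch`, the accessor `entry-year`, and the two-element sample entries are all hypothetical, and the real entries in the lab file may store the year in a different position.

```racket
#lang racket

;;; Hypothetical entry layout: (name year). The lab's entries may differ.
(define (entry-year entry)
  (cadr entry))

;;; Hypothetical sample data, sorted by graduation year.
(define sample-by-year
  (vector (list "Amy" 2019) (list "Bo" 2019) (list "Cy" 2020)
          (list "Di" 2021) (list "Ed" 2021)))

;;; (graduates-up-to-sketch students-by-year year) -> exact nonnegative integer
;;; Counts the entries whose year is at most the given year by locating
;;; the first entry with a later year.
(define (graduates-up-to-sketch students-by-year year)
  ;; Invariant: every entry before lower has a year <= the given year;
  ;; every entry at or after upper has a later year. upper starts at the
  ;; vector length, which stands for "no entry has a later year".
  (let search-portion ([lower 0]
                       [upper (vector-length students-by-year)])
    (if (= lower upper)
        ;; lower is the position of the first later entry, which equals
        ;; the number of entries that come before it.
        lower
        (let ([mid (quotient (+ lower upper) 2)])
          (if (> (entry-year (vector-ref students-by-year mid)) year)
              ;; mid has a later year: keep it as a candidate, go left.
              (search-portion lower mid)
              ;; mid is not later: the answer lies to its right.
              (search-portion (+ mid 1) upper))))))
```

With the hypothetical data above, `(graduates-up-to-sketch sample-by-year 2019)` yields `2` and `(graduates-up-to-sketch sample-by-year 2020)` yields `3`. Keeping `mid` in the left portion is what distinguishes this from the usual binary search, which discards the middle element once it has been examined.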

As you’ve observed, when a key is repeated, binary search picks
one value with that key, but not necessarily the first value with
that key. We might want to write a variant, `newer-binary-search`,
that uses the ideas of binary search to find the *first* element in the
vector that contains the given key.

a. One strategy for implementing `newer-binary-search` is to find some
value with the given key (which binary search already does) and then to
step left in the vector until you find the first value with that key.
Implement `newer-binary-search` using that strategy.

b. Of course, we use the binary search technique so that we don’t have
to step through elements one-by-one. Rewrite `newer-binary-search`
so that it continues to “divide and conquer” in its attempt to find the
first element with that key.
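One way the divide-and-conquer idea in part b might look is sketched below. The name `first-match-search`, the sample vector, and the choice to return `-1` when no entry matches are assumptions made for illustration; the lab's `binary-search` may signal a missing key differently.

```racket
#lang racket

;;; Hypothetical sample data with repeated keys, sorted by name.
(define sample-with-duplicates
  (vector (list "Amy" 1) (list "Amy" 2) (list "Amy" 3)
          (list "Greg" 4) (list "Otto" 5) (list "Otto" 6)))

;;; (first-match-search vec get-key may-precede? key) -> integer
;;; Assumes vec is sorted by get-key under may-precede?.
;;; Returns the index of the first entry with the given key,
;;; or -1 (an assumed convention) if no entry has that key.
(define (first-match-search vec get-key may-precede? key)
  ;; Invariant: every entry before lower has a key strictly smaller than
  ;; key; every entry at or after upper has a key that key may precede.
  (let search-portion ([lower 0]
                       [upper (vector-length vec)])
    (if (= lower upper)
        ;; lower is the first position whose key is not smaller than key.
        ;; It is a match only if that entry's key may also precede key.
        (if (and (< lower (vector-length vec))
                 (may-precede? (get-key (vector-ref vec lower)) key))
            lower
            -1)
        (let ([mid (quotient (+ lower upper) 2)])
          (if (may-precede? key (get-key (vector-ref vec mid)))
              ;; mid could be the first match: keep it and go left.
              (search-portion lower mid)
              ;; mid's key is strictly smaller: go right.
              (search-portion (+ mid 1) upper))))))
```

With the hypothetical data above, `(first-match-search sample-with-duplicates car string<=? "Otto")` yields `4`, the leftmost of the two `"Otto"` entries, without stepping through the duplicates one by one.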