Merge sort
Summary: In this laboratory, we consider merge sort, a more efficient technique for sorting lists of values.
Exercise 0: Preparation
a. Make a copy of mergesort-lab.rkt, the code for this lab.
b. Discuss the self checks from the corresponding reading.
Exercises
Exercise 1: Reflections on merging
a. What will happen if you call merge with unsorted
lists as the two list parameters?
b. Check your answer by experimentation. To help you understand what is
happening, you may wish to modify merge so that it
displays the values of sorted1 and
sorted2.
c. What will happen if you call merge with sorted
lists of very different lengths as the first two parameters?
d. Check your answer by experimentation.
e. How many times will merge be called in evaluating
the following expression?
> (merge '(1 2 3 4) '(5 6 7 8) <=)
If you haven’t already added a line to display the steps, you should probably add one.
(define merge
  (lambda (sorted1 sorted2 may-precede?)
    (display (list 'merge sorted1 sorted2)) (newline)
    (cond
      ...)))
f. How many times will merge be called in evaluating
the following expression?
> (merge '(1 3 5 7) '(2 4 6 8) <=)
Exercise 2: Sorting
a. Uncomment the following line in the repeat-merge
helper in new-merge-sort.
(write list-of-lists) (newline)
b. What output do you expect to get if you run your updated
new-merge-sort on the list from the reading’s self-check 3, step b?
c. Check your answer experimentally.
d. Rerun new-merge-sort on a list of twenty integers.
Exercise 3: Special Cases
As we’ve seen, in exploring any algorithm, it’s a good idea to check a few special cases that might cause the algorithm difficulty. Here are some to start with.
a. Run both versions of merge sort on the empty list.
b. Run both versions of merge sort on a one-element list.
c. Run both versions of merge sort on a list with duplicate elements.
Exercise 4: Steps in merge sort
We’ve claimed that merge sort takes approximately n*log_2(n) steps.
Let’s explore that claim experimentally. We’ll focus on the
number of calls to may-precede?.
a. Remind yourself of the uses of counter-new, counter-increment!,
counter-reset!, and counter-print, all of which appeared in the
reading on analyzing procedures.
b. Define a counter named may-precede-counter and
a counter called merge-counter.
c. Add the following function to your definitions pane.
(define int-may-precede?
  (lambda (left right)
    (counter-increment! may-precede-counter)
    (<= left right)))
d. Add the following line to the beginning of the
merge procedure.
(counter-increment! merge-counter)
e. Add the following helpful procedure to your definitions.
(define experiment
  (lambda (lst)
    (counter-reset! may-precede-counter)
    (counter-reset! merge-counter)
    (let ([result (merge-sort lst int-may-precede?)])
      (counter-print may-precede-counter)
      (counter-print merge-counter)
      result)))
f. Using these counters, count the number of calls to merge and
may-precede? in sorting a few lists of size 8, 16, 32, and 64.
(Try a few lists of each size. You should use random-numbers to
generate the lists.)
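If random-numbers is not at hand, a stand-in along the following lines can generate the test lists. This is only a sketch: random-ints and sample-lists are hypothetical names, and the lab's own random-numbers may take different parameters.

```racket
#lang racket

;; random-ints: a list of n random integers in the range [0, bound).
;; Hypothetical stand-in; the lab's random-numbers may differ.
(define random-ints
  (lambda (n bound)
    (build-list n (lambda (i) (random bound)))))

;; One random list for each size used in this exercise.
(define sample-lists
  (map (lambda (n) (random-ints n 1000))
       (list 8 16 32 64)))
```

Assuming experiment from step e is defined, (for-each experiment sample-lists) would then print the counters for one list of each size.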
g. Is the number of calls to merge similar or different for different
lists of the same size? Is the number of calls to may-precede? similar
or different for different lists of the same size? What explains the
similarities or differences?
h. Does the running time seem to grow faster than n? (For a procedure
whose running time grows like n, doubling the input size approximately
doubles the number of steps.) Does the running time seem to grow slower
than n*n? (For a procedure whose running time grows like n*n, doubling
the input size approximately quadruples the number of steps.) Which
does it seem closer to?
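For comparison with your measured counts, the following sketch tabulates n, n*log_2(n), and n*n for the four list sizes used above. The names exact-log2, growth-row, and growth-table are hypothetical helpers, not part of the lab code, and exact-log2 is only valid for powers of two.

```racket
#lang racket

;; exact-log2: exact base-2 logarithm of n.
;; Valid only when n is a power of two. (Hypothetical helper.)
(define exact-log2
  (lambda (n)
    (sub1 (integer-length n))))

;; growth-row: a list of n, n*log_2(n), and n*n.
(define growth-row
  (lambda (n)
    (list n (* n (exact-log2 n)) (* n n))))

;; One row for each list size used in this exercise.
(define growth-table
  (map growth-row (list 8 16 32 64)))

;; Print one row per line.
(for-each (lambda (row) (display row) (newline))
          growth-table)
```

The rows are (8 24 64), (16 64 256), (32 160 1024), and (64 384 4096). Each doubling of n multiplies the middle column by a bit more than two and the last column by exactly four, which gives you reference ratios for the counts you observe.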
Exercise 5: Is it sorted?
As you’ve probably noticed, there are two key postconditions of a procedure that sorts lists: The result is a permutation of the original list and the result is sorted.
Checking whether two lists are permutations is relatively difficult, so let’s focus on the other postcondition. We need a way to make sure that the result is sorted, particularly if the result is very long.
Write a procedure, (sorted? lst may-precede?), that checks
whether or not lst is sorted by may-precede?.
For example,
> (sorted? (list 1 3 5 7 9) <=)
#t
> (sorted? (list 1 3 5 4 7 9) <=)
#f
> (sorted? (list "alpha" "beta" "gamma") string-ci<=?)
#t
Note that we can use that procedure in a test suite for merge sort.
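One way such a test might be organized is sketched below. Both the sorting procedure and the sortedness checker are passed in as parameters, so the sketch does not assume any particular implementation of either. The names sorts-correctly? and all-sort-correctly? are hypothetical.

```racket
#lang racket

;; sorts-correctly?: does checker accept the result of
;; (sorter lst may-precede?) as sorted by may-precede? ?
;; Hypothetical helper; sorter and checker are supplied by the caller.
(define sorts-correctly?
  (lambda (sorter checker lst may-precede?)
    (checker (sorter lst may-precede?) may-precede?)))

;; all-sort-correctly?: run the check above on every list in lsts.
(define all-sort-correctly?
  (lambda (sorter checker lsts may-precede?)
    (andmap (lambda (lst)
              (sorts-correctly? sorter checker lst may-precede?))
            lsts)))
```

Once you have written sorted?, a call such as (all-sort-correctly? merge-sort sorted? (list null (list 1) (list 3 1 2) (list 2 2 1)) <=) should return #t. Note that this checks only the sortedness postcondition, not that the result is a permutation of the input.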
Exercise 6: More complex merges
Assume that we represent names as lists of the form '(last-name
first-name). Write an expression to merge the following two lists:
(define mathstats-faculty
  (list (list "Blanchard" "Jeff")
        (list "Chamberland" "Marc")
        (list "Fellers" "Pamela")
        (list "French" "Chris")
        (list "Jonkman" "Jeff")
        (list "Kuiper" "Shonda")
        (list "Mileti" "Joseph")
        (list "Moore" "Emily")
        (list "Moore" "Tom")
        (list "Olsen" "Chris")
        (list "Paulhus" "Jennifer")
        (list "Shuman" "Karen")
        (list "Wolf" "Royce")))
(define more-faculty
  (list (list "Moore" "Chuck")
        (list "Moore" "Ed")
        (list "Moore" "Gordon")
        (list "Moore" "Roger")))
For those with extra time
Extra 1: Splitting
Some computer scientists prefer to define split something
like the following.
(define split
  (lambda (ls)
    (let kernel ([rest ls]
                 [left null]
                 [right null])
      (if (null? rest)
          (list left right)
          (kernel (cdr rest) (cons (car rest) right) left)))))
a. How does this procedure split the list?
b. Why might you prefer one version of split over the other?
Extra 2: Fixing new-merge-sort
You may recall that new-merge-sort fails on the empty list. Update
the procedure so that it works correctly when given the empty list as
input.
Extra 3: Checking permutations
We’ve written a procedure that checks whether a list is sorted, which is one of the postconditions of a list-based sorting routine. However, we have not yet written a procedure to check whether two lists are permutations of each other.
Write a procedure, (permutation? lst1 lst2), that
determines if lst2 is a permutation of lst1.
The following strategy, which iterates through the first list, is one reasonable way to test for permutations.
- Base case: If both lists are empty, the two lists are permutations of each other.
- Base case: If the first list is empty and the second is not (i.e., it’s a pair), the two lists are not permutations of each other.
- Base case: If the first list is nonempty and car of the first list is not contained in the second, the two lists are not permutations of each other.
- Recursive case: If the first list is not empty and the first element of the first list is in the second list, then the two lists are permutations of each other only if the cdr of the first list is a permutation of what you get by removing the car of the first list from the second list.
Note that this process is relatively expensive, since it likely takes
about n steps to remove an element from a list, and we will be
removing n elements. However, we will only be using this procedure
for testing our results, so we will accept the potential inefficiency
for the time being.
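To make that estimate concrete, assume the worst case, in which each removal scans the entire remaining list, and let n be the length of the lists. The second list shrinks by one element after each removal, so the removal steps add up roughly as follows:

```latex
\[
  n + (n-1) + \cdots + 2 + 1 \;=\; \frac{n(n+1)}{2} \;\approx\; \frac{n^2}{2}
\]
```

Checking membership before each removal adds a comparable amount of work, so under this assumption the total still grows like n*n.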
In implementing the permutation? predicate, you may find the following
two procedures useful.
;;; Procedure:
;;;   list-contains?
;;; Parameters:
;;;   lst, a list
;;;   val, a value
;;; Purpose:
;;;   Determines if lst contains val.
;;; Produces:
;;;   contained?, a Boolean
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   If there is an i such that (list-ref lst i) equals val,
;;;     then contained? is true (#t).
;;;   Otherwise,
;;;     contained? is false.
(define list-contains?
  (lambda (lst val)
    (and (not (null? lst))
         (or (equal? (car lst) val)
             (list-contains? (cdr lst) val)))))
;;; Procedure:
;;;   list-remove-one
;;; Parameters:
;;;   lst, a list
;;;   val, a value
;;; Purpose:
;;;   Remove one copy of val from lst.
;;; Produces:
;;;   newlst
;;; Preconditions:
;;;   lst contains at least one value equal to val.
;;; Postconditions:
;;;   * (length newlst) = (length lst) - 1
;;;   * lst is a permutation of (cons val newlst).
;;;   * Ordering is preserved. That is, if val1 and val2 appear in
;;;     both lst and newlst, and val1 precedes val2 in lst, then
;;;     val1 precedes val2 in newlst.
(define list-remove-one
  (lambda (lst val)
    (cond
      [(null? lst)
       null]
      [(equal? val (car lst))
       (cdr lst)]
      [else
       (cons (car lst)
             (list-remove-one (cdr lst) val))])))