Laboratory: Merge Sort

Summary: In this laboratory, we consider merge sort, a more efficient technique for sorting lists of values.

Exercise 0: Preparation

Make a copy of mergesort-lab.scm, the code for this lab.

Exercises

Exercise 1: Merging

a. Write an expression to merge the lists (1 2 3) and (1 1.5 2.3).

b. Write an expression to merge two identical lists of numbers. For example, you might merge the lists (1 2 3 5 8 13 21) and (1 2 3 5 8 13 21)

c. Write an expression to merge two lists of strings. (You may choose the strings yourself. Each list should have at least three elements.)

Exercise 2: Reflecting on Merging

a. What will happen if you call merge with unsorted lists as the two list parameters?

b. Check your answer by experimentation. To help you understand what is happening, you may wish to modify merge so that it displays the values of sorted1 and sorted2.

c. What will happen if you call merge with sorted lists of very different lengths as the first two parameters?

d. Check your answer by experimentation.

Exercise 3: Splitting

Use split to split:

a. A list of numbers of length 6

b. A list of numbers of length 5

c. A list of strings of length 6

Exercise 4: Sorting

a. Run merge-sort on a list you design of fifteen integers.

b. Run new-merge-sort on a list you design of ten strings.

c. Add the following lines to the repeat-merge helper in new-merge-sort. (Add them directly after the line that contains lambda.)

                 (display list-of-lists) (newline)

d. What output do you expect to get if you rerun new-merge-sort on the list from step b?

e. Check your answer experimentally.

f. Rerun new-merge-sort on a list of twenty integers.

Exercise 5: Special Cases

As we've seen, in exploring any algorithm, it's a good idea to check a few special cases that might cause the algorithm difficulty. Here are some to start with.

a. Run both versions of merge sort on the empty list.

b. Run both versions of merge sort on a one-element list.

c. Run both versions of merge sort on a list with duplicate elements.

Exercise 6: Steps in Merge Sort

We've claimed that merge sort takes approximately nlog₂n steps. Let's explore that claim experimentally. We'll focus on the number of calls to may-precede?.

a. Update the definitions of merge, split, merge-sort and new-merge-sort to use define$ rather than define.

b. Using analyze, count the number of calls to may-precede? in sorting a few lists of size 8, 16, 32, and 64. (Try a few lists of each size. You may find it helpful to use random-numbers to generate the lists.)

c. Are the number of calls to may-precede? similar or different for the same size list? What explains the similarities or differences?

d. Does the running time seem to grow slower than n²? (In such functions, when you double the input size, you should quadruple the number of steps.)

Exercise 7: Is It Sorted?

As you've probably noticed, there are two key postconditions of a procedure that sorts lists: The result is a permutation of the original list and the result is sorted.

We're fortunate that the unit test framework lets us test permutations (with test-permutation!). Hence, if we wanted to test merge sort in the unit test framework, we might write

(define some-list ...)
(test-permutation! (merge-sort some-list pred?) some-list)

We need a way to make sure that the result is sorted, particularly if the result is very long.

Write a procedure, (sorted? lst may-precede?) that checks whether or not lst is sorted by may-precede?.

For example,

> (sorted? (list 1 3 5 7 9) <=)
#t
> (sorted? (list 1 3 5 4 7 9) <=)
#f
> (sorted? (list "alpha" "beta" "gamma") string-ci<=?)

Note that we can use that procedure in a test suite for merge sort with

(test! (sorted? (merge-sort some-list may-precede?) may-precede?) #t)

For Those with Extra Time

Extra 1: More Complex Merges

Assume that we represent names as lists of the form (last-name first-name). Write an expression to merge the following two lists

(define mathstats-faculty
  (list (list "Chamberland" "Marc")
        (list "French" "Chris")
        (list "Jager" "Leah")
        (list "Kuiper" "Shonda")
        (list "Moore" "Emily")
        (list "Moore" "Tom")
        (list "Mosley" "Holly")
        (list "Romano" "David")
        (list "Shuman" "Karen")
        (list "Wolf" "Royce")))

(define more-faculty
  (list (list "Moore" "Chuck")
        (list "Moore" "Ed")
        (list "Moore" "Gordon")
        (list "Moore" "Roger")))

Extra 2: Splitting, Revisited

Some computer scientists prefer to define split something like the following.

(define split
  (lambda (ls)
    (let kernel ((rest ls)
                 (left null)
                 (right null))
      (if (null? rest)
          (list left right)
          (kernel (cdr rest) (cons (car rest) right) left)))))

a. How does this procedure split the list?

b. Why might you prefer one version of split over the other?

Extra 3: Checking Permutations

We've written a procedure that checks whether a list is sorted, which is one of the postconditions of a list-based sorting routine. However, we have not yet written a procedure to check whether two lists are permutations of each other.

Write a procedure, (permutation? lst1 lst2), that determines if lst2 is a permutation of lst1.

You might find the following strategy, which involves iterating through the first list, an appropriate strategy for testing for permutations.

Base case: If both lists are empty, the two lists are permutations of each other.

Base case: If the first list is empty and the second is not (i.e., it's a pair), the two lists are not permutations of each other.

Base case: If the first list is nonempty and car of the first list is not contained in the second, the two lists are not permutations of each other.

Recursive case: If the first list is not empty and the first element of the first list is in the second list, then the two lists are permutations of each other only if the cdr of the first list is a permutation of what you get by removing the car of the first list from the second list.

You may also find the following two procedures useful.

;;; Procedure:
;;;   list-contains?
;;; Parameters:
;;;   lst, a list
;;;   val, a value
;;; Purpose:
;;;   Determines if lst contains val.
;;; Produces:
;;;   contained?, a Boolean
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   If there is an i such that (list-ref lst i) equals val,
;;;     then contained? is true (#t).
;;;   Otherwise,
;;;     contained? is false.
(define list-contains?
  (lambda (lst val)
    (and (not (null? lst))
         (or (equal? (car lst) val)
             (list-contains? (cdr lst) val)))))

;;; Procedure:
;;;   list-remove-one
;;; Parameters:
;;;   lst, a list
;;;   val, a value
;;; Purpose:
;;;   Remove one copy of val from lst.
;;; Produces:
;;;   newlst
;;; Preconditions:
;;;   lst contains at least one value equal to val.
;;; Postconditions:
;;;   (length newlst) = (length lst) - 1
;;;   lst is a permutation of (cons val newlst).
;;;   Ordering is preserved.  That is, if val1 and val2 appear in 
;;;     both lst and newlst, and val1 precedes val2 in lst, then 
;;;     val1 precedes val2 in newlst.
(define list-remove-one
  (lambda (lst val)
    (cond
      ((null? lst) 
       null)
      ((equal? val (car lst)) 
       (cdr lst))
      (else 
       (cons (car lst) (list-remove-one (cdr lst) val))))))