
Lab: Merge sort

Held: Friday, 3 May 2019
Writeup due: Monday, 6 May 2019
Summary: In this laboratory, we consider merge sort, a more efficient technique for sorting lists of values.

Exercise 0: Preparation

a. Make a copy of mergesort-lab.rkt, the code for this lab.

b. Discuss the self checks from the corresponding reading.

Exercises

Exercise 1: Reflections on merging

a. What will happen if you call merge with unsorted lists as the two list parameters?

b. Check your answer by experimentation. To help you understand what is happening, you may wish to modify merge so that it displays the values of sorted1 and sorted2.

c. What will happen if you call merge with sorted lists of very different lengths as the first two parameters?

d. Check your answer by experimentation.

e. How many times will merge be called in evaluating the following expression?

> (merge '(1 2 3 4) '(5 6 7 8) <=)

If you haven’t already added a line to display the steps, you should probably add one.

(define merge
  (lambda (sorted1 sorted2 may-precede?)
    (display (list 'merge sorted1 sorted2)) (newline)
    (cond
      ...)))

f. How many times will merge be called in evaluating the following expression?

> (merge '(1 3 5 7) '(2 4 6 8) <=)

Exercise 2: Sorting

a. Uncomment the following line in the repeat-merge helper in new-merge-sort.

                 (write list-of-lists) (newline)

b. What output do you expect to get if you run your updated new-merge-sort on the list from the reading’s self-check 3, step b?

c. Check your answer experimentally.

d. Rerun new-merge-sort on a list of twenty integers.

Exercise 3: Special Cases

As we’ve seen, in exploring any algorithm, it’s a good idea to check a few special cases that might cause the algorithm difficulty. Here are some to start with.

a. Run both versions of merge sort on the empty list.

b. Run both versions of merge sort on a one-element list.

c. Run both versions of merge sort on a list with duplicate elements.

Exercise 4: Steps in merge sort

We’ve claimed that merge sort takes approximately n*log_2(n) steps. Let’s explore that claim experimentally. We’ll focus on the number of calls to may-precede?.

a. Remind yourself of the uses of counter-new, counter-increment!, counter-reset!, and counter-print, all of which appeared in the reading on analyzing procedures.

b. Define a counter named may-precede-counter and a counter called merge-counter.

c. Add the following function to your definitions pane.

(define int-may-precede?
  (lambda (left right)
    (counter-increment! may-precede-counter)
    (<= left right)))

d. Add the following line at the beginning of the body of the merge procedure, just inside the lambda.

  (counter-increment! merge-counter)

e. Add the following helpful procedure to your definitions.

(define experiment
  (lambda (lst)
    (counter-reset! may-precede-counter)
    (counter-reset! merge-counter)
    (let ([result (merge-sort lst int-may-precede?)])
      (counter-print may-precede-counter)
      (counter-print merge-counter)
      result)))

f. Using these counters, count the number of calls to merge and may-precede? in sorting a few lists of size 8, 16, 32, and 64. (Try a few lists of each size. You should use random-numbers to generate the lists.)
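The lab's random-numbers procedure is the intended way to build these lists. If it is not available, the following minimal stand-in, built only from Racket's build-list and random, can generate inputs for the experiment. The name random-int-list and its two parameters are hypothetical; they are not part of the lab code.

```racket
#lang racket

;;; random-int-list is a hypothetical stand-in, not part of mergesort-lab.rkt.
;;; It builds a list of len integers, each in the range [0, bound).
(define random-int-list
  (lambda (len bound)
    (build-list len (lambda (i) (random bound)))))

;; One list for each of the four sizes used in the exercise,
;; with values below 1000.
(define sample-lists
  (map (lambda (len) (random-int-list len 1000))
       (list 8 16 32 64)))
```

With experiment defined as above, (experiment (random-int-list 16 1000)) would then report the counts for one 16-element list.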

g. Is the number of calls to merge similar or different for different lists of the same size? Is the number of calls to may-precede? similar or different for different lists of the same size? What explains the similarities or differences?

h. Does the running time seem to grow faster than n? (For a function that grows like n, doubling the input size roughly doubles the number of steps.) Does the running time seem to grow more slowly than n*n? (For a function that grows like n*n, doubling the input size roughly quadruples the number of steps.) Which does it seem closer to?
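To make the doubling comparison concrete, it may help to tabulate what each growth rate predicts for the sizes used in this exercise. The following sketch prints one row per size, giving n, n*log_2(n), and n*n. The names growth-row and growth-table are hypothetical, and the numbers are predictions from the formulas, not measured counts.

```racket
#lang racket

;;; Predicted step counts for three growth rates at input size n:
;;; linear (n), n*log_2(n), and quadratic (n*n).
(define growth-row
  (lambda (n)
    (let ([log2-n (/ (log n) (log 2))])
      (list n
            (exact-round (* n log2-n))
            (* n n)))))

;; One row for each input size used in the exercise.
(define growth-table
  (map growth-row (list 8 16 32 64)))

;; Prints:
;; (8 24 64)
;; (16 64 256)
;; (32 160 1024)
;; (64 384 4096)
(for-each (lambda (row) (display row) (newline)) growth-table)
```

In the middle column, each doubling of n multiplies the prediction by a bit more than two (24, 64, 160, 384), which is the pattern to look for in the counts of calls to may-precede?.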

Exercise 5: Is it sorted?

As you’ve probably noticed, there are two key postconditions of a procedure that sorts lists: The result is a permutation of the original list and the result is sorted.

Checking whether two lists are permutations is relatively difficult. So let’s focus on the other postcondition. We need a way to make sure that the result is sorted, particularly if the result is very long.

Write a procedure, (sorted? lst may-precede?), that checks whether or not lst is sorted by may-precede?.

For example,

> (sorted? (list 1 3 5 7 9) <=)
#t
> (sorted? (list 1 3 5 4 7 9) <=)
#f
> (sorted? (list "alpha" "beta" "gamma") string-ci<=?)
#t

Exercise 6: More complex merges

Assume that we represent names as lists of the form '(last-name first-name). Write an expression to merge the following two lists:

(define mathstats-faculty
  (list (list "Blanchard" "Jeff")
        (list "Chamberland" "Marc")
        (list "Fellers" "Pamela")
        (list "French" "Chris")
        (list "Jonkman" "Jeff")
        (list "Kuiper" "Shonda")
        (list "Mileti" "Joseph")
        (list "Moore" "Emily")
        (list "Moore" "Tom")
        (list "Olsen" "Chris")
        (list "Paulhus" "Jennifer")
        (list "Shuman" "Karen")
        (list "Wolf" "Royce")))

(define more-faculty
  (list (list "Moore" "Chuck")
        (list "Moore" "Ed")
        (list "Moore" "Gordon")
        (list "Moore" "Roger")))

For those with extra time

Extra 1: Splitting

Some computer scientists prefer to define split something like the following.

(define split
  (lambda (ls)
    (let kernel ([rest ls]
                 [left null]
                 [right null])
      (if (null? rest)
          (list left right)
          (kernel (cdr rest) (cons (car rest) right) left)))))

a. How does this procedure split the list?

b. Why might you prefer one version of split over the other?

Extra 2: Fixing new-merge-sort

You may recall that new-merge-sort fails on the empty list. Update the procedure so that it works correctly when given the empty list as input.

Extra 3: Checking permutations

We’ve written a procedure that checks whether a list is sorted, which is one of the postconditions of a list-based sorting routine. However, we have not yet written a procedure to check whether two lists are permutations of each other.

Write a procedure, (permutation? lst1 lst2), that determines if lst2 is a permutation of lst1.

The following strategy, which iterates through the first list, is one reasonable way to test for permutations.

  • Base case: If both lists are empty, the two lists are permutations of each other.

  • Base case: If the first list is empty and the second is not (i.e., it’s a pair), the two lists are not permutations of each other.

  • Base case: If the first list is nonempty and the car of the first list is not contained in the second, the two lists are not permutations of each other.

  • Recursive case: If the first list is not empty and the first element of the first list is in the second list, then the two lists are permutations of each other only if the cdr of the first list is a permutation of what you get by removing the car of the first list from the second list.

Note that this process is relatively expensive, since it likely takes about n steps to remove an element from a list, and we will be removing n elements. However, we will only be using this procedure for testing our results, so we will accept the potential inefficiency for the time being.
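The cost estimate can be made a little more concrete. As a rough sketch, assume that removing an element from a list of length k takes at most k steps. The second list shrinks from n elements down to 1, so the worst case totals n + (n-1) + ... + 1 = n*(n+1)/2 steps. The procedure below only tabulates that sum; its name is hypothetical and it is not part of the lab code.

```racket
#lang racket

;;; worst-case-removal-steps is a hypothetical helper for illustration.
;;; It totals k for k = 1..n, assuming that a removal from a list of
;;; length k costs at most k steps.
(define worst-case-removal-steps
  (lambda (n)
    (apply + (range 1 (+ n 1)))))

;; Pairs of (n total) for the list sizes used earlier in the lab.
(define removal-table
  (map (lambda (n) (list n (worst-case-removal-steps n)))
       (list 8 16 32 64)))
```

For sizes 8, 16, 32, and 64 the totals are 36, 136, 528, and 2080, so each doubling of n roughly quadruples the work, which is the n*n pattern described in Exercise 4.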

In implementing the permutation? predicate, you may find the following two procedures useful.

;;; Procedure:
;;;   list-contains?
;;; Parameters:
;;;   lst, a list
;;;   val, a value
;;; Purpose:
;;;   Determines if lst contains val.
;;; Produces:
;;;   contained?, a Boolean
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   If there is an i such that (list-ref lst i) equals val,
;;;     then contained? is true (#t).
;;;   Otherwise,
;;;     contained? is false.
(define list-contains?
  (lambda (lst val)
    (and (not (null? lst))
         (or (equal? (car lst) val)
             (list-contains? (cdr lst) val)))))

;;; Procedure:
;;;   list-remove-one
;;; Parameters:
;;;   lst, a list
;;;   val, a value
;;; Purpose:
;;;   Remove one copy of val from lst.
;;; Produces:
;;;   newlst
;;; Preconditions:
;;;   lst contains at least one value equal to val.
;;; Postconditions:
;;;   * (length newlst) = (length lst) - 1
;;;   * lst is a permutation of (cons val newlst).
;;;   * Ordering is preserved.  That is, if val1 and val2 appear in 
;;;     both lst and newlst, and val1 precedes val2 in lst, then 
;;;     val1 precedes val2 in newlst.
(define list-remove-one
  (lambda (lst val)
    (cond
      [(null? lst) 
       null]
      [(equal? val (car lst)) 
       (cdr lst)]
      [else 
       (cons (car lst) 
             (list-remove-one (cdr lst) val))])))