Computer Science Fundamentals (CS153 2003S)

Quicksort

Summary: We consider Quicksort, another divide-and-conquer algorithm for sorting. Unlike merge sort, which is guaranteed to be in O(nlog2n), Quicksort is only likely to have running time of O(nlog2n).

Contents:

Introduction

As our lists and arrays get increasingly large, it becomse more and more important to use the fastest possible sorting algorithm. As we've seen in our study of binary search and merge sort, one algorithm design strategy that often helps in the creation of fast algorithms is that of divide-and-conquer, in which you divide the input into two halves and then work on each or both halves.

In the merge sort algorithm, we divided the list or array in a fairly simple way; we just made sure each half had about the same number of elements. However, there are certainly other ways we could divide our collection of elements. For example, we might divide them into the smaller elements and the larger elements. That idea is at the core of Quicksort.

Overview

How does splitting the input into large and small values help? If we sort the small values and the large values, we can simply join the sorted collections. For lists, we might write something like

(define quicksort
  (lambda (lst)
    (if (or (null? lst) (null? (cdr lst)))
        lst
        (let ((small ...)
              (large ...))
          (append (quicksort small) (quicksort large))))))

Splitting the Collection

How do we go about identifying the large and the small values? That's the key problem in keysort. You may find the answer somewhat surprising: we pick some value (which we call the pivot) from the collection and assume that it's the average value. That is, everything smaller than the pivot is small and everything larger than the pivot is large.

How do we select the pivot? It turns out that it works fine to select it randomly. On average, you'll select a pretty good pivot.

One potential problem is that you may never shrink the parameter for recursive calls (as is a requirement for recursive design). For example, if lst contains multiple copies of one value (and only multiple copies of that value), either small or large will always contain the same sequence of values. Hence, we really need to split lst into three parts: things smaller than the pivot, things equal to the pivot, and things greater than the pivot. We only recurse on the smaller and greater sublists.

Putting it All Together

For our full implementation, we need to consider not just the three-way split described above, but also appropriate generalization (e.g., taking the procedure to compare values as a parameter). We use two comparators so that we can split in three parts.

;;; Procedure:
;;;   quicksort
;;; Parameters:
;;;   lst, a list to sort
;;;   comes-before?, a binary predicate that compares values.
;;;   same?, a binary predicate that compares values.
;;; Purpose:
;;;   Sort stuff.
;;; Produces:
;;;   sorted-stuff, a sorted list
;;; Preconditions:
;;;   comes-before? can be applied to any two elements of stuff.
;;;   comes-before? represents a transitive operation.
;;;   same? can be applied to any two elements of stuff.
;;;   For any two values, a and b, exactly one of the following holds:
;;;    1. (comes-before? a b)
;;;    2. (same? a b)
;;;    3. (comes-before? b a)
;;; Postconditions:
;;;   sorted-stuff is sorted.  That is, any element may precede the 
;;;     subsequent element.  In Scheme, we'd say that
;;;       ((disjunction comes-before? same?) (list-ref sorted i) 
;;;                                          (list-ref sorted (+ i 1)))
;;;   sorted-stuff is a permutation of stuff.
;;;   Does not affect stuff.
;;;   sorted-stuff may share cons cells with stuff.
;;; Examples:
;;;   To sort values, a list of numbers, in increasing order.
;;;     (quicksort values < =)
;;;   To sort values, a list of numbers, in decreasing order.
;;;     (quicksort values > =)
;;;   To sort people, a list of (last-name first-name phone-number)
;;;     triplets, by first name
;;;     (quicksort people 
;;;                (lambda (person1 person2) 
;;;                  (string<? (cadr person1) (cadr person2)))
;;;                (lambda (person1 person2) 
;;;                  (string=? (cadr person1) (cadr person2))))
(define quicksort
  (lambda (lst comes-before? same?)
    ; If there are only zero or one elements in the list,
    ; the list is already sorted.
    (if (or (null? lst) (null? (cdr lst)))
        lst
        ; Otherwise, 
        ;   Identify a pivot.
        ;   Build three lists (smaller, same, greater).
        ;   Sort smaller and greater.
        ;   Join the sorted parts.
        (let* ((pivot (select-pivot lst))
               (smaller (select (right-section comes-before? pivot) lst))
               (larger (select (left-section comes-before? pivot) lst))
               (same (select (left-section same? pivot) lst)))
          (append (quicksort smaller comes-before? same?) 
                  same 
                  (quicksort larger comes-before? same?))))))

You'll have an opportunity to explore further details in the lab.

Quicksorting Arrays

It is also possible to Quicksort an array in place. We'll return to this technique later.

 

History

Thursday, 27 February 2003 [Samuel A. Rebelsky]

 

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Tue May 6 09:21:40 2003.
The source to the document was last modified on Thu Feb 27 22:12:14 2003.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS153/2003S/Readings/quicksort.html.

You may wish to validate this document's HTML ; Valid CSS! ; Check with Bobby

Samuel A. Rebelsky, rebelsky@grinnell.edu