CSC-152 2000F : Class 30: Quicksort

Held Monday, October 23, 2000

Summary

Today we visit our final sorting algorithm (for the time being), Quicksort. Quicksort is a divide-and-conquer algorithm based on the notion of dividing the set into small values and large values.

Notes

Are there questions on exam 2? I worked on much of the exam over break and didn't find any significant problems.

I decided to spend much of break in work avoidance mode, so I don't have your homeworks graded yet. The 152 HW is third in my grading stack, so I'm not sure when I'll have it graded.

I did spend much of break sorting my CD collection, so I got to apply many of our favorite sorting and searching algorithms.

Does anyone need some empty jewel boxes?

I've also thought about the project. In the past, I've done the project in a much larger class. I'll give you the chance to give up on the project, if you'd like. [Email vote.]

Overview

Merge sort, revisited

Quick sort, sketched

Partitioning

Limits of compare-and-swap sorting

Other sorting techniques

Quicksort

Is it possible to write an O(n*log₂n) sorting algorithm that is based on comparing and swapping, but doesn't require significantly extra space?

Yes, if you're willing to rely on probabilities.

In the Quicksort algorithm, you split (partition) the array to be sorted into two pieces, those smaller than or equal to the pivot and those greater than the pivot. You don't include the pivot in either piece (so that the recursive case is ``smaller'').

You can also split into three parts: those smaller than some middle element, those equal to some middle element, and those larger than some middle element.

With a little work, you can do this partitioning in place, so that there is no overhead (and so that ``glueing'' is basically a free operation).

(It is not necessarily okay to partition the array into two parts: those less than or equal to the element, and those greater than or equal to the element.)

/**
 * Sort an array using Quicksort.
 * Pre: All elements in the array can be compared to each other.
 * Post: The vector is sorted (using the standard meaning).
 */
public void quickSort(Object[] stuff) {
  quickSort(stuff, 0, stuff.length-1);
} // quickSort(Object[])

/**
 * Sort part of an array using Quicksort.
 * Pre: All elements in the subarray can be compared to each other.
 * Pre: 0 <= lb <= ub < stuff.length
 * Post: The vector is sorted (using the standard meaning).
 */
public void quickSort(Object[] stuff, int lb, int ub, Comparator compare) {
  // Variables
  Object pivot;	// The pivot used to split the vector
  int mid;		// The position of the pivot
  // Base case: size one arrays are sorted.
  if (lb == ub) return;
  // Pick a pivot and put it at the front of the array.  
  putPivotAtFront(stuff,lb,ub);
  // Determine the position of the pivot, while rearranging the array.
  mid = partition(stuff, lb, ub);
  // Recurse on nonempty subarrays.
  if (lb<=mid-1) quickSort(stuff, lb,mid-1);
  if (mid+1<=ub) quickSort(stuff, mid+1,ub);
} // quickSrt

/**
 * Split the array given by [lb .. ub] into ``smaller'' and
 * ``larger'' elements, where smaller and larger are defined by
 * their relationship to a pivot.  Return the index of the pivot 
 * between those elements.  Uses the first element of the array 
 * as the pivot.
 */
public int partition(Object[] stuff, int lb, int ub) {
  // STUB.  Can you figure it out?
  return 0;
} // partition(Object[])

What is the running time of Quicksort? It depends on how well we partition. If we partition into two equal halves, then we can say

Partitioning a vector of length n takes O(n) steps.
The time to partition those two partitions into four parts is also O(n).
If each partition is perfect (splits it exactly in half), we can stop the process after O(log_2(n)) levels.
This gives a running time of O(n*log_2(n)).

On average, we don't quite do half, but it's close enough that it doesn't make a significant difference.

However, bad choice of pivots can give significantly worse running time. If we always chose the largest element as the pivot, this algorithm would be equivalent to selection sort, and would take time O(n*n).

How do we partition? Typically, using something like the following strategy:

Set pivot to the first element of the subarray
Set left to the start of the subarray
Set right to the end of the subarray
Move left and right toward each other, 
  Swap their contents when you observe that one side is
    "wrong" (something on the left is larger than the pivot,
    something on the right is larger than the pivot)

How might you change Mergesort so that the pivot need not be an element of the array?

How might you rewrite Mergesort iteratively?

Better Sorting Techniques

Are there better sorting techniques (ones that take less than O(nlog₂n))? Not if you limit yourself to basic operations of comparing and swapping.

But if you're willing to use extra space and know something about the original data, then you can do better.

In bucket sort (the one Rob suggested), you create separate ``buckets'' for kinds of elements, put each element into the appropriate bucket, sort each bucket, and then take them out again.

If you can arrange things so that each bucket contains only a few elements (say no more than four), then the main cost is putting in to buckets and taking out of buckets.
Usually we choose simple criteria for the buckets, such as the first (or nth) letter in a string.
Running time can be a constant (as long as you can guarantee the number of items in any bucket)!

In radix sort, we sort using a binary representation of the things we're sorting. We'll do an example later.

History

Wednesday, 23 August 2000

Created as a blank outline.

Thursday, 24 August 2000

Slight reorganization to page design.

Monday, 24 October 2000

Filled in the details.

Back to More Efficient Sorting Algorithms. On to Introduction to Data Structures.

[Current] [News] [Glance] [Search] [Instructions] [Links] [Handouts] [Project] [Outlines] [Labs] [Homeworks] [Quizzes] [Exams] [Examples] [EIJ] [API]

Disclaimer Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.

This page may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS152/2000F/Outlines/outline.30.html

Source text last modified Wed Oct 25 10:05:37 2000.

This page generated on Fri Oct 27 08:20:02 2000 by Siteweaver. Validate this page's HTML.

Contact our webmaster at rebelsky@grinnell.edu

Class 30: Quicksort

Quicksort

Variations

Better Sorting Techniques

History