# Class 28: Some Sorting Algorithms

Back to More Recursive Algorithms. On to More Efficient Sorting Algorithms.

Held Wednesday, October 11, 2000

Summary

Today we move on to algorithm design for a common problem: Putting the elements of a list or array in "order". This problem is called sorting.

Notes

• I still need to update the ``at a glance'' page. I hope to do so during break.
• Don't forget tomorrow's cool physics convo talk. Where else can you learn about the origin of the universe?
• Don't forget that I'd like to see more code for the project by Friday.
• Did anyone think about the question for today (how to make Fibonacci more efficient)?

Overview

• The problem of sorting
• Sorting by example
• Selection sort
• An object-oriented perspective
• Insertion sort
• Bubble sort

## Making Fibonacci Quicker

• The trick: Put values in a table, `FIB`.
• If `FIB[i]` is nonzero, it contains the nth Fibonacci number.
• Otherwise, we need to compute it.
• Here's the code (or some of it)
```public long fib(int n) {
if (FIB[n] != 0) {
return FIB[n];
}
else {
FIB[n] = fib(n-1) + fib(n-2);
return FIB[n];
}
} // fib(int)
```
• The analysis is hellish. So we'll rewrite it instead.
```public static long fib(int n) {
long[] FIB = new long[n+1];
FIB[1] = 1;
FIB[2] = 1;
for (int i = 3; i <= n; i++) {
FIB[i] = FIB[i-1] + FIB[i-2];
}
return FIB[n];
} // fib(int)
```
• Hmmm ... this seems to be O(n).

## The Problem of Sorting

• Typically, computer scientists look at collections of problems and attempt to find appropriate generalizations of these problems (or their subproblems).
• By solving the generalized problem, you solve a number of related problems.
• One problem that seems to crop up a lot is that of sorting. Given a list, array, vector, sequence, or file of comparable elements, put the elements in order.
• in order typically means that each element is no bigger than the next element. (You can also sort in decreasing order, in which case each element is no smaller than the next element.)
• you also need to ensure that all elements in the original list are in the sorted list.
• you also need to ensure that no other elements are in the list.
• We'll look at sorting arrays.
• You'll find that these sorting methods look somewhat different from those in the book. That's because we're working in a different context.
• In the book, the methods sort external arrays which are passed as parameters.
• In these notes, the methods sort arrays that are part of the same class.
• In evaluating sorting methods, we should concern ourselves with both the running time and the amount of extra storage (beyond the original vector) that is required.
• In place sorting is a special subclass of sorting algorithms in which the original object is modified, and little, if any, extra storage is used.
• Stable sorts preserve the order of equal elements. Stable sorting is necessary if you want to sort the same list multiple times, on different keys.
• For large enough data sets, not all of the elements can be stored in memory. Often, variant algorithms must be used in order to get more efficient operation.
• You may learn about such sorting algorithms in CSC302.
• Most often, in-memory sorting is accomplished by repeatedly swapping elements. However, this is not the only way in which sorting can be done.

### Sorting Examples

• In my experience, most of us know some sorting algorithms, but have difficulty generalizing them.
• I'll bring in some things to sort (most frequently, a stack of CDs) and we'll look at different ways to sort them.

### Insertion Sort

• One simple sorting technique is insertion sort.
• Insertion sort operates by segmenting the list into unsorted and sorted portions, and repeatedly removing the first element from the unsorted portion and inserting it into the correct place in the sorted portion.
• This may be likened to the way typical card players sort their hands.
• What's the running time?
• We do the find-place and insert steps O(n) times.
• Each insertion takes O(n) in an array.
• Each findPlace takes O(log2n) in a sorted array. (Using binary search.)
• Hence, the running time is O(n*(n+log2(n)).
• Simplifying, we get O(n2).
• What's the extra storage? It should be constant.
• How might we code this recursively?

## A Sortable Class

• One of the difficulties of object-oriented design is deciding what belongs where.
• Are the sorting methods methods of some kind of collection class (which seems reasonable: ``You there! Sort yourself!'')
• Are they applied to some kind of collection (which also seams reasonable: ``Carl and Carla, please sort this stuff'').
• We'll choose the first and build objects that know how to sort themselves. It should be a natural step to the second.
• But how do we compare the elements (a typical sorting method requires us to determine if object A comes before object B)?
• We use the Ordering class we defined on exam 1
```
import IncomparableException;

/**
* Objects used to compare other objects.  Based on version 1.2
* of the old Comparable.java by Sam Rebelsky.
*
* @author Samuel A. Rebelsky
* @version 1.0 of October 2000
*/
public interface Ordering {
/**
* Determine if the first element should precede the
* second.   Throws an exception if the two elements cannot
* be compared.
*/
public boolean precedes(Object first, Object second)
throws IncomparableException;

/**
* Determine if the first element is equal to the second.   Throws
* an exception if the two elements cannot be compared.
*/
public boolean equals(Object first, Object second)
throws IncomparableException;

} // interface Ordering

```
• (Note that Java now provides `Comparable` and `Comparator` interfaces that we might instead use.
• Here's the interface for objects that can sort themselves.
```
import java.util.Comparator;
import IncomparableException;

/**
* Things that you can tell to sort themselves.
*
* @author Samuel A. Rebelsky
* @version 1.2 of October 2000
*/
public interface Sortable {
/**
* Sort the contents of the thing using an ordering to determine
* the relative position of pairs of elements.
* Pre: The elements can be compared.
* Post: The elements are sorted.  Each element is no greater
*   the the next element.
* Post: No elements are added or deleted.
*
* @exception IncomprableException
*   if there are pairs of elements that cannot be compared.
*   (Yes, I know that Comparators throw ClassCastExceptions,
*   but I believe that's an atrocious idea.
*/
public void sort(Comparator compare)
throws IncomparableException;
} // interface Sortable

```
• Finally, here's a class which we might extend by adding a sort method.
```
/**
* A objects of values which can be indexed by integers from 0 to size()-1.
* Initially, a wrapper class for Java's built-in arrays, now partially
* extended to support subarrays.
*
* @author Samuel A. Rebelsky
* @version 1.0 of September 1999
*/
public class Array {
// +--------+--------------------------------------------------
// | Fields |
// +--------+

/** The elements of the objects. */
public Object[] objects;

// +--------------+--------------------------------------------
// | Constructors |
// +--------------+

/**
* Build a new array which holds up to n elements.
* Initially, each element is null.
*/
public Array(int n) {
objects = new Object[n];
} // Array(int)

/**
* Build a new array which holds the specified
* set of elements.
*/
public Array(Object[] elements) {
objects = new Object[elements.length];
for (int i = 0; i < elements.length; i++) {
objects[i] = elements[i];
}
} // Array(Object[])

/**
* Build a new array which holds the Integers that correspond
* to the values given in the array.
*/
public Array(int[] values) {
objects = new Object[values.length];
for (int i = 0; i < values.length; i++) {
objects[i] = new Integer(values[i]);
} // for
} // Array(int[])

/**
* Build a new subarray which holds the elements in positions
* start .. finish of stuff.  Modifications to the subarray should
* affect the original array and vice versa.
*/
public Array(Array stuff, int start, int finish) {
// STUB.  Write this.
} // Array(stuff, start, finish)

// +-----------+-----------------------------------------------
// | Accessors |
// +-----------+

/**
* Get the ith element of the array.
*
* @exception ArrayIndexOutOfBoundsException
*   For the obvious reasons.
*/
public Object get(int i) {
return objects[i];
} // get(int)

/**
* Get the number of elements in the array.
*/
public int size() {
return objects.length;
} // size()

// +-----------+-----------------------------------------------
// | Modifiers |
// +-----------+

/**
* Set the ith element of the array.
*
* @exception ArrayIndexOutOfBoundsException
*   For the obvious reasons.
*/
public void set(int i, Object value) {
objects[i] = value;
} // set(int, Object)

/**
* Swap the ith and jth elements of the array.
* Included because it's commonly used during sorting.
*/
public void swap(int i, int j) {
Object temp = objects[i];
objects[i] = objects[j];
objects[j] = temp;
} // swap(int,int)

} // class Array

```

## Common Sorting Algorithms

• Because sorting is such an important task, computer scientists (and normal people, too) have developed a number of techniques that are commonly used for sorting.
• You've already seen one: insertion sort.
• Let's consider two more.
• Selection sort is among the simpler and more natural methods for sorting.
• In this sorting algorithm, you segment the array into two subparts, a sorted part and an unsorted part. You repeatedly find the largest of the unsorted elements, and swap it into the beginning of the sorted part. This continues until there are no unsorted elements.
```+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+
|
Unsorted           Sorted
```
• What's the running time of this algorithm? To sort a vector of n elements, we have to find the largest element in that vector in O(n) steps, and then recurse on the rest. The first recursive call takes O(n-1) steps plus the recursion. And so on and so forth. This makes it an O(n2) algorithm.
• What's the extra memory required by this algorithm (ignoring the extra memory for recursive calls)? It's more or less O(1), since we only allocate a few extra variables and no extra vectors.
• How much extra memory is required for recursive method calls? This is a tail-recursive algorithm, so there shouldn't be any.
• Note that we can also write selection sort iteratively.

### Bubble Sort

• Bubble sort is a lot like both insertion sort and selection sort
• In bubble sort, you step all the way through the array, swapping adjacent elements if they're out of order.
• This ``bubbles'' the largest value to the end.
• Like selection sort, it puts the largest value at the end. It just uses a different technique.
• Like insertion sort, it repeatedly swaps neighboring elements when they're out of order. Unlike insertion sort, it goes all the way through the array.
• The running time is O(n2).

## Choosing a Sorting Method

• All three of these are O(n2). Which should we choose?
• In this case, it turns out that the constants do make a difference. Typically selection sort and insertion sort run much more quickly than does bubble sort.
• Would we ever want to use bubble sort? Yes.
• Sometimes we can only swap neighboring elements. Consider a situation in which you can only store two objects in memory, and the rest in a file. Here's how we might do one round of bubbling up.
```Open the input file
Open a temporary file for the "more sorted" version
Let largestSoFar = the first element in the input file
While (elements remain in the input file)
Let nextElement = the next element in the input file
If nextElement < largestSoFar then
Write nextElement to the temporary file
Else
Write largestSoFar to the temporary file
Set largestSoFar to nextElement
// We've read the whole file (N elements), but only written
// N-1 elements.  Write the last one.
Write largestSoFar to the temporary file
Close the input file
Close the temporary file
Replace the input file with the temporary file
```
• Are there other times we might want to use bubble sort? It turns out that bubble sort is nice on some parallel computers. You can swap a N/2 pairs of adjacent elements in one step
• Round 1: all the cells numbered 2*i swap with 2*i+1 (if out of order)
• Round 2: all the cells numbered 2*i swap with 2*i-1 (if out of in order)
• We'll try acting out this last sorting routine.

## Doing Better

• Can we do better than O(n2).
• It turns out there are O(n log2 n) sorting algorithms.
• What techniques might we apply to develop such algorithms?
• Stay tuned tomorrow for more details.

## History

Wednesday, 23 August 2000

• Created as a blank outline.

Thursday, 24 August 2000

• Slight reorganization to page design.

Wednesday, 11 October 2000

Back to More Recursive Algorithms. On to More Efficient Sorting Algorithms.

Disclaimer Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.