Skip to main content

The speed of (implementing) radix sort (#1167)

Topics/tags: Misc, Technical

This semester, I’m teaching CSC-301, Algorithm Analysis. I love CSC-301; I consider the material the heart of computer science. Students learn techniques for designing, implementing, and analyzing data structures and algorithms. Students also learn some of what many consider the primary literature of computer science; the data structures and algorithms that most good computer scientists know how to use and implement. The course involves theory (or at least math), but students also learn a lot from implementing algorithms. That is, like all the best CS courses, it combines both sides of the discipline.

I’m either incredibly nice or fairly mean: I make them implement algorithms in each of the three languages they should know from prerequisite courses: Scheme, C, and Java [1]. I don’t ask them to implement each algorithm in each language, but I’ll suggest, for example, that they should do balanced search trees in Scheme and Quickselect in C.

We’re currently exploring issues related to sorting. We’ve talked about general issues of sorting and proven that all comparison-based sorting algorithms are in Omega(nlogn).

Today we did the two primary O(n) sorting algorithms: bucket sort and radix sort [2]. I love radix sort. If you do it by hand, which I have students do, it’s almost impossible to tell what’s happening. Then, in the final round, the values magically end up in sorted order. The informal proof of correctness also involves loop invariants, which can be a nice lead into our discussion of loop invariants. I think I have that scheduled for next week.

As I noted, we did radix sort by hand. That is, I had my students do a live demo of radix sort using a set of large cards with letters and corresponding binary values today. I think the students had fun. And it was a reminder that you can execute algorithms without understanding them [3]. In some years, I’ve prepared punch cards, which permit me to do parallel radix sort. Unfortunately, with the office move this year, I couldn’t find them [4].

Surprisingly, I ended up with about twenty minutes left at the end of class. That’s rare. Traditionally, I over-prepare rather than under-prepare. So I said,

We can implement radix sort in C in the next fifteen minutes, right?

And we set forth. I cheated a bit. I was the driver and asked all of them to be navigators. But I followed reasonable driver practices; I explained what I was doing at every step, I checked in quickly, and I asked questions when I was stuck. They were also good navigators, catching stupid mistakes on my part, such as switching the array I used for accumulating the values with zero at a particular position and the array I used for the ones.

I also cheated in that I didn’t decompose the program into separate files. But with fifteen minutes, one just sees if they can get something working.

I haven’t written radix sort in the past five years. Come to think of it, I’m not sure if I’ve ever written radix sort. But I’ve recently written code to convert argc to an array of integers and also code to print out an array of integers [5]. So those tasks were quick. And radix sort is a straightforward algorithm.

The first compile was a failure. I’d given the wrong signature for main. I also thought I was using strtol wrong and so replaced it with atoi. (It is likely that I was using strtol correctly, and the incorrect signature for main was the cause.)

The second time through, the compile worked. And it sorted a variety of inputs!

Monday, we may need to have a discussion as to the implications of the speed at which I wrote that code, particularly with regards to my assumptions of what they can get done in eight hours. (I usually rely on a four-to-one ratio of their time to mine.)

For those who care, here’s the code.

#include <stdio.h>
#include <stdlib.h>

 * A quick and dirty implementation of radix sort.
 * Sam Rebelsky
 * And the Students of CSC-301-02 2021Fa

// +-----------+-----------------------------------------------------
// | Constants |
// +-----------+

#define MAX_VAL 2048
#define BITS 11

// +---------+-------------------------------------------------------
// | Helpers |
// +---------+

 * Sort vals, of size n, using radix sort.
radixSort(int vals[], int n) {
   int bitmask = 1;
   int zeros[n];        // Place to store zeros
   int ones[n];         // Place to store ones
   int z = 0;           // Number of values in the zeros array
   int o = 0;           // Number of values in the ones array

   for (int k = 0; k < BITS; k++) {
     // Clear out the arrays
     z = 0;
     o = 0;
     // For each value in vals
     for (int i = 0; i < n; i++) {
       // Add to the correct helepr array
       if (vals[i] & bitmask) {
         ones[o++] = vals[i];
       else {
         zeros[z++] = vals[i];
     } // for
     // Shove them back in vals in the correct order
     int i = 0;
     for (int j = 0; j < z; j++) {
       vals[i++] = zeros[j];
     for (int j = 0; j < o; j++) {
       vals[i++] = ones[j];
     } // for
     // Shift to the next digit
     bitmask = bitmask << 1;
   } // for
} // radixSort

printInts(int vals[], int n) 
  printf ("{ %d", vals[0]);
  for (int i = 1; i < n; i++) {
    printf (", %d", vals[i]);
  } // for
  printf (" }\n");
} // printInts

main (int argc, char **argv)
  int vals[argc - 1];
  for (int i = 1; i < argc; i++) {
    vals[i-1] = atoi(argv[i]);
  } // for
  radixSort(vals, argc-1);
  printInts(vals, argc-1);
} // main

Such a beatiful algorithm, even if it’s not a great setting.

[1] The argument for nice: I’m building their skills at thinking in multiple paradigms and in programming in different languages. The argument for mean: I require that they recall languages that they may not have used in a while. Plus, depending on the student, C, Java, or Scheme may feel like a form of torture. I find joy in each. Of course, I use techniques that are more fun in Java, like anonymous inner classes and lambdas.

[2] If we’ve just proven a lower bound of Omega(nlogn), how are we able to sort in O(n)? The lower bound is only for comparison-based sorting algorithms.

[3] We run the algorithm, looking at the results as we go. Then I challenge them to explain what just happened.

[4] Arguably, I might not have been able to find them in my old office. However, I did find my other sorting cards, so it’s more likely that they got separated in the move.

[5] I said something like, This is probably the 150th time I’ve written code to print an array of integers. I didn’t say You think I’d learn to keep it handy. But it may have been faster to rewrite it than to find it.

Version 1.0 of 2021-10-01.