Skip to main content

Documenting code with the Six P’s

Topics/tags: CSC 151, technical

One of my first projects for this fall is to build a system to build better Web pages from the documentation I use for Scheme and Racket code. Grinnell’s CSC 151 has its own style of documentation, one I developed way too long ago. It doesn’t look like what the Racket community uses. Before working on to the software, I thought I should write a bit about why we have students write documentation and the particular form we use [1].

There is a trend in software development to say that documentation, or at least function-level documentation, is not important. After all, the tests should document what you expect a procedure to do. There’s also some long-standing evidence that programmers tend to fail to change documentation when they change the code it accompanies.

Nonetheless, we require our students to document their code. And we have good reasons to do so. First, writing documentation, particularly detailed documentation, forces you to think carefully about what you want your code to do [2]. Even if you write the documentation after the code, good documentation forces you to think more carefully about what the code should do. Second, we have an obligation to teach our students to communicate, and documentation provides a nice compromise between prose and code. That is, it’s a bit more formal in structure but less formal in grammar than prose. At the same time, it’s less formal (and potentially more readable) than code.

What do we expect? I call the form I designed the Six P’s. I tried to choose a form that was easy to remember. Given the difficulty I see students have with it, I failed. But I haven’t come up with anything better that still meets the goals I have when I ask students to write documentation.

The Six P’s are Procedure, Parameters, Purpose, Product [3], Preconditions, and Postconditions. What does all that mean? Procedure, Parameters, and Product are ways to name things. For Procedure, students give the name of the procedure. For Parameters, they give names and types of the parameters. Many documentation systems only require that you give types. Why do I require names? It turns out that having parameter names is useful when you talk about what the procedure does. Similarly, the Product gives a name and type for the result. Purpose, as it suggests, provides a narrative of the purpose of the procedure. What about Preconditions and Postconditions? They provide a more formal description of the requirements for the procedure to work and the product of the procedure.

Let’s consider a seemingly straightforward example, a procedure that we will call max. What does the word max mean to you? You may think that it’s obvious.

Max is a procedure of two parameters, A and B, that sails off through the night and day, and in and out of A’s, to where the Wild B’s are.

Or maybe not. We’ll explore the max that is short for maximum.

What are its parameters? Two numbers, which we’ll still call a and b. Can they be any numbers? Well, it doesn’t make sense to find the larger of two complex numbers, so we should say that they are real numbers [4]. What about the product? It’s also a real number. Let’s call it result. What does max do? It finds the maximum of a and b.

That seems pretty straightforward. So why do we need preconditions and postconditions? In this case, there’s not a clear need for preconditions. However, there are many situations, such as when we’re working with the file system, where one needs to place additional requirements on the state of the system. The postconditions are there to make sure that we understand the effects unambiguously. I tell my students that not everyone may understand the word maximum, so they should use the wonder of mathematics to be precise. That means that one of our postconditions is that result >= a and another is that result >= b. Is that enough? Not if we’re trying to be precise. After all, if our inputs are 1 and 2, these postconditions would permit us to return 3, which is larger than both 1 and 2. However, most would not accept 3 as the result of a procedure that computes a maximum. So we’ll add another postcondition: Either result = a or result = b.

Putting it all together, this is what one might write.

;;; Procedure:
;;;   max
;;; Parameters:
;;;   a, a real number
;;;   b, a real number
;;; Purpose:
;;;   Find the maximum of a and b.
;;; Produces:
;;;   result, a real number
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   result >= a
;;;   result >= b
;;;   result is either a or b

Wasn’t that fun?

Now, let’s make things a bit more complicated. Racket provides two kinds of real numbers: Exact real numbers and inexact real numbers. Inexact real numbers are represented with a fixed number of bits. That means that some numbers can’t be represented precisely. Exact numbers are represented, well, exactly. I tend to refer to exact real numbers as arbitrary precision. Here’s a quick example of the difference.

> (+ .5 1/100000000000000000000000000)
> (+ 1/2 1/100000000000000000000000000)

As you can see, in the first case (inexact), the quite small bit that is added gets discarded, while in the second case (exact), it is retained.

What’s the implication for max? Well, Racket’s built-in max procedure returns an inexact number if any of its parameters are inexact.

> (max 1/10 5/2)
2 1/2
> (max 0.1 5/2)

That doesn’t seem to bad, does it? But here’s the thing: If we give exact numbers as input and get inexact numbers as output, the result is not necessarily the same as the input.

> (max 0.1 (+ 1/2 1/100000000000000000000000000))
> (= 0.5 (+ 1/2 1/100000000000000000000000000))

Whoops! There goes our first postcondition.

> (> 0.5 0.1)
> (> 0.5 (+ 1/2 1/100000000000000000000000000))

And there goes our second postcondition. Isn’t Racket fun? How do we solve the problem? The easy solution is to place another limit on the input.

;;; Parameters:
;;;   a, an exact real number
;;;   b, an exact real number

As I hope this example suggests, writing documentation, particularly the preconditions and postconditions, requires you to carefully consider many aspects of the procedure. Like all writing [6], writing documentation helps you think and helps you learn.

In a followup musing, we will consider how I might turn this style of documentation into an appropriate collection of Web pages.

Postscript: I also have a half-dozen or so other optional P’s: Practica (aka examples), Package (aka collection of procedures), Philosophy (aka philosophy), Props (aka citations), Problems (akaI’m too lazy to write good preconditions and postconditions, so I’ll mention some ways this doesn’t work), Process (akaalgorithm), Ponderings (akamusings), Postscript (akaSam attempts to make a recursive joke"), and some more that don’t come to mind.

Postscript: What if we want to document max in such a way that we describe what happens when some parameters are inexact? I’ll leave that as an exercise for the reader.

[1] We use it in the sense that faculty force students to use it, and, so far, the folks teaching CSC 151 have been willing to follow my lead.

[2] I strongly encourage my students to write documentation before they write code. I even try to model that methodology. Unfortunately, too few listen.

[3] I used Produces, but I’m coming to realize that I should have used Product.

[4] Yes, that’s a discussion I have with my students.

[5] At least if you can read Scheme code.

[6] Well, maybe most writing.

Version 1.0 of 2019-09-05.