Documenting code with the Six P’s
Topics/tags: CSC 151, technical
One of my first projects for this fall is to build a system to build better Web pages from the documentation I use for Scheme and Racket code. Grinnell’s CSC 151 has its own style of documentation, one I developed way too long ago. It doesn’t look like what the Racket community uses. Before working on to the software, I thought I should write a bit about why we have students write documentation and the particular form we use [1].
There is a trend in software development to say that documentation, or at least function-level documentation, is not important. After all, the tests should document what you expect a procedure to do. There’s also some long-standing evidence that programmers tend to fail to change documentation when they change the code it accompanies.
Nonetheless, we require our students to document their code. And we have good reasons to do so. First, writing documentation, particularly detailed documentation, forces you to think carefully about what you want your code to do [2]. Even if you write the documentation after the code, good documentation forces you to think more carefully about what the code should do. Second, we have an obligation to teach our students to communicate, and documentation provides a nice compromise between prose and code. That is, it’s a bit more formal in structure but less formal in grammar than prose. At the same time, it’s less formal (and potentially more readable) than code.
What do we expect? I call the form I designed the Six P’s
. I tried
to choose a form that was easy to remember. Given the difficulty I see
students have with it, I failed. But I haven’t come up with anything
better that still meets the goals I have when I ask students to write
documentation.
The Six P’s are Procedure, Parameters, Purpose, Product [3],
Preconditions, and Postconditions. What does all that mean?
Procedure
, Parameters
, and Product
are ways to name things.
For Procedure
, students give the name of the procedure. For
Parameters
, they give names and types of the parameters. Many
documentation systems only require that you give types. Why do I
require names? It turns out that having parameter names is useful
when you talk about what the procedure does. Similarly, the Product
gives a name and type for the result. Purpose
, as it suggests,
provides a narrative of the purpose of the procedure. What
about Preconditions
and Postconditions
? They provide a more
formal description of the requirements for the procedure to work
and the product of the procedure.
Let’s consider a seemingly straightforward example, a procedure that
we will call
. What does the word max
mean to you? You
may think that it’s obvious.max
Max is a procedure of two parameters, A and B, that sails off through the night and day, and in and out of A’s, to where the Wild B’s are.
Or maybe not. We’ll explore the
that is short for max
maximum
.
What are its parameters? Two numbers, which we’ll still call
and
a
. Can they be any numbers? Well, it doesn’t make sense to find
the larger of two complex numbers, so we should say that they are
real numbers [4]. What about the product? It’s also a real number.
Let’s call it b
. What does result
max
do? It finds the maximum
of a
and b
.
That seems pretty straightforward. So why do we need preconditions
and postconditions? In this case, there’s not a clear need for
preconditions. However, there are many situations, such as when
we’re working with the file system, where one needs to place additional
requirements on the state of the system. The postconditions are
there to make sure that we understand the effects unambiguously.
I tell my students that not everyone may understand the word
maximum
, so they should use the wonder of mathematics to be
precise. That means that one of our postconditions is that
result >= a
and another is that result >= b
. Is that enough?
Not if we’re trying to be precise. After all, if our inputs are
1
and 2
, these postconditions would permit us to return 3
,
which is larger than both 1
and 2
. However, most would not
accept 3
as the result of a procedure that computes a maximum. So
we’ll add another postcondition: Either result = a
or result = b
.
Putting it all together, this is what one might write.
;;; Procedure:
;;; max
;;; Parameters:
;;; a, a real number
;;; b, a real number
;;; Purpose:
;;; Find the maximum of a and b.
;;; Produces:
;;; result, a real number
;;; Preconditions:
;;; [No additional]
;;; Postconditions:
;;; result >= a
;;; result >= b
;;; result is either a or b
Wasn’t that fun?
Now, let’s make things a bit more complicated. Racket provides two
kinds of real numbers: Exact real numbers and inexact real numbers.
Inexact real numbers are represented with a fixed number of bits.
That means that some numbers can’t be represented precisely. Exact
numbers are represented, well, exactly. I tend to refer to exact
real numbers as arbitrary precision
. Here’s a quick example of
the difference.
> (+ .5 1/100000000000000000000000000)
0.5
> (+ 1/2 1/100000000000000000000000000)
50000000000000000000000001/100000000000000000000000000
As you can see, in the first case (inexact), the quite small bit that is added gets discarded, while in the second case (exact), it is retained.
What’s the implication for max
? Well, Racket’s built-in max
procedure returns an inexact number if any of its parameters
are inexact.
> (max 1/10 5/2)
2 1/2
> (max 0.1 5/2)
2.5
That doesn’t seem to bad, does it? But here’s the thing: If we give exact numbers as input and get inexact numbers as output, the result is not necessarily the same as the input.
> (max 0.1 (+ 1/2 1/100000000000000000000000000))
0.5
> (= 0.5 (+ 1/2 1/100000000000000000000000000))
#f
Whoops! There goes our first postcondition.
> (> 0.5 0.1)
#t
> (> 0.5 (+ 1/2 1/100000000000000000000000000))
#f
And there goes our second postcondition. Isn’t Racket fun? How do we solve the problem? The easy solution is to place another limit on the input.
;;; Parameters:
;;; a, an exact real number
;;; b, an exact real number
As I hope this example suggests, writing documentation, particularly the preconditions and postconditions, requires you to carefully consider many aspects of the procedure. Like all writing [6], writing documentation helps you think and helps you learn.
In a followup musing, we will consider how I might turn this style of documentation into an appropriate collection of Web pages.
Postscript: I also have a half-dozen or so other optional P’s
:
Practica (aka examples
), Package (aka collection of procedures
),
Philosophy (aka philosophy
), Props (aka citations), Problems
(aka
I’m too lazy to write good preconditions and postconditions,
so I’ll mention some ways this doesn’t work), Process (aka
algorithm),
Ponderings (aka
musings), Postscript (aka
Sam attempts to make
a recursive joke"), and some more that don’t come to mind.
Postscript: What if we want to document max
in such a way that we
describe what happens when some parameters are inexact? I’ll leave
that as an exercise for the reader.
[1] We
use it in the sense that faculty force students to use it,
and, so far, the folks teaching CSC 151 have been willing to follow
my lead.
[2] I strongly encourage my students to write documentation before they write code. I even try to model that methodology. Unfortunately, too few listen.
[3] I used Produces
, but I’m coming to realize that I should have used
Product
.
[4] Yes, that’s a discussion I have with my students.
[5] At least if you can read Scheme code.
[6] Well, maybe most writing.
Version 1.0 of 2019-09-05.