Documenting Procedures

Summary: We consider reasons and techniques for documenting procedures.

Introduction

When programmers write code, they also document that code; that is, they write natural language and a bit of mathematics to clarify what their code does. The computer certainly doesn’t need any such documentation (and even ignores it), so why should one take the time to write documentation? There are a host of reasons.

The design of an algorithm may not be obvious. Documentation can explain how the algorithm works.
Particular details of the implementation of the algorithm may include subtleties. Documentation can explain those subtleties.
Programmers who use a procedure (a.k.a. “client programmers”) should be able to focus more on what the procedure does, rather than how the procedure does its job. (You can certainly use sqrt, rgb->hsv, and a host of other procedures without understanding how they are defined.)

As all three examples suggest, when we write code, we write not just for the computer, but also for a human reader. Even the best code needs to be checked again on occasion, and lots of code gets modified for new purposes. Good documentation helps those who must support or modify the code understand it. And while humans should be able to read code, most read it more easily when it has comments.

The Audience for Your Documentation

As you should have learned in Tutorial, every writer needs to keep in mind not only the topic they are writing about, but also the audience for whom they are writing. This understanding of audience is equally important when writing documentation.

One way to think about your audience is in terms of how the reader will be using your code. Some readers will be reading your code to understand techniques that they plan to use in other situations. Other readers will be responsible for maintaining and updating your code. Most readers will be using the procedures you write. You are often your own client. For example, you are likely to reuse procedures you wrote early in the semester. The documentation you write for your client programmers is the most important documentation you can write.

When thinking about those clients, you should first remember that they care most about what your procedures do: What values do they compute? What inputs do they take? Although you will be tempted to describe how you reach your results, most of your clients will not need to know your process, but only the result.

But you need to think about more than how your audience will use your code. You also need to think about what they know and don’t know. Because you are novices, you should generally plan to write for people like you: Assume that your client programmers know very little about Scheme, about the kinds of things your program might do, or even about the terminology you use.

Documenting Procedures with the Six P’s

Different organizations have different styles of documentation. After too many years documenting procedures and teaching students to document procedures, Samuel A. Rebelsky developed a style that we find helps students think carefully about their work. (Sam has also received a few notes from alums and from other folks in industry who see this documentation and praise him for teaching it to students.)

To keep it easy to remember what belongs in the documentation for a procedure, Sam says that students should focus on “the Six P’s”: Procedure, Parameters, Purpose, Produces, Preconditions, and Postconditions.

The Procedure section simply names the procedure. Although the name of the procedure should be obvious from the code, by including the name in the documentation, we make it possible for the client programmer to learn about the procedure only through the documentation.

The Parameters section names the parameters to the procedure and gives them types. For example, if a procedure operates only on numbers or only on positive integers, the parameters section should indicate so.

The Purpose section presents a few sentences or sentence fragments that describe what the procedure is supposed to do. The sentences need not be as precise as what you’d give a computer, but they should be clear to the “average” programmer. (As you’ve learned in your other writing, write to your audience.)

The Produces section provides a name and type for the result of the procedure. Often, the result is not named in the code of the procedure. So why do we bother to include such a section? Because naming the result lets us discuss it, either in the purpose above or in the preconditions and postconditions below.

Documenting Preconditions and Postconditions

These first four P’s give an informal definition of what the procedure does. The last two P’s give a much more formal definition of the purpose of the procedure. You’ve seen at the beginning of this reading that the preconditions are the conditions that must be met in order for the procedure to work and that preconditions and postconditions are a form of contract. Since they are a contract, we do our best to specify them formally.

The Preconditions section provides additional details on the valid inputs the procedure accepts and the state of the programming system that is necessary for the procedure to work correctly. For example, if a procedure depends on a value being named elsewhere, the dependency should be named in the preconditions section. The preconditions section can be used to include restrictions that would be too cumbersome to put in the parameters section. For example, in many programming languages, there is a cap on the size of integers. In such languages, a square procedure would therefore need to require that its input value have an absolute value less than or equal to the square root of the largest integer.

When documenting your procedures, you may wish to note whether a precondition is verified (in which case you should make sure to print an error message) or unverified (in which case your procedure may return any value).
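As a minimal sketch of the difference, consider a hypothetical procedure that averages a list of numbers and whose precondition is that the list is nonempty. The names list-average-verified and list-average-unverified are invented for this illustration and are not part of any library; the bracketed [Verified] and [Unverified] tags are just one way you might record the distinction.

```scheme
;;; Preconditions:
;;;   numbers is nonempty. [Verified]
(define list-average-verified
  (lambda (numbers)
    (if (null? numbers)
        ;; The precondition is checked, so we report the problem.
        (error "list-average-verified: expected a nonempty list")
        (/ (apply + numbers) (length numbers)))))

;;; Preconditions:
;;;   numbers is nonempty. [Unverified]
(define list-average-unverified
  (lambda (numbers)
    ;; No check. If numbers is empty, the behavior is whatever
    ;; (/ 0 0) happens to do in your Scheme implementation.
    (/ (apply + numbers) (length numbers))))
```

Both versions behave identically on inputs that meet the precondition. They differ only in what the client sees when the contract is broken.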

The Postconditions section provides a more formal description of the results of the procedure. For example, we might say informally that a procedure reverses a list. In the postconditions section, we provide formulae that indicate what it means to have reversed the list.

Typically, some portion of the preconditions and postconditions are expressed with formulae or code.
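To make the list-reversal example concrete, here is one possible formalization of "result is the reverse of lst", written first as documentation and then as a predicate that checks it. The predicate name reversal-of? is hypothetical, and other formalizations are certainly possible.

```scheme
;;; Postconditions (one possible formalization):
;;;   (length result) = (length lst)
;;;   For all i, 0 <= i < (length lst),
;;;     (list-ref result i) = (list-ref lst (- (length lst) i 1))

;; reversal-of? is a hypothetical helper that tests those formulae.
(define reversal-of?
  (lambda (result lst)
    (let ((len (length lst)))
      (and (= (length result) len)
           ;; Step through every index i from 0 up to len - 1.
           (let check ((i 0))
             (or (= i len)
                 (and (equal? (list-ref result i)
                              (list-ref lst (- len i 1)))
                      (check (+ i 1)))))))))
```

For example, (reversal-of? (reverse (list 1 2 3)) (list 1 2 3)) holds, while (reversal-of? (list 1 2 3) (list 1 2 3)) does not.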

How do you decide what preconditions and postconditions to write? It takes some practice to get good at it. We usually start by thinking about what inputs we are sure it works on and what inputs we are sure that it doesn’t work on. We then try to refine that understanding so that for any value someone presents, we can easily decide which category is appropriate.

For example, if we design a procedure to work on numbers, our general sense is that it will work on numbers, but not on the things that are not numbers. Next, we start to think about what kinds of number it won’t work on. For example, will it work correctly with real numbers, with imaginary numbers, with negative numbers, with really large numbers? As we reflect on each case, we refine our understanding of the procedure, and get closer to writing a good precondition.

The postcondition is a bit more tricky. We try to phrase what we expect of the procedure as concisely and clearly as we can, frequently using math, code when it’s clearer than the math, and English when we can’t quite figure out what math or code to write. But we always remember, consciously or subconsciously, that English is ambiguous, so we try to use formal notations whenever possible.

Note: When documenting preconditions, we generally don’t duplicate the type restrictions given in the Parameters section. You can assume that those are implicit preconditions.

Some Documentation Examples

Let us first consider a simple procedure that squares its input value and that restricts that value to an integer. Here is one possible set of documentation.

;;; Procedure:
;;;   square
;;; Parameters:
;;;   val, an integer
;;; Purpose:
;;;   Computes the square of val.
;;; Produces:
;;;   result, an integer
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   result = val*val

In Scheme, there’s not an upper limit to the value of integers. In other languages, such a limit may exist. Let’s suppose there is such a limit and it is called MAXINT. In that case, trying to square a value whose absolute value is larger than the square root of MAXINT will necessarily lead to an error. We might therefore add a precondition to the documentation as follows.

;;; Procedure:
;;;   square
;;; Parameters:
;;;   val, an integer
;;; Purpose:
;;;   Computes the square of val.
;;; Produces:
;;;   result, an integer
;;; Preconditions:
;;;   (abs val) <= (sqrt MAXINT)
;;; Postconditions:
;;;   result = val*val

You will note that the preconditions specified are those described in the narrative section: We must ensure that val is not too large. Here, we started with the idea of numbers (or integers) and, as we started to think about special cases, realized that the procedure would not work with numbers that are too large. In reacting to that realization, we added a restriction on the size.
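If we wanted to verify that precondition, the check might look like the sketch below. Because Scheme has no such limit, MAXINT here is a stand-in whose value is chosen only for illustration, and the name bounded-square is hypothetical.

```scheme
;; MAXINT is hypothetical; Scheme integers have no upper limit.
;; The value (2^31 - 1) is an assumption made for illustration.
(define MAXINT 2147483647)

;;; Preconditions:
;;;   (abs val) <= (sqrt MAXINT) [Verified]
;;; Postconditions:
;;;   result = val*val
(define bounded-square
  (lambda (val)
    (if (<= (abs val) (sqrt MAXINT))
        (* val val)
        (error "bounded-square: input too large to square, given" val))))
```

With this value of MAXINT, (bounded-square 5) produces 25, while (bounded-square 50000) signals an error because 50000 exceeds the square root of MAXINT.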

Let us now turn to the now-familiar irgb-lighter procedure.

What type of parameter does irgb-lighter take? It takes an integer-encoded RGB color. Do we restrict it at all? It doesn’t seem like we should, but we’ll revisit that question in a moment. What does it do? It makes a lighter version of the color. That’s a nice, informal definition. How do we formalize it? Well, we know that in the version we’ve been using, each component is 16 larger. Here’s that documentation.

;;; Procedure:
;;;   irgb-lighter
;;; Parameters:
;;;   color, an integer-encoded RGB color
;;; Purpose:
;;;   Compute a lighter version of color
;;; Produces:
;;;   lighter, an integer-encoded RGB color
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   (irgb-red lighter) = (+ 16 (irgb-red color))
;;;   (irgb-green lighter) = (+ 16 (irgb-green color))
;;;   (irgb-blue lighter) = (+ 16 (irgb-blue color))

But is that correct? No! We’ve already seen that when a component is close to 255, we stop at 255, rather than at the component plus 16. What can we do? One possibility is to restrict the input.

;;; Procedure:
;;;   irgb-lighter
;;; Parameters:
;;;   color, an integer-encoded RGB color
;;; Purpose:
;;;   Compute a lighter version of color
;;; Produces:
;;;   lighter, an integer-encoded RGB color
;;; Preconditions:
;;;   (irgb-red color) <= 239
;;;   (irgb-green color) <= 239
;;;   (irgb-blue color) <= 239
;;; Postconditions:
;;;   (irgb-red lighter) = (+ 16 (irgb-red color))
;;;   (irgb-green lighter) = (+ 16 (irgb-green color))
;;;   (irgb-blue lighter) = (+ 16 (irgb-blue color))

By restricting the preconditions, we’ve ensured that we can correctly meet the postconditions. But is that the best approach? That means that our client cannot use irgb-lighter with some colors, including some very common colors. Instead, we might decide to change the postconditions to clarify what happens in these special cases.

;;; Procedure:
;;;   irgb-lighter
;;; Parameters:
;;;   color, an integer-encoded RGB color
;;; Purpose:
;;;   Compute a lighter version of color
;;; Produces:
;;;   lighter, an integer-encoded RGB color
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   If (irgb-red color) <= 239, then
;;;     (irgb-red lighter) = (+ 16 (irgb-red color))
;;;   Otherwise
;;;     (irgb-red lighter) = 255
;;;   If (irgb-green color) <= 239, then
;;;     (irgb-green lighter) = (+ 16 (irgb-green color))
;;;   Otherwise
;;;     (irgb-green lighter) = 255
;;;   If (irgb-blue color) <= 239, then
;;;     (irgb-blue lighter) = (+ 16 (irgb-blue color))
;;;   Otherwise
;;;     (irgb-blue lighter) = 255

That’s better, but perhaps a little verbose. Many programmers might prefer to express those postconditions a bit more concisely.

;;; Procedure:
;;;   irgb-lighter
;;; Parameters:
;;;   color, an integer-encoded RGB color
;;; Purpose:
;;;   Compute a lighter version of color
;;; Produces:
;;;   lighter, an integer-encoded RGB color
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   (irgb-red lighter) = (min 255 (+ 16 (irgb-red color)))
;;;   (irgb-green lighter) = (min 255 (+ 16 (irgb-green color)))
;;;   (irgb-blue lighter) = (min 255 (+ 16 (irgb-blue color)))
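One implementation that would satisfy these concise postconditions is sketched below. So that the sketch runs on its own, it begins with stand-in definitions of irgb, irgb-red, irgb-green, and irgb-blue. The packing they use (red, then green, then blue, eight bits each) is an assumption, not necessarily how your Scheme environment encodes colors; if your environment already provides these procedures, you would not redefine them.

```scheme
;; Stand-in color procedures. The encoding is an assumption.
(define irgb
  (lambda (r g b)
    (+ (* 65536 r) (* 256 g) b)))
(define irgb-red
  (lambda (color) (quotient color 65536)))
(define irgb-green
  (lambda (color) (remainder (quotient color 256) 256)))
(define irgb-blue
  (lambda (color) (remainder color 256)))

;; A sketch of irgb-lighter: add 16 to each component,
;; but never let a component exceed 255.
(define irgb-lighter
  (lambda (color)
    (irgb (min 255 (+ 16 (irgb-red color)))
          (min 255 (+ 16 (irgb-green color)))
          (min 255 (+ 16 (irgb-blue color))))))
```

Notice how closely the body mirrors the postconditions. That closeness is convenient, but it is also a sign that the documentation is tied to one particular implementation.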

That’s getting better. But this severely restricts our implementation. What if we decide to choose another way to make colors lighter (e.g., by increasing the luma, or by using an increment of some number other than 16)? Then we could not use this name. If our main goal is to make it lighter, then the particular details may be irrelevant.

We’ll introduce another procedure, irgb-brightness, that helps us model brightness. We leave the particulars of what brightness means out of the definition.

;;; Procedure:
;;;   irgb-brightness
;;; Parameters:
;;;   color, an integer-encoded RGB color
;;; Purpose:
;;;   Compute the brightness of color
;;; Produces:
;;;   brightness, a real number
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   If c1 is likely to be perceived as lighter than c2 (e.g., by a
;;;   human eye or a photo-sensor), then
;;;     (irgb-brightness c1) > (irgb-brightness c2)

Now, we can use that procedure in a better set of postconditions for irgb-lighter.

;;; Procedure:
;;;   irgb-lighter
;;; Parameters:
;;;   color, an integer-encoded RGB color
;;; Purpose:
;;;   Compute a lighter version of color
;;; Produces:
;;;   lighter, an integer-encoded RGB color
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   (irgb-brightness lighter) > (irgb-brightness color)

Are we done? Unfortunately, no. There’s still one case in which we won’t achieve the postconditions: When color is white, we can’t make it any lighter. We might be tempted to make a small change to the postconditions.

;;; Postconditions:
;;;   (irgb-brightness lighter) >= (irgb-brightness color)

However, that’s not a good strategy. Why not? Because it means that whoever has to implement this can just write the following and still accomplish the postconditions.

(define irgb-lighter
  (lambda (color)
    color))

After all, this accomplishes the postconditions, and our postconditions are the contract.

Since we want to ensure that the programmer implements our procedure as we would expect, we have to be a little more careful in the postconditions. Here’s something that seems acceptable.

;;; Procedure:
;;;   irgb-lighter
;;; Parameters:
;;;   color, an integer-encoded RGB color
;;; Purpose:
;;;   Compute a lighter version of color
;;; Produces:
;;;   lighter, an integer-encoded RGB color
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   If color is white (i.e. (irgb 255 255 255)), then
;;;     lighter = color
;;;   Otherwise,
;;;     (irgb-brightness lighter) > (irgb-brightness color)
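We can also express these final postconditions as a check and try it on two candidate implementations. The sketch below rests on several assumptions: the color procedures are stand-ins with an assumed encoding, irgb-brightness is approximated by the sum of the components (the reading deliberately leaves its definition open), and the names meets-lighter-contract?, identity-lighter, and clamped-lighter are hypothetical.

```scheme
;; Stand-in color procedures. The encoding is an assumption.
(define irgb
  (lambda (r g b)
    (+ (* 65536 r) (* 256 g) b)))
(define irgb-red
  (lambda (color) (quotient color 65536)))
(define irgb-green
  (lambda (color) (remainder (quotient color 256) 256)))
(define irgb-blue
  (lambda (color) (remainder color 256)))

;; Stand-in brightness: the sum of the components. An assumption.
(define irgb-brightness
  (lambda (color)
    (+ (irgb-red color) (irgb-green color) (irgb-blue color))))

(define white (irgb 255 255 255))

;; Does lighter-proc meet the final postconditions on this color?
(define meets-lighter-contract?
  (lambda (lighter-proc color)
    (let ((lighter (lighter-proc color)))
      (if (= color white)
          (= lighter color)
          (> (irgb-brightness lighter) (irgb-brightness color))))))

;; Candidate 1: the "do nothing" implementation.
(define identity-lighter
  (lambda (color) color))

;; Candidate 2: add 16 to each component, stopping at 255.
(define clamped-lighter
  (lambda (color)
    (irgb (min 255 (+ 16 (irgb-red color)))
          (min 255 (+ 16 (irgb-green color)))
          (min 255 (+ 16 (irgb-blue color))))))
```

Under these assumptions, (meets-lighter-contract? identity-lighter (irgb 10 20 30)) is false, so the do-nothing implementation no longer satisfies the contract, while clamped-lighter passes the check both for (irgb 10 20 30) and for white.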

It took some effort to get the documentation right. But we hope that it was useful effort. First, it required us to carefully think through what we wanted the procedure to do and to differentiate aspects of our current implementation from the more general goals. Second, it required us to think about special cases. We’ll find that many of the procedures we write work fine on many cases, but not on the more extreme cases, which we will often call “edge cases” or “corner cases”. In this instance, the procedure behaved differently on large components. Finally, we had to balance the needs of the client programmer and the implementing programmer. You’ll find that a lot of procedure design requires such a balancing act.

Preconditions and Postconditions as Contracts, Revisited

As noted above, the preconditions and postconditions form a contract with the client programmer: If the client programmer meets the type specified in the parameters section and the preconditions specified in the preconditions section, then the procedure must meet the postconditions specified in the postconditions section.

As with all contracts, there is therefore a bit of adversarial balance between the preconditions and postconditions. The implementer’s goal is to do as little as possible to meet the postconditions, which means that the client’s goal is to make the postconditions specify the goal in such a way that the implementer cannot make compromises. Similarly, one of the client’s goals may be to break the procedure (so that he may sue or reduce payment to the implementer), so the implementer needs to write the preconditions and parameter descriptions in such a way that she can ensure that any parameters that meet those descriptions can be used to compute a result.

Just in case you weren’t sure: The way we’ve described the adversarial relationship is clearly hyperbole. Nonetheless, it’s useful to think hyperbolically to ensure that we write the best possible preconditions and postconditions.

Other P’s

Although we typically suggest using the basic six P’s (procedure, parameters, purpose, produces, preconditions, and postconditions) to document your procedures, there are a few other optional sections that you may wish to include. For the sake of alliteration, we also begin those sections with the letter P.

In a Problems section, you might note special cases or issues that are not sufficiently covered. Typically, all the problems are handled by eliminating invalid inputs in the preconditions section, but until you have a carefully written preconditions section, the problems section provides additional help (e.g., “the behavior of this procedure is undefined if the parameter is 0”).

In a Practicum section, you can give some sample interactions with your procedure. We find that many programmers best understand the purpose of programs through examples, and the Practicum section gives you the opportunity to give clarifying examples.
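For instance, a Practicum section for the square procedure documented earlier might look something like the following. The layout of the sample interactions (a prompt line followed by the result) is only one possible convention, not a required format.

```scheme
;;; Procedure:
;;;   square
;;; Parameters:
;;;   val, an integer
;;; Purpose:
;;;   Computes the square of val.
;;; Produces:
;;;   result, an integer
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   result = val*val
;;; Practicum:
;;;   > (square 5)
;;;   25
;;;   > (square -3)
;;;   9
;;;   > (square 0)
;;;   0
(define square
  (lambda (val)
    (* val val)))
```

Choosing examples that include a negative input and zero gives the reader a quick sense of how the procedure treats the less obvious cases.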

In a Process section, you can discuss a bit about the algorithm you use to go from parameters to product. In general, the client should not need to know your algorithm, but there are times when it is helpful to reveal a bit about the algorithm.

In a Philosophy section, you can discuss the broader role of this procedure. Often, procedures are written as part of a larger collection. This section gives you an opportunity to specify these relationships.

At least one of the faculty who uses the six-P notation often adds a Ponderings section for assorted notes about the procedure, its implementation, or its use.

In an overly-ambitious attempt to stick within the constraints of this notation, at least one faculty member adds a Props section to provide citations and other crediting of work.

You may find other useful P’s as you program. Feel free to introduce them into the documentation. (Feel free to use terms that do not begin with P, too.)

Self Checks

Check 1: Documenting `invert`

Write the 6P documentation for the inverse procedure you wrote for the reading on procedures.

Hint: Think carefully about restrictions on the input and why the resulting values of the first and last examples are different.


This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
