Documentation, revisited

Due: Friday, 15 February 2019
Summary: We return to 6P-style documentation, considering, in particular, some subtleties of preconditions and postconditions.
Prerequisites: Documenting your procedures.

Introduction

As you may have noted from reading the documentation we’ve provided, and from starting to write documentation on your own, the documentation that accompanies a procedure can serve a variety of important purposes. Here are two.

When we write documentation before we write a procedure, it can help us think through the goals of the procedure, particularly if we try to document things carefully or include a “Practicum” section.
When we are using a procedure someone else has written, the documentation can allow us to better understand what the procedure does without reading the details of the code.

In some situations, an informal description of the purpose of a procedure can leave some open questions about what it does. Consider, for example, the following documentation for a shadow procedure.

;;; Procedure:
;;;   shadow
;;; Parameters:
;;;   img, an image
;;; Purpose:
;;;   Add a shadow to img.
;;; Produces:
;;;   shadowed, an image

This documentation contains some important information. For example, we know the general goal of the procedure, the type of input we must provide, and the type of output we receive. However, it leaves some issues unstated. For example, if we are concerned about arranging the image with other images, we might want to know something about its width and height. If there are limitations on the image (perhaps it must have a certain minimum size), we want to know that, too.

For situations like that, we also document preconditions and postconditions.

;;; Preconditions:
;;;   * (image-width img) >= 10
;;;   * (image-height img) >= 10
;;; Postconditions:
;;;   * (image-width shadowed) = (image-width img)
;;;   * (image-height shadowed) = (* 1.5 (image-height img))

As this example may suggest, the last two P’s not only give more detail, they also give a much more formal definition of the purpose of the procedure, typically using mathematics or code to ensure precision.

Documenting preconditions and postconditions

What should you think about as you write preconditions and postconditions?

The Preconditions section provides additional details on the valid inputs the procedure accepts and the state of the programming system that is necessary for the procedure to work correctly. For example, if a procedure depends on a value being named elsewhere, the dependency should be named in the preconditions section. The preconditions section can also be used to include restrictions that would be too cumbersome to put in the parameters section. For example, in many programming languages, there is a cap on the size of integers. In such languages, a square procedure would need a precondition that the absolute value of its input be less than or equal to the square root of the largest integer.
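
To make the capped-integer example concrete, here is a minimal sketch. Racket's own integers have no such cap, so the limit below (max-int, set to 2^31 - 1) and the procedure names are assumptions made for illustration; they mimic a language with 32-bit signed integers.

```racket
#lang racket

;; Assumption: a hypothetical language whose largest integer is 2^31 - 1.
;; (Racket itself supports arbitrarily large exact integers.)
(define max-int (- (expt 2 31) 1))

;; The largest absolute value whose square does not exceed max-int.
(define max-square-input (integer-sqrt max-int))

;;; Procedure:
;;;   capped-square-input?
;;; Parameters:
;;;   n, an integer
;;; Purpose:
;;;   Determine whether n meets the precondition
;;;     (abs n) <= (integer-sqrt max-int)
;;; Produces:
;;;   ok?, a Boolean
(define (capped-square-input? n)
  (<= (abs n) max-square-input))

max-square-input                 ; 46340
(capped-square-input? 46340)     ; #t, since 46340^2 = 2147395600
(capped-square-input? -46341)    ; #f, since 46341^2 = 2147488281
```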

When documenting your procedures, you may wish to note whether a precondition is verified (in which case you should make sure to print an error message) or unverified (in which case your procedure may return any value). At this point in your career, we will assume that most preconditions are unverified.
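
The difference between the two kinds of precondition can be sketched as follows. Both procedures are hypothetical examples, not part of any library; each is documented as requiring a nonzero input, but only the second one checks.

```racket
#lang racket

;; Unverified precondition: the documentation says val must be nonzero,
;; but the code does not check. What happens for 0 is left to Racket.
(define (reciprocal-unverified val)
  (/ 1 val))

;; Verified precondition: the code checks the input and reports an
;; error message that names the procedure and the offending value.
(define (reciprocal-verified val)
  (when (zero? val)
    (error 'reciprocal-verified "expected a nonzero number, given ~a" val))
  (/ 1 val))

(reciprocal-unverified 4)    ; 1/4
(reciprocal-verified 4)      ; 1/4
```

Verification costs a little extra code and a little extra time on every call, which is one reason many preconditions are left unverified.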

Note: When documenting preconditions, we generally don’t duplicate the type restrictions given in the Parameters section. You can assume that those are implicit preconditions. At times, those are the only preconditions.

The Postconditions section provides a more formal description of the results of the procedure. For example, we might say informally that a procedure reverses a list. In the postconditions section, we provide formulae that indicate what it means to have reversed the list.
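
For the list-reversal example, one plausible pair of formulae is shown in the comment below, together with a small procedure that checks them. The name reversal-of? is an assumption made for this sketch; it is not a standard procedure.

```racket
#lang racket

;;; Possible postconditions for a procedure that reverses lst,
;;; producing reversed:
;;;   * (length reversed) = (length lst)
;;;   * For all i, 0 <= i < (length lst)
;;;       (list-ref reversed i) = (list-ref lst (- (length lst) i 1))
(define (reversal-of? reversed lst)
  (let ([len (length lst)])
    (and (= (length reversed) len)
         ;; Compare each position with its mirror-image position.
         (for/and ([i (in-range len)])
           (equal? (list-ref reversed i)
                   (list-ref lst (- len i 1)))))))

(reversal-of? (reverse (list 1 2 3)) (list 1 2 3))   ; #t
(reversal-of? (list 1 2 3) (list 1 2 3))             ; #f
```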

Typically, some portion of the preconditions and postconditions are expressed with formulae or code.

How do you decide what preconditions and postconditions to write? It takes practice to get good at it. We usually start by thinking about what inputs we are sure it works on and what inputs we are sure that it doesn’t work on. We then try to refine that understanding so that for any value someone presents, we can easily decide which category is appropriate.

For example, if we design a procedure to work on numbers, our general sense is that it will work on numbers, but not on things that are not numbers. Next, we start to think about what kinds of numbers it won’t work on. For example, will it work correctly with real numbers, with imaginary numbers, with negative numbers, with really large numbers? As we reflect on each case, we refine our understanding of the procedure, and get closer to writing a good precondition.

The postcondition is a bit more tricky. We try to phrase what we expect of the procedure as concisely and clearly as we can, frequently using math, code when it’s clearer than the math, and English when we can’t quite figure out what math or code to write. But we always remember, consciously or subconsciously, that English is ambiguous, so we try to use formal notations whenever possible.

You may recall the following postconditions from our earlier discussion of documentation.

;;; Procedure:
;;;   sqr
;;; Parameters:
;;;   num, a number
;;; Purpose:
;;;   Compute the square of num
;;; Produces:
;;;   result, a number
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   If num is exact and real, (sqrt result) = (abs num)
;;;   If num is inexact and real, (sqrt result) approximates (abs num)
;;;   result has the same "type" as num
;;;     If num is an integer, result is an integer
;;;     If num is real, result is real
;;;     If num is exact, result is exact
;;;     If num is inexact, result is inexact
;;;     And so on and so forth

In this case, we’ve tried to provide the client with a host of information. As you have no doubt seen in your work with numbers in Racket, it’s important to know whether we are dealing with approximate (inexact) or exact numbers. It may also be nice to know if we’re dealing with integers, real numbers, or complex numbers. The postconditions allow us to specify all of that without making the purpose section overly long.
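
A few sample calls illustrate the "type" postconditions. This sketch assumes the sqr procedure that `#lang racket` provides; the expected values appear in the comments.

```racket
#lang racket

(sqr 5)                ; 25, an exact integer
(sqr 1/2)              ; 1/4, exact but not an integer
(sqr 1.5)              ; 2.25, inexact

(integer? (sqr 5))     ; #t
(exact? (sqr 1/2))     ; #t
(inexact? (sqr 1.5))   ; #t

(sqrt (sqr 3))         ; 3
```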

An extended example: Raising grades

Let’s look at another example. Suppose a colleague in another department comes to us and says “I was too harsh on the last exam; the average was only 60. The average should really be closer to 85. I need a program to scale the grades appropriately.” We start by thinking about the documentation. We’ll need to ask a few questions first.

Do you represent grades as whole numbers, or can grades have a fraction?

Oh, grades can have a fractional portion. For example, a student might have 72.4.

When you say that the average should be close to 85, do you mean that we should add 25 (that’s 85-60) or that we should multiply by 85/60?

Let’s just add 25.

Here’s a first attempt at the documentation.

;;; Procedure:
;;;   scale-grades
;;; Parameters:
;;;   grades, a list of real numbers
;;; Purpose:
;;;   scale all of the grades in the list so that the average is 85.
;;; Produces:
;;;   scaled-grades, a list of real numbers
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   For each grade in the list, we have added 25 points.

Is that enough? Probably not. It’s a bit vague how the grades in the second list correspond exactly to the grades in the first list. We could, for example, achieve the “average 85” goal by just creating a list with one element whose value is 85. Even if we cover all the grades, must they be in the same order? Let’s clarify.

;;; Procedure:
;;;   scale-grades
;;; Parameters:
;;;   grades, a list of real numbers
;;; Purpose:
;;;   scale all of the grades in the list
;;; Produces:
;;;   scaled-grades, a list of real numbers
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   * (length scaled-grades) = (length grades)
;;;   * (average scaled-grades) is approximately 85
;;;   * For all i, 0 <= i < (length grades)
;;;       (list-ref scaled-grades i) = (+ 25 (list-ref grades i))

The use of Scheme clarifies things a bit. But we’re not done yet. We should think about the possible limits of grades. Can a student have a negative grade? We’d hope not. Can a student have a grade over 100 (e.g., if they started with a relatively high grade)? Let’s suppose we’ve asked and been told that the maximum grade is 105.

;;; Procedure:
;;;   scale-grades
;;; Parameters:
;;;   grades, a list of non-negative real numbers.
;;; Purpose:
;;;   scale all of the grades in the list.
;;; Produces:
;;;   scaled-grades, a list of non-negative real numbers.
;;; Preconditions:
;;;   All numbers in grades are <= 105.
;;; Postconditions:
;;;   (length scaled-grades) = (length grades)
;;;   (average scaled-grades) is approximately 85
;;;   For all i, 0 <= i < (length grades)
;;;     If (list-ref grades i) <= 80
;;;       (list-ref scaled-grades i) = (+ 25 (list-ref grades i))
;;;     Otherwise
;;;       (list-ref scaled-grades i) = 105
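
The per-grade rule in this documentation could be implemented along the following lines. This is a minimal sketch, assuming the 25-point adjustment agreed on earlier and the 105-point cap; the helper name scale-grade is an assumption made for illustration.

```racket
#lang racket

;; Scale one grade: add 25 points, but never go above 105.
;; A grade of 80 or less gains the full 25 points; anything higher
;; lands exactly on the cap.
(define (scale-grade grade)
  (min 105 (+ 25 grade)))

;; Scale every grade in the list, keeping the grades in order.
(define (scale-grades grades)
  (map scale-grade grades))

(scale-grades (list 50 72.5 80 95))   ; '(75 97.5 105 105)
```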

But wait! If we’re not scaling all grades the same, then we may not achieve the average of 85. We’ll need to drop that guarantee and perhaps think of another one. It’s hard to say clearly what we’ll get. The scaled average has to be higher than the old average. It can’t be higher than 85. So let’s go with the following.

;;;   60 <= (average scaled-grades) <= 85

But can we really guarantee that? We’re taking their word that the average grade is 60. Maybe we should just rely on what we know.

;;;   (average grades) < (average scaled-grades) <= 85

Are we sure about that? Adding 25 makes a grade larger, and because none of the grades starts greater than 105, capping at 105 can’t accidentally reduce one. Yes, that’s safe. Of course, if the average started out higher than 85, this doesn’t work. So maybe we want to add the starting average as a precondition.

;;; Preconditions:
;;;   All numbers in grades are <= 105.
;;;   (average grades) = 60

That would fix our earlier postcondition problem, too. But we’ll leave the postcondition as is. Are we done? Not quite. We’ve “hard coded” the strategy in our documentation. What if they decide to have us use a different formula, such as adding 10 and then multiplying by 85/70? Perhaps we should guarantee less.

;;; Procedure:
;;;   scale-grades
;;; Parameters:
;;;   grades, a list of non-negative real numbers.
;;; Purpose:
;;;   scale all of the grades in the list.
;;; Produces:
;;;   scaled-grades, a list of non-negative real numbers.
;;; Preconditions:
;;;   All numbers in grades are <= 105.
;;;   (average grades) = 60
;;; Postconditions:
;;;   (length scaled-grades) = (length grades)
;;;   (average grades) < (average scaled-grades) <= 85
;;;   For all i, 0 <= i < (length grades)
;;;     (list-ref scaled-grades i) >= (list-ref grades i)
;;;     (list-ref scaled-grades i) <= 105

No, that doesn’t seem quite right. We haven’t, for example, guaranteed that the scaling is “fair”, in that grades retain their order. We’ll add one more postcondition.

;;;   For all i,j, 0 <= i,j < (length grades)
;;;     if (list-ref grades i) <= (list-ref grades j) then
;;;       (list-ref scaled-grades i) <= (list-ref scaled-grades j)

That’s better. Will it stop the programmer from just giving everyone a grade of 105? Yes, since that would fail to meet the average postcondition.
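
One way to see how these postconditions constrain an implementer is to write them as a check. The sketch below is an illustration under stated assumptions: average is taken to be the arithmetic mean (the reading uses it without defining it), scale-grades is one possible implementation (add 25, cap at 105), the per-grade comparison uses >= so that a grade already at 105 may stay there, 105 is treated as an upper bound, and the name meets-postconditions? is hypothetical.

```racket
#lang racket

;; Assumption: average is the arithmetic mean of a nonempty list.
(define (average nums)
  (/ (apply + nums) (length nums)))

;; One possible implementation: add 25, but never exceed 105.
(define (scale-grades grades)
  (map (lambda (grade) (min 105 (+ 25 grade))) grades))

;; Check a candidate result against the postconditions.
(define (meets-postconditions? grades scaled)
  (and (= (length scaled) (length grades))
       (< (average grades) (average scaled))
       (<= (average scaled) 85)
       ;; No grade is lowered, and no grade exceeds 105.
       (for/and ([old grades] [new scaled])
         (and (>= new old) (<= new 105)))
       ;; Fairness: whenever grade i <= grade j, the scaled grades
       ;; must be in the same order.
       (for*/and ([i (in-range (length grades))]
                  [j (in-range (length grades))])
         (or (> (list-ref grades i) (list-ref grades j))
             (<= (list-ref scaled i) (list-ref scaled j))))))

(define sample-grades (list 20 60 100))     ; the average is 60

(meets-postconditions? sample-grades (scale-grades sample-grades))  ; #t
(meets-postconditions? sample-grades (list 105 105 105))            ; #f
```

The second call fails because an average of 105 exceeds 85, which is the argument made above.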

There’s still more we might consider as we think about what to guarantee. For example, it could be useful to make the desired average a parameter to the procedure so that we can generalize it further. But we will leave such generalization for the future.

It took a bit of effort to get the documentation right, or close enough to right. We hope that it was useful effort. First, it required us to carefully think through what we wanted the procedure to do and to differentiate aspects of our current implementation from the more general goals. Second, it required us to think about special cases. We’ll find that many of the procedures we write work fine on many cases, but not on the more extreme cases, which we will often call “edge cases” or “corner cases”. In this instance, the procedure behaved differently on high grades. Finally, we had to balance the needs of the client programmer and the implementing programmer. You’ll find that a lot of procedure design requires such a balancing act.

Preconditions and postconditions as contracts

We sometimes find it helpful to consider the preconditions and postconditions as a form of contract with the client programmer: If the client programmer meets the type specified in the parameters section and the preconditions specified in the preconditions section, then the procedure must meet the postconditions specified in the postconditions section.

As with all contracts, there is therefore a bit of adversarial balance between the preconditions and postconditions. The implementer’s goal is to do as little as possible to meet the postconditions, which means that the client’s goal is to make the postconditions specify the goal in such a way that the implementer cannot make compromises. Similarly, one of the client’s goals may be to break the procedure (so that he may sue or reduce payment to the implementer), so the implementer needs to write the preconditions and parameter descriptions in such a way that she can ensure that any parameters that meet those descriptions can be used to compute a result.

Just in case you weren’t sure: The way we’ve described the adversarial relationship is clearly hyperbole. Nonetheless, it’s useful to think hyperbolically to ensure that we write the best possible preconditions and postconditions.
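
Racket happens to include a contract library that can enforce some preconditions and postconditions automatically. The following is a minimal sketch, not a required part of the six-P style; the names grade/c and scale-grade are assumptions made for illustration, and the contract captures only the range of valid grades, not every postcondition discussed above.

```racket
#lang racket

;; A grade, as described in the extended example: a real number
;; between 0 and 105, inclusive.
(define grade/c (and/c real? (between/c 0 105)))

;; The contract states what the client must supply (a valid grade)
;; and what the implementer must return (also a valid grade).
(define/contract (scale-grade grade)
  (-> grade/c grade/c)
  (min 105 (+ 25 grade)))

(scale-grade 70)     ; 95

;; (scale-grade 200) would signal a contract violation, because 200
;; does not meet the precondition.
```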

Fewer P’s

As the examples above suggest, the preconditions and postconditions help you think more carefully about exactly what you want the procedure to do and help others understand that, too. But preconditions and postconditions can take a lot of thought. In many cases when you are writing “obvious” procedures or procedures whose primary audience is you, you may choose to write 4P’s, and do without preconditions and postconditions. We will try to indicate when only 4P’s are appropriate.

More P’s

Although we typically suggest using the basic six P’s (procedure, parameters, purpose, produces, preconditions, and postconditions) to document your procedures, there are a few other optional sections that you may wish to include. For the sake of alliteration, we also begin those sections with the letter P.

In a Package section, you might name the group of code the procedure is associated with. For example, the section procedure is in the package loudhum.

In a Problems section, you might note special cases or issues that are not sufficiently covered. Typically, all the problems are handled by eliminating invalid inputs in the preconditions section, but until you have a carefully written preconditions section, the problems section provides additional help (e.g., “the behavior of this procedure is undefined if the parameter is 0”).

In a Practicum section, you can give some sample interactions with your procedure. We find that many programmers best understand the purpose of programs through examples, and the Practicum section gives you the opportunity to give clarifying examples.

In a Process section, you can discuss a bit about the algorithm you use to go from parameters to product. In general, the client should not know your algorithm, but there are times when it is helpful to reveal a bit about the algorithm.

In a Philosophy section, you can discuss the broader role of this procedure. Often, procedures are written as part of a larger collection. This section gives you an opportunity to specify these relationships.

At least one of the faculty who uses the six-P notation often adds a Ponderings section for assorted notes about the procedure, its implementation, or its use.

In an overly-ambitious attempt to stick within the constraints of this notation, at least one faculty member adds a Props section to provide citations and other crediting of work.
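
As an illustration of one of these optional sections, here is a sketch of documentation that includes a Practicum section. The average procedure is an assumption made for this example (an arithmetic mean); the sample interactions show what a client would see in the interactions pane.

```racket
#lang racket

;;; Procedure:
;;;   average
;;; Parameters:
;;;   nums, a nonempty list of real numbers
;;; Purpose:
;;;   Compute the arithmetic mean of nums
;;; Produces:
;;;   avg, a real number
;;; Practicum:
;;;   > (average (list 60 70 80))
;;;   70
;;;   > (average (list 1 2))
;;;   3/2
;;;   > (average (list 72.5 77.5))
;;;   75.0
(define (average nums)
  (/ (apply + nums) (length nums)))
```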

You may find other useful P’s as you program. Feel free to introduce them into the documentation. (Feel free to use terms that do not begin with P, too.)

Self checks

Check 1: Why document?

In the introduction, we wrote

When we write documentation before we write a procedure, it can help us think through the goals of the procedure, particularly if we try to document things carefully or include a “Practicum” section.

How does the grade scaling example support or refute that claim?

Check 2: How many P’s?

We tend to focus on the six P’s. But there are others. How many P’s total? And what are they? Can you think of even more that might be useful?

Acknowledgements

Much of this reading is based on a prior reading on documentation taken from the Spring 2018 version of CSC 151. We’ve split that reading into two parts.

Copyright © Charlie Curtsinger, Sarah Dahlby Albright, Janet Davis, Fahmida Hamid, Titus Klinge, Samuel A. Rebelsky, and Jerod Weinman. Selected materials are copyright by John David Stone or Henry Walker and are used with permission.

Unless specified otherwise elsewhere on this page, this work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
