Fundamentals of Computer Science I: Media Computing (CS151.01 2008S)

Recursion with Helper Procedures


Summary: You've now seen a bit of the basics of recursion as a program design strategy. We now consider an alternate technique for writing recursive procedures, one in which we build helpers that carry along a partial result at each step. We also see how such helpers can be used to create tail recursive solutions, which are often more efficient than more basic recursive procedures.

Introduction

There are a few basic principles to writing recursive functions. As you may recall from the first reading on recursion, a recursive procedure is one that calls itself. To successfully write a recursive procedure, you need

  • a base case, in which you compute the result without a recursive call;
  • a recursive case, in which you compute a partial result recursively, and then expand it into the complete result; and
  • a test, in which you decide whether to use the base case or the recursive case.

A disadvantage of traditional recursion is that it is difficult to tell what is happening along the way. In the particular case of list recursion, we work all the way down to the end of the list, and then work our way back to the front of the list to compute a final result. Why not just compute the result as we go? We consider that idea in this reading.

Review: Scheme's Evaluation Strategy

Before we go much further, let's take a quick detour into an important concept that we've discussed before: When Scheme has a nested expression to evaluate, how does it evaluate that expression? The technique it uses is pretty straightforward: In order to apply a procedure to parameters, those parameters must be in simple form (that is, values, rather than expressions to compute values). Hence, given (proc exp1 exp2 exp3), Scheme first evaluates exp1, exp2, and exp3, looks up proc in the environment, and then applies the procedure it finds to the three resulting values.

In what order does it evaluate exp1, exp2, and exp3? It turns out that it depends on the implementation of Scheme. In our sample code, we'll always assume that Scheme evaluates a procedure from left to right.

Of course, for some operations (which we call “keywords” or “syntax”), Scheme uses a different evaluation strategy. For example, when evaluating an if expression, Scheme does not evaluate all three parameters. Instead, it evaluates them in a specified order (first the test, then only one of the consequent and alternate). Similarly, when evaluating an and or an or, Scheme evaluates the parameters one at a time, stopping as soon as it knows an answer (in the case of and, when it hits a #f; in the case of or, when it hits a #t).

But there's more. To understand how Scheme evaluates expressions, we must also understand how it applies the procedures you define and how it processes definitions in general.

The naming scheme that Scheme uses is relatively straightforward. Scheme keeps a table of correspondences between names and values. For each name, it stores a reference to the value, which means that two names can refer to the same value. (This is important to know, because if you change a value with a side-effecting function, then all references to that value seem to change.) If you write another definition using the same name, it changes the reference, but not the underlying value. When Scheme sees a name while evaluating an expression, it looks up the name in the table, and uses the value the name refers to.

The way procedures work takes advantage of this naming scheme. When you call a procedure on some arguments, the Scheme interpreter first updates the name table, mapping each name in the lambda to the corresponding argument. For example, if you apply (lambda (a b c) ...) to 3, 1, and 6, the table will now have a refer to 3, b refer to 1, and c refer to 6. After updating the table, the interpreter evaluates the body of the procedure. When it finishes evaluating the body, it cleans up the name table, removing any new entries it had added.

Why is the way Scheme evaluates and understands expressions important for recursion? First, it may help you understand how recursion works. Because if (or cond), delays the evaluation of the consequent and alternate, we can safely have a procedure call itself without worrying about it calling itself again and again and again, ad infinitum. Second, it clarifies how we can have the same names (the parameters) mean different things at different times.

An Alternative Sum

Now, let's return to our goal: Looking at other ways to do recursion. We'll consider the question of computing results as we go with the summation example from the previous reading. As you may have noted, an odd thing about the sum procedure is that it works from right to left. Traditionally, we sum from left to right. Can we rewrite sum to work from left to right? Certainly, but we may need a helper procedure (another procedure whose primary purpose is to assist our current procedure) to do so.

If you think about it, when you're summing a list of numbers from left to right, you need to keep track of two different things:

  • The sum of the values seen so far.
  • The remaining values to add.

Hence, we'll build our helper procedure with two parameters, sum-so-far and remaining. We'll start the body with a template for recursive procedures (a test to determine whether to use the base case or recursive case, the base case, and the recursive case). We'll then fill in each part.

(define new-sum-helper
  (lambda (sum-so-far remaining)
     (if (test)
         base-case
         recursive-case)))

The recursive case is fairly easy. Recall that since remaining is a list, we can split it into two parts, the first element (that is, the car), and the remaining elements (that is, the cdr). Each part contributes to one of the parameters of the recursive call. We update sum-so-far by adding the first element of remaining to sum-so-far. We update remaining by removing the first element. To “continue”, we simply call new-sum-helper again with those updated parameters.

(define new-sum-helper
  (lambda (sum-so-far remaining)
     (if (test)
         base-case
         (new-sum-helper (+ sum-so-far (car remaining))
                         (cdr remaining)))))

The recursive case then gives us a clue as to what to use for the test. We need to stop when there are no elements left in the list. (Otherwise, taking the car and the cdr would create errors.)

(define new-sum-helper
  (lambda (sum-so-far remaining)
     (if (null? remaining)
         base-case
         (new-sum-helper (+ sum-so-far (car remaining))
                         (cdr remaining)))))

We're almost done. What should the base case be? In the previous version, it was 0. However, in this case, we've been keeping a running sum. When we run out of things to add, the value of the complete sum is the value of the running sum.

(define new-sum-helper
  (lambda (sum-so-far remaining)
     (if (null? remaining)
         sum-so-far
         (new-sum-helper (+ sum-so-far (car remaining))
                         (cdr remaining)))))

Now we're ready to write the primary procedure whose responsibility it is to call new-sum-helper. Like sum, new-sum will take a list as a parameter. That list will become remaining. What value should sum-so-far begin with? Since we have not yet added anything when we start, it begins at 0.

(define new-sum
  (lambda (numbers)
    (new-sum-helper 0 numbers)))
Putting it all together, we get the following.
;;; Procedure:
;;;   new-sum
;;; Parameters:
;;;   numbers, a list of numbers.
;;; Purpose:
;;;   Find the sum of the elements of a given list of numbers
;;; Produces:
;;;   total, a number.
;;; Preconditions:
;;;   All the elements of numbers must be numbers.
;;; Postcondition:
;;;   total is the result of adding together all of the elements of numbers.
;;;   If all the values in numbers are exact, total is exact.
;;;   If any values in numbers are inexact, total is inexact.
(define new-sum
  (lambda (numbers)
    (new-sum-helper 0 numbers)))

;;; Procedure:
;;;   new-sum-helper
;;; Parameters:
;;;   sum-so-far, a number.
;;;   remaining, a list of numbers.
;;; Purpose:
;;;   Add sum-so-far to the sum of the elements of a given list of numbers
;;; Produces:
;;;   total, a number.
;;; Preconditions:
;;;   All the elements of remaining must be numbers.
;;;   sum-so-far must be a number.
;;; Postcondition:
;;;   total is the result of adding together sum-so-far and all of the
;;;     elements of remaining.
;;;   If both sum-so-far and all the values in remaining are exact,
;;;     total is exact.
;;;   If either sum-so-far or any values in remaining are inexact,
;;;     total is inexact.
(define new-sum-helper
  (lambda (sum-so-far remaining)
     (if (null? remaining)
         sum-so-far
         (new-sum-helper (+ sum-so-far (car remaining))
                         (cdr remaining)))))

Note that it is a bit difficult to write the documentation for the helper procedure, since we have to think about the role of the sum-so-far parameter in the context of the overall goal. At the same time, doing so helps us clarify exactly what we expect the procedure to do. It also helps us verify that the base case is appropriate. In this case, since we're supposed to add the intermediate sum to the sum of the remaining values, when there are no remaining values, we can just add 0 and stop.

Watching New Sum

Does this change make a difference in the way in which the sum is evaluated? Let's watch.

   (new-sum (cons 38 (cons 12 (cons 83 null)))) 
=> (new-sum-helper 0 (cons 38 (cons 12 (cons 83 null)))) 
=> (new-sum-helper (+ 0 38) (cons 12 (cons 83 null)))
=> (new-sum-helper 38 (cons 12 (cons 83 null)))
=> (new-sum-helper (+ 38 12) (cons 83 null))
=> (new-sum-helper 50 (cons 83 null))
=> (new-sum-helper (+ 50 83) null)
=> (new-sum-helper 133 null)
=> 133

Note that the intermediate results for new-sum were different, primarily because new-sum operates from left to right.

The Technique, Generalized

Okay, how does one write one of these procedures? First, you build a helper procedure that takes as parameters the list you're recursing over, which we've called remaining, a new parameter, ____-so-far, and any other useful parameters. The helper recurses over the list, updating that list to its cdr at each step and updates ____-so-far to be the next partial result. At the end, it typically returns ____-so-far or something computed from that value.

After creating the helper, you build the main procedure, which calls the helper with an appropriate initial ____-so-far and remaining. For example,

(define proc
  (lambda (lst)
    (proc-helper initial-value lst)))
(define proc-helper
  (lambda (so-far remaining)
    (if (null? remaining)
        so-far
        (proc-helper (combine so-far (car remaining))
                     (cdr remaining)))))

However, as we'll see in the next section, you sometimes need to use different initial values for so-far and remaining.

Another Application: Difference

We've now seen two strategies for doing recursion: We can combine the result of a recursive call into a final result, or we can pass along intermediate results as we recurse, and stop with the final result. Which should you use? For many applications, it doesn't matter. However, there are times that one technique is much more appropriate than the other. Consider a procedure that takes a list of numbers, n1, n2, n3, ... nk-1, nk, and computes n1 - n2 - n3 - ... - nk-1 - nk.

Suppose we use the first technique. We might write:

(define difference
  (lambda (lst)
    (if (null? lst)
        0
        (- (car lst) (difference (cdr lst))))))

Unfortunately, this procedure doesn't quite work as you might expect. Consider the computation of (difference (list 1 2 3)). As you may recall, subtraction is left associative, so this should be (1-2)-3, or -4. Let's see what happens.

   (difference (cons 1 (cons 2 (cons 3 null))))
=> (- 1 (difference (cons 2 (cons 3 null))))
=> (- 1 (- 2 (difference (cons 3 null))))
=> (- 1 (- 2 (- 3 (difference null))))
=> (- 1 (- 2 (- 3 0)))
=> (- 1 (- 2 3))
=> (- 1 -1)
=> 2

Hmmm ... that's not quite right. So, let's try the new technique of using a helper.

(define new-difference
  (lambda (lst)
    (new-difference-helper 0 lst)))

(define new-difference-helper
  (lambda (difference-so-far remaining)
    (if (null? remaining)
        difference-so-far
        (new-difference-helper (- difference-so-far (car remaining))
                               (cdr remaining)))))

Okay, what happens here?

   (new-difference (list 1 2 3))
=> (new-difference-helper 0 (cons 1 (cons 2 (cons 3 null))))
=> (new-difference-helper (- 0 1) (cons 2 (cons 3 null)))
=> (new-difference-helper -1 (cons 2 (cons 3 null)))
=> (new-difference-helper (- -1 2) (cons 3 null))
=> (new-difference-helper -3 (cons 3 null))
=> (new-difference-helper (- -3 3) null)
=> (new-difference-helper -6 null)
=> -6

Nope. Still incorrect. So, what happened? We started with 0, then we subtracted 1, then we subtracted 2, then we subtracted 3. It looks like we did those last two subtractions in order. But wait! Why are we subtracting 1? That's the value we're supposed to subtract everything else from. That means we need to choose a better initial difference-so-far and remaining. Why not make the car the difference-so-far and the cdr remaining?

(define newer-difference
  (lambda (lst)
    (new-difference-helper (car lst) (cdr lst))))

Does this work?

   (newer-difference (list 1 2 3))
=> (new-difference-helper 1 (cons 2 (cons 3 null)))
=> (new-difference-helper (- 1 2) (cons 3 null))
=> (new-difference-helper -1 (cons 3 null))
=> (new-difference-helper (- -1 3) null)
=> (new-difference-helper -4 null)
=> -4

It works! Well, it works for this example. In the lab, we'll explore whether it works for other examples, too.

Watching Recursive Helpers

Sometimes, students ask “Can I watch the recursion?” Unfortunately, it is difficult to make Scheme show you all the steps in a typical recursive procedure, there is no way to access the delayed actions. But helper recursion makes life a bit easier, since we typically don't have delayed actions. If you want to watch what is happening in a helper recursive procedure, you can add commands to print things out (immediately after the lambda).

For example, here is an annotated version of the last version of the difference procedure.

(define annotated-difference
  (lambda (lst)
    (annotated-difference-helper (car lst) (cdr lst))))
(define annotated-difference-helper
  (lambda (difference-so-far remaining)
    (write (list 'difference-helper difference-so-far remaining)) (newline)
    (if (null? remaining)
        difference-so-far
        (annotated-difference-helper (- difference-so-far (car remaining))
                                     (cdr remaining)))))

Here's some sample output.

> (annotated-difference (list 1 2 3 4 5))
-13
(difference-helper 1 (2 3 4 5))
(difference-helper -1 (3 4 5))
(difference-helper -4 (4 5))
(difference-helper -8 (5))
(difference-helper -13 ())

Terminology: Tail Recursion

You have probably noticed an important difference between the first form of recursion we used and the helper version: In the original form, once we reached the base case, we had to back up and do more computation. In this new form, it seems that once we reach the base case, we're done. Behind the scenes for the original form, the Scheme interpreter has to spend extra effort to remember the extra work to be done after the base case is reached. In this new form, it does not.

When the computation in a recursive procedure is completed when it reaches the base case, the procedure is called tail recursive. A tail-recursive version of a procedure is often (but not always) more efficient than the non-tail-recursive version. Hence, many Scheme programmers regularly convert their procedures to tail-recursive form. How do you convert to this form? Most frequently, you use the techniques in this reading: You create a recursive helper procedure that accumulates the value to be computed.

Recursion and Efficiency

When we write recursive procedures carelessly, we can create very inefficient computation. Let us consider, for a moment, the problem of computing the brightest color in a list of colors.

Before writing any code, we should, of course write the introductory documentation.

;;; Procedure:
;;;   rgb-brightest
;;; Parameters:
;;;   colors, a list of RGB colors
;;; Purpose:
;;;   Find the brightest color in the list.
;;; Produces:
;;;   brightest, an RGB color.
;;; Preconditions:
;;;   rgb-brightness is defined.
;;;   colors contains at least one element.
;;; Postconditions:
;;;   For any color in colors,
;;;    (rgb-brightness color) <= (rgb-brightness brightest)
;;;   brightest is an element of colors.

Now we are ready to write some code. Let us first consider a basic recursive solution. We start with the template of a recursive procedure.

(define rgb-brightest
  (lambda (colors)
    (if (test)
        base-case
        recursive-case)))

The test and base case are straightforward: If there is only one value left in the list, it is clearly the brightest color.

(define rgb-brightest
  (lambda (colors)
    (if (null? (cdr colors))
        (car colors)
        recursive-case)))

But what about the recursive case? Suppose we compute the brightest color in the remainder of the list. What does that tell us about the brightest color in the list? Well, the brightest color is either the brightest color in the remainder or the first color, whichever is brighter. We can express that choice as a second conditional.

         (if (>= (rgb-brightness (car colors)) 
                 (rgb-brightness (rgb-brightest (cdr colors))))
             (car colors)
             (rgb-brightest (cdr colors)))

When we plug that into the rest of the procedure, we get

(define rgb-brightest
  (lambda (colors)
    (if (null? (cdr colors))
        (car colors)
        (if (>= (rgb-brightness (car colors)) 
                (rgb-brightness (rgb-brightest (cdr colors))))
            (car colors)
            (rgb-brightest (cdr colors))))))

Most Scheme programmers, when confronted with these nested if statements, will turn them into a cond statement.

(define rgb-brightest
  (lambda (colors)
    (cond
      ((null? (cdr colors))
       (car colors))
      ((>= (rgb-brightness (car colors)) 
	   (rgb-brightness (rgb-brightest (cdr colors))))
       (car colors))
      (else
       (rgb-brightest (cdr colors))))))

Both of these solutions are correct. However, both are quite inefficient. What makes them inefficient? The two recursive calls. Normally, you wouldn't think that two calls would make a huge difference (well, maybe a factor of two). However, for recursive procedures, the expense of the extra call grows exponentially.

Consider the particular case of of a list of five elements in which the last element is the brightest. In the test, we recurse on the list of four elements in the test. When that recursive call finishes, we recurse again on the same list in order to compute the alternate. Now, let us consider one of those identical recursive calls. Once again, the test requires us to recurse on the cdr (the list of three elements). And, once again, we need to recurse on that same list hen we compute the alternate. Since we had to find the brightest element of the list of four elements twice, we need to find the brightest element of the list of three elements four times.

Now, each of those computations (that is, finding the brightest in a list of three elements), needs to recurse on a list of two elements twice, once for the test and once for the alternate. Given that we found the brightest element of the same list of three elements four times, we find the brightest element of the same list of two elements eight times.

We're almost done with the analysis. For each of those identical computations of the brightest element in a list of two elements, we find the brightest element in the singleton list twice (once for the test, once for the alternate). Since there are eight calls involving the list of two elements, there are sixteen calls involving the list of one element.

How calls to rgb-brightest did we end up with? Let's see ... one for the list of five elements, two for the sublist of four elements, four for the sublist of three elements, eight for the sublist of two elements, and sixteen for the sublist of one element. 1 + 2 + 4 + 8 + 16 is 31. That's a lot of calls for a list of five colors. If we had a list of six colors, it could take 63 calls. If we had a list of ten colors, it could take over one thousand calls.

You would hope that we could come up with a solution that does not require many more recursive calls than the elements of the list. And such a solution is possible with a helper. This time, we'll keep track of the brightest color we've seen so far. We'll model the solution on difference, which had the following form:

(define proc
  (lambda (lst)
    (proc-helper (car lst) (cdr lst))))
(define proc-helper
  (lambda (so-far remaining)
    (if (null? remaining)
        so-far
        (proc-helper (combine so-far (car remaining))
                     (cdr remaining)))))

We'll start by plugging in appropriate names.

(define rgb-brightest
  (lambda (colors)
    (rgb-brightest-helper (car colors) (cdr colors))))
(define rgb-brightest-helper
  (lambda (brightest-so-far colors-remaining)
    (if (null? colors-remaining)
        brightest-so-far
        (rgb-brightest-helper
         (combine brightest-so-far (car remaining-colors))
         (cdr remaining-colors)))))

Now, how do we compute a new brightest color so far, given a previous brightest color so far and one additional color? If the previous color is brighter, we keep it. If the new color is brighter, we use it.

(define rgb-brightest
  (lambda (colors)
    (rgb-brightest-helper (car colors) (cdr colors))))
(define rgb-brightest-helper
  (lambda (brightest-so-far colors-remaining)
    (if (null? colors-remaining)
        brightest-so-far
        (rgb-brightest-helper
         (if (>= (rgb-brightness brightest-so-far) 
                 (rgb-brightness (car colors-remaining)))
             brightest-so-far
             (car colors-remaining))
         (cdr remaining-colors)))))

This solution is both correct and efficient, but some folks have trouble reading that conditional in the middle. We can clarify the code by writing a procedure, (rgb-brighter color1 color2), that finds the brighter of two colors. Once we've written rgb-brighter, we can rewrite the helper as

(define rgb-brightest
  (lambda (colors)
    (rgb-brightest-helper (car colors) (cdr colors))))
(define rgb-brightest-helper
  (lambda (brightest-so-far colors-remaining)
    (if (null? colors-remaining)
        brightest-so-far
        (rgb-brightest-helper
         (rgb-brighter brightest-so-far (car colors-remaining))
         (cdr remaining-colors)))))

Now, let's consider what happens when we try to find the brightest color in the list of black, blue, red, green, and white. (We'll use the names in the example. Scheme would use their numeric values.)

   (rgb-brightest (list black blue red green white))
=> (rgb-brightest-helper black (list blue red green white))
=> (rgb-brightest-helper (rgb-brighter black blue) (list red green white))
=> (rgb-brightest-helper blue (list red green white))
=> (rgb-brightest-helper (rgb-brighter blue red) (list green white))
=> (rgb-brightest-helper red (list green white))
=> (rgb-brightest-helper (rgb-brighter red green) (list white))
=> (rgb-brightest-helper green (list white))
=> (rgb-brightest-helper (rgb-brighter green white) (list))
=> (rgb-brightest-helper white (list))
=> white

Not bad. About five recursive calls (plus four calls to rgb-brighter). And, the process looks clear enough that we can be fairly confident that with a list of size ten, this will take only ten recursive calls (plus nine calls to rgb-brighter).

Of course, for this new version to work, we need to have rgb-brighter defined. But that is straightforward. (You may even have it in your library already.)

;;; Procedure:
;;;   rgb-brighter
;;; Parameters:
;;;   color1, an RGB color.
;;;   color2, an RGB color.
;;; Purpose:
;;;   Find the brighter of color1 and color2.
;;; Produces:
;;;   brighter, an RGB color.
;;; Preconditions:
;;;   [No additional]
;;; Postconditions:
;;;   brighter is either color1 or color2
;;;   (rgb-brightness brighter) >= (rgb-brightness color1)
;;;   (rgb-brightness brighter) >= (rgb-brightness color2)
(define rgb-brighter
  (lambda (color1 color2)
    (if (>= (rgb-brightness color1) (rgb-brightness color2))
        color1
        color2)))

Of course, once we've defined rgb-brighter, we might want to revisit the design of our first recursive version. You may recall that we'd gotten to this following stage in our design:

(define rgb-brightest
  (lambda (colors)
    (if (null? (cdr colors))
        (car colors)
        recursive-case)))

Now, what goes in the recursive case? We said “Suppose we compute the brightest color in the remainder of the list ... the brightest color is either the brightest color in the remainder or the first color, whichever is brighter.” We then expressed that decision as a conditional. Now we can express it using rgb-brighter.

(define rgb-brightest
  (lambda (colors)
    (if (null? (cdr colors))
        (car colors)
        (rgb-brighter (car colors)
                      (rgb-brightest (cdr colors))))))

Now, let's trace what this procedure does on the same list.

   (rgb-brightest (list black blue red green white))
=> (rgb-brighter black (rgb-brightest (list blue red green white)))
=> (rgb-brighter black (rgb-brighter blue (rgb-brightest (list red green white))))
=> (rgb-brighter black (rgb-brighter blue (rgb-brighter red (rgb-brightest (list green white)))))
=> (rgb-brighter black (rgb-brighter blue (rgb-brighter red (rgb-brighter green (rgb-brightest (list white))))))
=> (rgb-brighter black (rgb-brighter blue (rgb-brighter red (rgb-brighter green white))))
=> (rgb-brighter black (rgb-brighter blue (rgb-brighter red white)))
=> (rgb-brighter black (rgb-brighter blue white))
=> (rgb-brighter black white)
=> white

In terms of computation, this seems about the same as the second helper version: We did five recursive calls and four calls to rgb-brighter. However, we see that the calls to rgb-brighter get stacked up, which makes it a bit more difficult for the interpreter.

What should you take away from this discussion? A number of things. First, if a recursive procedure has more than one recursive call, it is likely to be absurdly slow. (Later in the semester, we'll see some times in which multiple recursive calls are okay. However, such situations are rare.) Second, as you've already learned, there is often more than one way to solve the same problem. We've seen five or so different solutions to the same problem. Third, once you've written a procedure, it can be useful to step back and think about how you can make it clearer. Finally, standard recursion tends to create a backlog of delayed actions, which are finally performed when the base case is reached. In contrast, tail recursion creates no such backlog.

Creative Commons License

Samuel A. Rebelsky, rebelsky@grinnell.edu

Copyright (c) 2007-8 Janet Davis, Matthew Kluber, and Samuel A. Rebelsky. (Selected materials copyright by John David Stone and Henry Walker and used by permission.)

This material is based upon work partially supported by the National Science Foundation under Grant No. CCLI-0633090. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.