Keeping Track of Tasks with Linear Structures

Summary: We consider a few related data structures that are frequently used to collect values and provide value-by-value access to the collection. Together, these data structures are often called linear structures. Individually, they are called queues, stacks, and priority queues. We also consider a common application of these structures: keep track of tasks left to do.

Introduction: Deep Recursion, Revisited

As you may recall, one of the topics we've been considering recently deep recursion, a type of recursion in which each procedure call may make multiple recursive calls to itself. (In contrast, traditional linear recursion, each procedure call makes at most one direct recursive call.) Deep recursion is particularly useful for processing trees, since we can recurse on both the left and right subtrees. For example, we might compute the size of a tree with the following procedure:

;;; Procedure:
;;;   tree.size
;;; Parameters:
;;;   tree, a tree
;;; Purpose:
;;;   Count the number of values in tree.
;;; Produces:
;;;   count, a non-negative integer
;;; Preconditions:
;;;   (none)
;;; Postconditions:
;;;   count is the number of values in tree.  (pairs are not considered
;;;   values but null is considered a value).
(define tree.size
  (lambda (tree)
    (if (pair? tree)
        (+ (tree.size (car tree))
           (tree.size (cdr tree)))
        1)))

In this case, there are either zero or two recursive calls per node.

We've also used deep recursion to generate fractals, as in the fractal-rectangle procedure. A simplified version of that procedure follows.

(define simple-fractal-rectangle!
  (lambda (image color left top right bottom level)
    (cond
      ((= level 0)
       (envt.set-fgcolor! color)
       (image.select-rectangle! image selection.replace
                                left top 
                                (- right left)
                                (- bottom top))
       (image.fill! image)
       (image.select-nothing! image)
       (envt.update-displays!))
      (else
       (let* ((midcol (round (* 0.5 (+ left right))))
              (midrow (round (* 0.5 (+ bottom top)))))
         (simple-fractal-rectangle! image
                                    color
                                    left top midcol midrow
                                    (- level 1))
         (simple-fractal-rectangle! image
                                    color
                                    midcol top right midrow
                                    (- level 1))
         (simple-fractal-rectangle! image
                                    color
                                    left midrow midcol bottom
                                    (- level 1))
         (simple-fractal-rectangle! image
                                    color
                                    midcol midrow right bottom
                                    (- level 1))
         )))))

In this case, there are either zero or four recursive calls per procedure call.

One of the issues associated with deep-recursive procedures, like fractal-rectangle, is that there are lot of hidden delayed procedure calls built up somewhere. For example, while we're in the middle of drawing the upper-left corner of a rectangle, we have some recursive calls built up for the rest of that smaller rectangle and some recursive calls built up for the rest of the rest of the bigger rectangle (and, if we have multiple levels of fractal recursion, even more recursive calls built up for intermediate rectangles). However, programmers do not have access to these recursive calls - once they've been set up, they will happen at some time determined by the computer (and by the structure of the program.

Because the programmer does not have access to those delayed calls, it is not possible to modify them (say, to stop drawing midway through or to de-prioritize some activities) or even to check what is happening. As an alternate, some programmers make “the remaining things to do” an explicit parameter to the procedure, rather than implicit from the call structure of the procedure.

In effect, the programmer needs a structure that represents collections of tasks left do. Instead of making lots of recursive calls, a procedure will simply add tasks to the collection. When done with one call, the program will grab the next task and continue. We're done when there are no tasks left. Collections that provide the operations of “add an element”, “get the next element”, and “remove that element” are called linear structures. Interestingly, the policy by which we get the next elements (either to use or remove) can have a significant effect on the behavior of the procedure. We'll explore that behavior a bit in this reading and the corresponding lab.

Primary Procedures

In many ways, these linear structures, used to collect tasks, are like the to-do lists that many of us maintain: We add tasks, we remove tasks as we do them, we check if we have tasks left. The big difference is that we'll be a little more structured in the access we provide to tasks. Let's first describe what we want the procedures to do and then come back to the implementation a bit later.

One of our first decisions in thinking about collections of tasks is whether they should be mutable (as is the case with vectors) or whether we build new collections from old, by adding or removing elements (as is the case with lists). The common custom in Scheme programming is to avoid mutation as much as possible, so we will make collections non-mutable.

Of course, that decision has some implications for how we remove elements. On paper, we might simultaneously find and remove the next task. In code, it's useful to separate those activities. Why? Well, consider what happens if we grab and remove a tasks. In effect, we'll need that procedure to return two values: the next task and the modified collection of tasks. Although it is possible to have a procedure return to values, it is simpler to separate the activities.

Given those decision, there are five basic operations we can use for manipulating collections of tasks: create a new collection of tasks, add an element to the collection, get the next element to process, remove that element, and check whether or not the collection contains any more tasks to process.

Since these collections have arbitrary size, the procedure to create new collections needs no parameters.

;;; Procedure: 
;;;   collection.new
;;; Parameters: 
;;;   (none)
;;; Purpose:
;;;   Create a new linear collection of values
;;; Produces:
;;;   values, a linear collection of values
;;; Preconditions:
;;;   [None]
;;; Postconditions:
;;;   values is eligible for use by the collection.get, collection.drop,
;;;     collection.add, and collection.empty? procedures.

To get the next element from the collection, we need only the collection.

;;; Procedure:
;;;   collection.get
;;; Parameters:
;;;   stuff, a collection
;;; Purpose:
;;;   Gets the next value in stuff.
;;; Produces:
;;;   val, a Scheme valule
;;; Preconditions:
;;;   stuff was created by starting with a new collection (with
;;;     collection.new) and then modifying it with collection.add
;;;     and collection.drop.
;;;   stuff is nonempty.
;;; Postconditions:
;;;   val was previously added to this collection with collection.add
;;;   val has not been removed from the collection with collection.add

Note that different collections may have different policies for getting a value, so we do not specify the policy in this documentation. Why might different collections have different policies? Well, consider a typical task list. Some people do tasks in the same order in which they are added. Others associate priorities with their tasks. Others choose tasks randomly. Still others have more complex policies. We'll revisit those policies in a subsquent section. For now, we simply leave the particular decision of what element to get unspecified. We'll leave the same issue unspecified in collection.drop.

;;; Procedure:
;;;   collection.drop
;;; Parameters:
;;;   stuff, a collection
;;; Purpose:
;;;   Drops the next value in stuff.
;;; Produces:
;;;   newstuff, a collection
;;; Preconditions:
;;;   stuff was created by starting with a new collection (with
;;;     collection.new) and then modifying it with collection.add
;;;     and collection.drop.
;;;   stuff is nonempty.
;;; Postconditions:
;;;   newstuff contains one element less than stuff.
;;;   The element that was lost is the one that would have been returned 
;;;     by (collection.get stuff)

Since some people prioritize their tasks, we'll include a priority in the collection.add procedure. In many cases, we'll ignore that priority. However, it's easiest to include it for the few cases in which we need it.

;;; Procedure:
;;;   collection.add
;;; Parameters:
;;;   stuff, a collection of values
;;;   val, the value to add
;;;   priority, the numeric priority for val
;;;    [ignored in some collections]
;;; Purpose:
;;;   Add val to stuff.
;;; Produces:
;;;   newstuff, a collection of values.
;;; Preconditions:
;;;   collection is a valid collection.
;;; Postconditions:
;;;   newstuff contains one more copy of val than stuff.
;;;   For every value x not equal to val, newstuff contains the
;;;     same number of copies as stuff.

Finally, we'll need a way to make sure that there are tasks left to process.

;;; Procedure:
;;;   collection.empty?
;;; Parameters:
;;;   stuff, a collection
;;; Purpose:
;;;   Determines whether or not stuff is empty.
;;; Produces:
;;;   empty?, a Boolean
;;; Preconditions:
;;;   stuff is a valid collection.
;;; Postconditions:
;;;   If stuff was built by equal numbers of calls to 
;;;     collection.add and collection.drop, then empty? is true.
;;;   If stuff was built by more calls to collection.add than 
;;;     calls to collection.drop, then empty? is false.
;;;   Otherwise, stuff is invalid.

Using Linear Structures

Let's consider how we could use these techniques to make the collected recursive calls in simple-fractal-rectangle! explicit, rather than implicit. We'll explicitly keep track of a collection of the sub-rectangles left to draw. We can represent each sub-rectangle left to draw. draw as a six-element vector (color, left, top, right, bottom, level).

Now, what do we do with that collection? We repeatedly grab the first rectangle and process it as appropriate (sometimes adding more rectangles to draw). For clarity, we'll use separate processes to do the repetition (kernel) and the processing of individual rectangles (process-rect). The kernel is relatively straightforward.

  (letrec ((kernel
             (lambda (tasks)
               (if (collection.empty? tasks)
                   'Success
                   (kernel (process-rect (collection.get tasks)
                                         (collection.drop tasks)))))))
    ...)

What should process-rect look like? For the base case (level is 0), we just draw the rectangle. For the recursive case, we need to build four tasks (well, descriptions of rectangles) and add each to the collection.

                  (let ((midcol (round (* 0.5 (+ right left))))
                        (midrow (round (* 0.5 (+ bottom top))))
                        (nextlevel (- level 1)))
                    (let* ((rect1 (vector (rgb.lighter color)
                                          left top midcol midrow (- level 1)))
                           (rect2 (vector (rgb.darker color)
                                          midcol top right midrow (- level 1)))
                           (rect3 (vector (rgb.darker color)
                                          left midrow midcol bottom (- level 1)))
                           (rect4 (vector color
                                          midcol midrow right bottom (- level 1)))
                           (c1 (collection.add remaining-tasks rect1 0))
                           (c2 (collection.add c1 rect2 0))
                           (c3 (collection.add c2 rect3 0))
                           (c4 (collection.add c3 rect4 0)))

All that's left is to decide what rectangle the collection should start with. That's easy enough: We start with a collection that contains the whole rectangle. Putting it all together, we get

(define simple-fractal-rectangle!
  (lambda (image color left top right bottom level)
    (letrec
      ; Process the task at the front of the collection,
      ; returning the modified collection.
      ((process-rect
         (lambda (rect remaining-tasks)
             ; Note: Each task is a vector of the form
             ;   #(color left top right bottom level)
             (let ((color (vector-ref rect 0))
                   (left (vector-ref rect 1))
                   (top (vector-ref rect 2))
                   (right (vector-ref rect 3))
                   (bottom (vector-ref rect 4))
                   (level (vector-ref rect 5)))
               ; Draw the rectangle (we may draw over it)
               (envt.set-fgcolor! color)
               (image.select-rectangle! image selection.replace
                                        left top
                                        (- right left)
                                        (- bottom top))
               (image.fill! image)
               (image.select-nothing! image)
               (envt.update-displays!)
               (cond
                 ; Base case: We're done, so return the remaining tasks
                 ((= level 0)
                  remaining-tasks)
                 ; Otherwise, add the remaining tasks.
                 (else
                  (let ((midcol (round (* 0.5 (+ left right))))
                        (midrow (round (* 0.5 (+ bottom top))))
                        (nextlevel (- level 1)))
                    (display (list 'midcol midcol 'midrow midrow)) (newline)
                    (let* ((rect1 (vector (rgb.lighter color)
                                          left top midcol midrow (- level 1)))
                           (rect2 (vector (rgb.darker color)
                                          midcol top right midrow (- level 1)))
                           (rect3 (vector (rgb.darker color)
                                          left midrow midcol bottom (- level 1)))
                           (rect4 (vector color
                                          midcol midrow right bottom (- level 1)))
                           (c1 (collection.add remaining-tasks rect1 0))
                           (c2 (collection.add c1 rect2 0))
                           (c3 (collection.add c2 rect3 0))
                           (c4 (collection.add c3 rect4 0)))
                        c4)))))))
       (kernel
         (lambda (tasks)
           ; Uncomment the following line to keep track of the tasks
           ; left to do.
           (display tasks) (newline)
           (if (collection.empty? tasks)
               'Success
               (kernel (process-rect (collection.get tasks)
                                     (collection.drop tasks)))))))
      (kernel (collection.add (collection.new)
                              (vector color left top right bottom level)
                              0)))))

You'll explore this version of the procedure in the lab.

Policies for Getting Values

As we mentioned earlier, there are many different policies that can be used for selecting the next value to process. One of the most common policies is first-in, first-out (FIFO). In this policy, the value returned by collection.get is the first of the remaining values. That is, if we add A to the collection before B, we process A before B. Linear structures that implement a FIFO policy are typically called queues

Another common strategy is to assign a numeric priority to each value, and to remove values according to their priority. (Lower numbers represent higher priorities.) Linear structures that implement a priority policiy are typically called priority queues.

You'll note that neither of these is the strategy that Scheme seems to use when processing procedures. That is, if you've set up six recursive calls, and the first of those recursive calls does four more recursive calls, those newer calls are done before the other calls, rather than after (as in FIFO). There are also strategies that humans use that don't match the other two. For example, some people always put the mail they've just received on the top of their inbox, and process their inbox from top to bottom. In both cases, the strategy seems to be much more one of last in, first out (LIFO). Linear structures that implement a LIFO policy are typically called stacks

Queues, priority queues, and stacks are the three most common linear structures. We implement each in different ways.

Implementing Linear Structures as Stacks

We'll consider only stacks, and only one implementation of stacks. When implementing these kinds of collections, we might use lists, trees, or vectors. Since we're building non-mutable structures, and accessing the values in those structures one at a time, it seems that lists are the easiest possible underlying structure.

A new collection is simply the empty list. To add something to the collection, we cons it onto the front (ignoring the priority. The next thing to process is therefore the car. Similarly, to delete something, we just take the cdr. And empty lists represent empty collections. Here's the code that puts all of those ideas together.

(define stack.new
  (lambda ()
    null))

(define stack.get
  (lambda (stuff)
    (if (null? stuff)
        (throw "Cannot get values from empty stacks.")
        (car stuff))))

(define stack.drop
  (lambda (stuff)
    (if (null? stuff)
        (throw "Cannot remove values from empty stacks.")
        (cdr stuff))))

(define stack.add
  (lambda (stuff val prio)
    (cons val stuff)))

(define stack.empty?
  (lambda (stuff)
    (null? stuff)))

Selecting an Implementation

The implementations of queues and priority queues is a bit more complex, so we will leave them until later. Suppose we've implemented stacks, queues, and priority queues. How do we decide which our program uses? If a procedure requires a particular kind of linear structure, we use those names (e.g., stack.add) rather than the generic names (e.g., collection.add). If we want to use the generic names, so that we can experiment with the strengths and weaknesses of various implementations, we can simply tell Scheme to treat things as equivalent. For example, if we want to use the stack implementation, we might write

(define collection.new stack.new)
(define collection.get stack.get)
(define collection.drop stack.drop)
(define collection.add stack.add)
(define collection.empty? stack.empty?)

In the lab, you'll use this technique to explore each kind of linear structure.