Trees

Summary: In our exploration of pair structures, we noted that you can build structures using cons that are not always lists. We typically call these structures trees. Trees are an important dynamic data structure used in a variety of problems.

In this reading, we explore more about trees, including some terminology for talking about trees, some formal for describing trees, and some techniques for recursing over trees (using what is often called deep recursion).

Number Trees, Revisited

Recall the following structure from the reading on pairs and pair structures.

Computer scientists typically call such a structure a tree (even though it looks more like the roots of a tree than the tree). In this case, because all of the values in the tree are numbers, we call it a number tree or a tree of numbers.

There are a number of important characteristics we associate with trees, including

the root of a tree, which refers to the top of the tree;
the size of a tree, the number of values stored in the tree (sometimes the size also includes the number of pairs);
the depth of a tree, the length of the longest path from the top of the tree to the furthest value; and
the base type of the tree, such as “number” in number trees, which indicates what kind of values the tree is built from. We call trees that have no single base type heterogeneous trees.

At any point in the tree built with a pair, we usually refer to the two parts below it as the left subtree (the car) and the right subtree (the cdr). We refer to each pair in a tree as a node and each non-pair value as a leaf.

Formalizing Trees

We have an informal definition of trees, given by the figure above. Can we define them formally? Yes. Just as we can define lists recursively, so can we define trees recursively. As you may recall, the definition of a list goes something like the following:

The empty list (null) is a list.
If lst is a list and val is any Scheme value, then (cons val lst) is a list.
Nothing else is a list.

We do we call this a “recursive definition”? Because the definition of a list is in terms of lists (at least in the middle item). We can write a similarly recursive definition for trees.

Any value is a tree.
If t1 and t2 are trees, then so is (cons t1 t2).
Nothing else is a tree.

We can also make this definition a little more specific, so that we can speak of things like “color trees”.

Any color is a color tree.
If t1 and t2 are color trees, then so is (cons t1 t2).
Nothing else is a color tree.

Patterns of Tree Recursion

As you should recall from our initial explorations of recursion, there is a traditional pattern for recursion over lists:

(define recursive-proc
  (lambda (lst other)
      (if (null? lst)
          (base-case other)
          (combine other
                   (car lst)
                   (recursive-proc (cdr lst) other)))))

We chose this pattern because of the common definition of a list. Because a list is either null or the cons of a value and a list we have two cases: one for when the list is null and one for the cons case. Since the cdr of a list is itself a list, in makes sense to recurse on the cdr.

A tree, in comparison, is either a value or a pair of trees. Our base case is when we do not have a pair. Since both parts of the pair are trees, we should therefore recurse on both halves, rather than just the cdr. Hence, we define the common pattern for recursion over trees as follows.

(define recursive-proc
  (lambda (tree)
    (if (pair? tree)
        (combine (recursive-proc (car tree))
                 (recursive-proc (cdr tree)))
        (base-case tree))))

For example, if we know that the tree contains only numbers, we can build sum-of-number-tree by using + to combine the recursive calls and simply return tree for the base case. (We'll use ntree instead of tree to remind ourselves that we're working with a tree of numbers.)

;;; Procedure:
;;;   sum-of-number-tree
;;; Parameters:
;;;   ntree, a number tree
;;; Purpose:
;;;   Sums all the numbers in ntree.
;;; Produces:
;;;   sum, a number
;;; Preconditions:
;;;   ntree is a number tree.  That is, it consists only of numbers
;;;   and cons cells.
;;; Postconditions:
;;;   sum is the sum of all numbers in ntree.
(define sum-of-number-tree
  (lambda (ntree)
    (if (pair? ntree)
        (+ (sum-of-number-tree (car ntree))
           (sum-of-number-tree (cdr ntree)))
        ntree)))

We can also use this pattern to find the depth of a tree. In this case, the depth of a tree that contains only one value is 0, and the depth of any other tree is one higher than the depth of its largest subtree.

;;; Procedure:
;;;   tree-depth
;;; Parameters:
;;;   tree, a tree
;;; Purpose:
;;;   Computes the length of the longest path from the start of the tree
;;;   to the furthest value.
;;; Produces:
;;;   dep, a nonnegative integer
;;; Preconditions:
;;;   None.
;;; Postconditions:
;;;   If tree contains no pairs, then dep is 0.
;;;   Otherwise, dep is one more than the depth of the deepest subtree.
(define tree-depth
  (lambda (tree)
    (if (pair? tree)
        (+ 1 (max (tree-depth (car tree))
                  (tree-depth (cdr tree))))
        0)))

For example,

> (tree-depth 1)
0
> (tree-depth (cons 1 2))
1
> (tree-depth (cons 1 (cons 2 3)))
2
> (tree-depth (cons (cons 1 2) 3))
2
> (tree-depth (cons (cons 1 2) (cons 3 4)))
2
> (tree-depth (cons (cons 1 (cons 2 3)) (cons 4 5)))
3

We can also use the pattern to find the number of values in a tree. In this case, if we have a non-tree, there's only one value. Otherwise, we need to combine the number of values in the left and right subtrees.

;;; Procedure:
;;;   tree-size
;;; Parameters:
;;;   tree, a tree
;;; Purpose:
;;;   Count the number of values in tree.
;;; Produces:
;;;   count, a non-negative integer
;;; Preconditions:
;;;   (none)
;;; Postconditions:
;;;   count is the number of values in tree.  (pairs are not considered
;;;   values but null is considered a value).
(define tree-size
  (lambda (tree)
    (if (pair? tree)
        (+ (tree-size (car tree))
           (tree-size (cdr tree)))
        1)))

For example,

> (tree-size 1)
1
> (tree-size (cons 1 2))
2
> (tree-size (cons 1 (cons 2 3)))
3
> (tree-size (cons (cons 1 2) 3))
3
> (tree-size (cons (cons 1 2) (cons 3 4)))
4

We can even use this pattern to “flatten” the tree into a list with the same values, in the order they appear from left-to-right. In this case, we turn non-paired values into lists, and append the results of flattening subtrees.

;;; Procedure:
;;;   tree-flatten
;;; Parameters:
;;;   tree, a tree
;;; Purpose:
;;;   Convert a tree structure into a similar list structure.
;;; Produces:
;;;   lst, a list
;;; Preconditions:
;;;   none
;;; Postconditions:
;;;   lst is a simple list (that is, it contains no sublists).
;;;   Each value that appears in lst appears in tree.
;;;   Each value that appears in tree appears in lst.
;;;   If a precedes b in tree, then a precedes b in lst.
(define tree-flatten
  (lambda (tree)
    (if (pair? tree)
        (append (tree-flatten (car tree))
                (tree-flatten (cdr tree)))
        (list tree))))

For example,

> (tree-flatten 1)
(1)
> (tree-flatten (cons 1 2))
(1 2)
> (tree-flatten (cons 1 (cons 2 3)))
(1 2 3)
> (tree-flatten (cons (cons 1 2) (cons 3 4)))
(1 2 3 4)
> (tree-flatten (cons (cons 1 (cons 2 (cons 3 4))) (cons 5 (cons 6 7))))
(1 2 3 4 5 6 7)

Trees vs. Nested Lists

As you may recall, a simple list (such as a list of numbers) is simply a tree in which all the left subtrees (the car elements) are values and the final right subtree is null.

When we nest lists (that is, make lists elements of lists), we build structures that are very much like trees, except that we maintain the limitation that the rightmost subtree at every stage must be null.

For example, consider the list (1 (2 3) (4 (5) (6 7)) (8 9)). It has four elements (the 1, the (2 3), the (4 (5) (6 7)) and the (8 9)), all but the first of which are themselves lists. We might consider it a number tree, except for the problem that the non-pair values include not just numbers, but also a variety of nulls: at the end of the top-level list, at the end of the list (2 3), at the end of the next list and each of its sublists, and so on and so forth. Hence, if we try to apply the sum-of-number-tree procedure, it will try to add these null values and therefore stop with an error. We get similar problems when we try to use procedures like tree-flatten and tree-size.

> (define example (list 1 (list 2 3) (list 4 (list 5) (list 6 7)) (list 8 9)))
> example
(1 (2 3) (4 (5) (6 7)) (8 9))
> (tree-flatten example)
(1 2 3 () 4 5 () 6 7 () () 8 9 () ())
> (tree-size example)
15

What do we do? We use a similar technique for nested list structures that we use for trees, except that we incorporate the chance that a value is null. For the case of sum, when we hit null, we return 0.

(define sum-of-nested-number-lists
  (lambda (lst)
    (cond
      ((null? lst) 0)
      ((pair? lst)
       (+ (sum-of-nested-number-lists (car lst))
          (sum-of-nested-number-lists (cdr lst))))
      (else lst))))

Here are some examples that contrast the behavior of sum-of-nested-number-lists with sum-of-number-tree and the old sum procedure we wrote when first exploring lists.

> (sum 1)
car: expects argument of type <pair>; given 1
> (sum-of-number-tree 1)
1
> (sum-of-nested-number-lists 1)
1
> (sum (list 1 2 3))
6
> (sum-of-number-tree (list 1 2 3))
+: expects type <number> as 2nd argument, given: (); other arguments were: 3
> (sum-of-nested-number-lists (list 1 2 3))
6
> (sum (list (list 1 2 3)))
+: expects type <number> as 1st argument, given: (1 2 3); other arguments were: 0
> (sum-of-nested-number-lists (list (list 1 2 3)))
6
> (define example (list 1 (list 2 3) (list 4 (list 5) (list 6 7))))
> example
(1 (2 3) (4 (5) (6 7)))
> (sum example)
+: expects type <number> as 1st argument, given: (4 (5) (6 7)); other arguments were: 0
> (sum-of-number-tree example)
+: expects type <number> as 2nd argument, given: (); other arguments were: 3
> (sum-of-nested-number-lists example)
28

Using this particular procedure as an example, we can also suggest a typical form for procedures that recurse over nested lists.

(define recursive-proc
  (lambda (val)
    (cond
      ((pair? val)
       (combine (recursive-proc (car val))
                (recursive-proc (cdr val))))
      ((null? val) null-case)
      (else (base-case val)))))

We might use this form to tally the number of times a colors appears in a nested list structure (this is much like our count of the number of values in a tree, except we also check the type of the values we find). In this case, we return 0 for the null case and 1 or 0 for the base case (depending on whether val is a color or not).

;;; Procedure:
;;;   tally-colors
;;; Parameters:
;;;   lst, a nested list structure
;;; Purpose: 
;;;   Count the number of colors that appear in the structure.
;;; Produces:
;;;   color-tally, a nonnegative integer
;;; Preconditions:
;;;   (none)
;;; Postconditions:
;;;   color-tally is the number of colors at all levels of lst.
(define tally-colors
  (lambda (lst)
    (cond
      ((pair? lst)
       (+ (tally-colors (car lst))
          (tally-colors (cdr lst))))
      ((null? lst) 0)
      ((rgb? lst) 1)
      (else 0))))

We can also tally the number of times a particular color appears by adding another parameter (the color) and changing the extra test.

;;; Procedure:
;;;   tally-color
;;; Parameters:
;;;   lst, a nested list structure
;;;   color, an RGB color
;;; Purpose:
;;;   Count the number of times the given color appears in the structure.
;;; Produces:
;;;   color-tally, a nonnegative integer
;;; Preconditions:
;;;   (none)
;;; Postconditions:
;;;   color-tally is the number of occurences of color at all levels of lst.
(define tally-color
  (lambda (lst color)
    (cond
      ((pair? lst)
       (+ (tally-symbol (car lst) color)
          (tally-symbol (cdr lst) color)))
      ((null? lst) 0)
      ((equal? lst color) 1)
      (else 0))))

And, as these examples suggest, we can write a better version of flatten that works for both trees and nested lists.

(define flatten
  (lambda (tree)
    (cond
      ((pair? tree) 
       (append (flatten (car tree))
               (flatten (cdr tree))))
      ((null? tree) null)
      (else (list tree)))))

As the following examples suggest, flatten works for trees, nested lists, and even simple values.

> (define example (list 1 (list 2 3) (list 4 (list 5) (list 6 7)) (list 8 9)))
> example
(1 (2 3) (4 (5) (6 7)) (8 9))
> (tree-flatten example)
(1 2 3 () 4 5 () 6 7 () () 8 9 () ())
> (flatten example)
(1 2 3 4 5 6 7 8 9)
> (tree-flatten (quote a))
(a)
> (flatten (quote a))
(a)
> (tree-flatten null)
(())
> (flatten null)
()

An Image Application: Color Trees

Are trees useful for drawings? Certainly. For one thing, it can be fun to simply draw trees. However, trees are also used to represent images. For example, we might break a large image up into smaller rectangular sections, and describe it as a tree of sections. Why sections? Because we might be able to choose a different palette for each section, or perhaps even choose a single color for a section. These trees are typically called quad trees and are beyond the initial scope of this course.

Don't despair! We can still draw some fairly interesting images using color trees. Here's one approach: Given a region of an image in which to draw, and a color tree to draw in that region, either (a) fill the region with a single color, if the tree represents a single color or (b) split the region in half, and draw each half of the subtree in half of the image.

We might encode that strategy as follows.

;;; Procedure:
;;;   render-color-tree!
;;; Parameters:
;;;   ctree, a tree of colors
;;;   image, an image 
;;;   left, an integer
;;;   top, an integer
;;;   width, an integer
;;;   height, an integer
;;; Purpose:
;;;   Render the tree into the portion of the image bounded at
;;;   the left by left, at the top by top, and with the specified
;;;   width and height.
;;; Produces:
;;;   [Nothing; Called for the side effect.]
;;; Preconditions:
;;;   [The usual]
;;; Postconditions:
;;;   The tree has now been rendered.
(define render-color-tree!
  (lambda (ctree image left top width height)
    (cond
      ((pair? ctree)
       (render-color-tree! (car ctree) image 
                                       left top 
                                       (/ width 2) height)
       (render-color-tree! (cdr ctree) image
                                       (+ left (/ width 2)) top
                                       (/ width 2) height))
      (else
       (image-select-rectangle! image REPLACE
                                left top width height)
       (context-set-fgcolor! ctree)
       (image-fill-selection! image)
       (context-update-displays!)
       (image-select-nothing! image)))))

You will have an opportunity to explore this procedure (and some variations) in the corresponding laboratory. You may also have the opportunity for further explorations in the near future.