Mini-Project 5: Matching, binding, and evaluation

Summary
In this mini-project, you will reflect on pattern matching, developing your own simplified pattern matching procedure.
Collaboration
Each student should submit their own set of solutions to this project. You may certainly consult others in developing the project, although you should cite or acknowledge them if you have done so.

Congratulations! You get to try a brand new assignment (based on an earlier assignment).

For this assignment, store your code in the file match-bind-eval.rkt.

Please do not include any of the tests we’ve written, but feel free to include your own. (You should, of course, run any of the tests we’ve written. But elide them afterwards.)


As you have seen in your explorations of the match keyword, being able to compare (match) one thing to another can prove very helpful in program design. It can also be helpful in analyzing texts and other structures. But how does matching work, particularly when it involves bindings? There are a number of strategies. We will consider some in this assignment, in addition to the use of bindings in evaluating expressions.

Patterns and bindings

What is a “pattern”? No, it’s not a regular expression, at least not in this context. For this assignment, a pattern is a structured value (that is, a list of values) that we will compare to other structured values. How does a pattern differ from other lists? It may include symbols, which, in this assignment, we will not permit in “normal” values.

For example, '(a 2 b) is a pattern that matches lists of three elements, the middle of which is 2. That pattern would match '(1 2 3) or even '("one" 2 "three"). For this assignment, the only non-list values we will permit are strings and integers.

What is a “binding”? For this assignment, a binding is a two-element list, the first of which is a symbol and the second of which is a value (a string, integer, or list). For example, '(a 1) represents the binding of the value 1 to a.

When pattern matching, we generally create “binding lists”, which are what they sound like: a list of individual bindings. For example, when matching the pattern '(a 2 b) and the value '(1 2 3), we create the binding list '((a 1) (b 3)) or the binding list ((b 3) (a 1)). The order should not matter. When matching the pattern '(1 2 3) and the value '(1 2 3), we create the binding list '(), since there are no variables to bind.

What happens when the pattern and value don’t match, such as when the lists are different lengths or individual elements don’t match? For example, what happens when we try to match the pattern '(a 2 b) and the value '(1), or the pattern '(a 2 b) and the value '(1 1 1)? In these kinds of cases, we should indicate an invalid binding with the value #f.

Part one: Evaluating basic expressions

As you may recall from our mental model of Scheme, define statements effectively create a table of bindings. Our binding lists can be used as one of those tables.

Document, write tests for, and implement a procedure, (simple-evaluate-base expression binding-list), which takes a simple expression and evaluates it according to the binding list. For this part, you should assume that the expression is either (a) an integer, in which case you should just return the integer, or (b) a symbol, in which case you should look it up in the binding list and return its value.

Document, write tests for, and implement a procedure, (simple-evaluate-list expression binding-list), which takes a list of the form (op e1 e2 ...) and evaluates each of the subexpressions (each of which is either a symbol or an integer) and then applies the operation, which is either mul (for multiplication) or add (for addition).

Document, write tests for, and implement a procedure, (simple-evaluate expression binding-list), which combines the behavior of (simple-evaluate-base expression binding-list) and (simple-evaluate-list expression binding-list). That is, it should accept and integer, a symbol, or a list of integers and symbols prefaced by an operation, and follow the evaluation described above.

> (simple-evaluate-base 'b '((a 2) (b 1)))
1
> (simple-evaluate-base 5 '((a 2) (b 1)))
5
> (simple-evaluate-list '(add b 5) '((a 2) (b 1)))
6
> (simple-evaluate-list '(mul a a a) '((a 2) (b 1)))
8
> (simple-evaluate-list '(add 1 a 2 b 3 c 4 d) '((c 5) (a 6) (b 7) (d 8)))
36
> (simple-evaluate 'b '((a 2) (b 1)))
1
> (simple-evaluate 5 '((a 2) (b 1)))
5
> (simple-evaluate '(add b 5) '((a 2) (b 1)))
6
> (simple-evaluate '(mul a a a) '((a 2) (b 1))
8
> (simple-evaluate '(add 1 a 2 b 3 c 4 d) '((c 5) (a 6) (b 7) (d 8)))
36

Your procedures can crash and burn if they encounter an invalid symbol or inputs other than those that are specified above.

Part two: Matching and binding in simple lists

Document, write tests for, and implement a procedure, (basic-match-list pattern value), that matches a pattern against a value. You may assume that the pattern is a list that contains only integers, strings, and symbols and that the value is a list that contains only strings and integers. You may also assume that no symbol appears more than once in the pattern. You should return false (#f) if the pattern and value do not match.

Part three: Better matching

Document, write tests for, and implement a procedure, (match-list pattern value), that behaves much like basic-match-list except that it permits patterns in which the same symbol can appear multiple times. In that case, you should verify that the two instances of the symbol match the same value. For example, '(b 1 b) matches '(2 1 2), but not '(2 1 3).

Part four: Evaluating nested expressions.

Document, write tests for, and implement a procedure, (evaluate expression binding-list) that takes an expression and a binding list as parameters and evaluates the expression using the binding list. In this case, the expression is either (a) an integer, (b) a symbol, or (c) a list of the form (op e1 e2 ...) where each subexpression is itself an integer, a symbol, or an expression (which may itself contain other expressions).

> (evaluate 'b
            '((a 2) (b 3)))
3
> (evaluate '(add a a)
            '((a 2) (b 3)))
4
> (evaluate '(mul (add a a) 5)
            '((a 2) (b 3)))
20
> (evaluate '(add b b)
            '((a 2) (b 3)))
6
> (evaluate '(add b (add b b))
            '((a 2) (b 3)))
9
> (evaluate '(add b (add b (add b b)) b)
            '((a 2) (b 3)))
15
> (evaluate '(add (add b (add b (add b b)) b) b)
            '((a 2) (b 3)))
18
> (evaluate '(mul (mul (add a a) 5) (add (add b (add b (add b b)) b) b))
            '((a 2) (b 3)))
360
> (evaluate '(mul (mul (add a a) 5) (add (add b (add b (add b b)) b) b))
            (match-list '(a b) '(2 3)))
360

Part five: Generalized matching

This part is necessary to earn an E but is *not* necessary for an M.

A lot of pattern matching happens in the context of so-called “hierarchical expressions” or “nested expressions”. You’ve already seen a lot of nesting in Racket, most frequently of expressions within expressions (within expressions). It turns out that we often think of texts that way: words are nested in sentences, sentences in paragraphs, paragraphs, in sections, sections in chapters, and so on and so forth.

For convenience, we are going to stick to the world of Racket and the kind of nesting that happens in Racket. We’ll limit our world of Racket to strings, integers, and lists. And our goal will be to match match Racket expressions to patterns that also involve strings, integers, and lists. However, patterns will also include symbols that represent variables.

Here are the basic rules:

  • A symbol pattern matches anything.
  • A constant pattern (integer or string) matches the same constant.
  • A list pattern matches a list value only if
    • The lengths of the two lists are the same, and
    • Each element in the pattern list matches the corresponding element of the value list.

Document, write tests for, and implement a procedure, (match-pattern pattern val), that matches a pattern against a value, returning a list of bindings. In this case, both the pattern and the value can be lists.

> (match-pattern '(q (a) ((b c) 2)) '((1) ("hello") ((1 1) 2)))
'((q (1)) (a "hello") (b 1) (c 1))
> (match-pattern '(q (a) ((b c) 2)) '((1) ("hello") ((1 1) 3)))
#f
> (match-pattern '(q (a) ((b c) 2)) '((1) "hello" ((1 1) 2)))
#f

You may assume that no symbol appears more than once in the pattern.

Rubric

Necessary for R

[ ] Includes `match-bind-eval.rkt`.
[ ] Runs when **Run** is clicked.
[ ] Code file includes header.
[ ] `simple-evaluate-base`, `simple-evaluate-list`, `simple-evaluate`, 
    and `basic-match-list` are included and documented.
[ ] `simple-evaluate` and `basic-match-list` work correctly.

Necessary for M

[ ] `match-list` is documented and included.
[ ] `match-list` works correctly.
[ ] `evaluate` is documented and included.
[ ] `evaluate` works correctly.
[ ] At least two base-case tests for each of the four primary procedures.
    (`simple-evaluate`, `basic-match-list`, `match-list`, and `evaluate`)
[ ] At least two edge-case tests for each of the four primary procedures.
    (`simple-evaluate`, `basic-match-list`, `match-list`, and `evaluate`)
[ ] Appropriate style (naming, indentation, etc.)
[ ] Avoids obvious repeated work.

Necessary for E

[ ] `match-pattern` is implemented and works correctly.
[ ] Documentation for all procedures, including helpers.
[ ] Tests for all procedures, including helpers.
[ ] Avoid duplication of expensive work in recursive calls, such as 
    checking the list length in each call.