# Class 20: Predictive Parsing (3)

Back to Pause for Breath. On to Shift-Reduce Parsing (1).

This outline is also available in PDF.

Held: Monday, 10 October 2011

Summary: Today we conclude our exploration of predictive parsing by considering the construction of the primary tables and functions.

Related Pages:

Notes:

• EC for the Thursday extra, by a visitor from UNL.
• EC for the Thursday's panel on the future of the Book, 4:15 in the gallery.
• If you have not done so already, start reading section 4.5.

Overview:

• Using Predictive Parsing Tables.
• Potential Problems with Predictive Parsing.
• Reflecting on Predictive Parsing.

• What we did last class:
```For all rules of the form m := Stuff n MoreStuff
Follow[n] := Follow[n] union first(MoreStuff)
```
• Our leftover question:
• Consider the rule `m := Stuff n`
• What does this tell us about Follow[m] or Follow[n]?
• A solution: Anything that follows the nonterminal on the left can follow the nonterminal on the right.
• That is, for rules of this form, we should say
```  Follow[n] := Follow[n] union Follow[m]
```
• Sketch of proof:
```something -> ...
-> BeforeM m AfterM
-> BeforeM Stuff n AfterM
```
• So, anthying that comes after m can come after m.
• More generally, what if we the stuff that follows m is nullable?
```m := Stuff n MoreStuff
// nullable (MoreStuff) -gt; TRUE
// Stuff and MoreStuff are sequences of nonterminals and
//      terminals
```
• A solution: Anything that follows the nonterminal on the left can follow the nonterminal on the right.
• Sketch of proof:
```something -> ...
-> BeforeM m AfterM
-> BeforeM Stuff n MoreStuff AfterM
-> BeforeM Stuff n AfterM
```

## Building the Predictive Parsing Table

• We're going to build a table that tells you, for every [nonterminal,terminal] pair, what rule to apply next.
• If a table cell is empty, it's an error.
• If a table cell contains a RHS, we match each element of the rhs side.
• If a table cell contains epsilon, we reduce immediately to epsilon
• Recall that the predictive parsing table tells you, for every nonterminal/terminal pair, what rule to apply next.
```for each production, pi of the form n ::= rhs
for each terminal, T, in first(rhs)
set Parse[n,T] to pi
if nullable(rhs)
for each terminal, T, in Follow[n]
set Parse[n,T] to pi
If you ever try to set an already-set position
report an error
```

## Some Grammars that Cause Problems

• As you may have noted in your own work (or your analysis of the code above), some grammars seem to be unamenable to predictive parsing.
• In particular, grammars in which two RHS's can begin with the same token cannot be predictively parsed.
```conditional : _IF test _THEN statement
| _IF test _THEN statement _ELSE statement
;
```
• This problem can usually be solved by left factoring the grammar:
• Identify the common initial sequence.
• Rewrite the grammar so that there is only one rule for the nonterminal that uses that common initial sequence which is then followed by a new nonterminal.
• That new nonterminal then represents all possible remainders.
```conditional : _IF test _THEN statement optional_else
;
optional_else : /* epsilon */
| _ELSE statement
;
```
• It turns out that left recursive grammars (grammars in which a nonterminal can derive a sequence that starts with that nonterminal) also cannot be predictively parsed.
```lines : lines line
| line
;
```
• Whatever starts a line can start a lines.
• So which do we choose?
• This problem can be fixed by making the grammar right recursive.
```lines : line lines
| line
;
```
• Overlapping, so
```lines : line morelines
;
morelines : lines
| /* epsilon */
;
```
• Warning! We won't be doing this with Yacc/Bison, since Yacc/Bison use a different approach.

## Predictive Problems, Revisited

• Many grammars are not amenable to predictive parsing.
• Some are for languages not amenable to predictive parsing.
• Some must be mangled to permit predictive parsing.
• Neither situation is useful.
• With careful programming, we can handle use grammars to generate unmangled parse trees.
• With more lookahead, we can often do better (at keeping the original grammar and at accepting a greater range of languages).
• However, it is also instructive to look for other techniques, which we will start in our next class.
• Question: What do you see as the potential difficulties in writing a predictive parser for Pascal?

## Looking Ahead to Shift-Reduce Parsing

• The main problem we have right now is that when we see a symbol, we might not be sure which rule applies until we read some more stuff.
• And we might need to read a lot more stuff.
• Predictive parsing insists that we make the decision immediately upon seeing a symbol.
• An alternative strategy, shift-reduce parsing, lets us delay reduction.
• In shift-reduce parsing, we keep a stack of unprocessed stuff.
• The stuff may include terminals that we've put aside until we know which rule the participate in.
• The stuff may also include nonterminals. And, again, we may not know what rule they participate in.
• When we see a token, we have two basic choices:
• We can shift it onto the stack, noting that we are not yet sure what role it serves.
• We can reduce what is on top of the stack, applying a production from right to left. (That is, converting a sequence of terminals and nonterminals into a nonterminal.)

Back to Pause for Breath. On to Shift-Reduce Parsing (1).

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Wed Dec 7 10:26:33 2011.
The source to the document was last modified on Fri Aug 26 13:03:12 2011.
This document may be found at `http://www.cs.grinnell.edu/~rebelsky/Courses/CS362/2011F/Outlines/outline.20.html`.

Samuel A. Rebelsky, rebelsky@grinnell.edu