CSC362 2004S, Class 17: Predictive Parsing (4): Wrapup
Admin:
* The late list returns: Sam, Piper (Grrr!!!), Ananta, Jonathan (???)
* Questions on your parsers?
* Does Sam have an attendace policy?
* Yes! Attend or else. (Please email me when you can't make it to class.)
* Wednesday's "Talk" at 4:30: A CS video: E.C. for attending and sending me a reflective comment
Overview:
* Why we need a Follow table
* Building the Follow table
* Building the parse table
* Typical grammar problems:
* Similar right-hand sides
* Left recursion
* Correcting those problems
/Sample Grammars/
Grammar One:
R1: s ::= b s X
R2: s ::= A
R3: b ::= epsilon
R4: b ::= B
Grammar Two:
R1: s ::= A s D
R2: s ::= b c
R3: b ::= epsilon
R4: b ::= B
R5: c ::= epsilon
R6: c ::= C
Grammar TwoPrime: (Add the following rule)
R7: c ::= D
Grammar Three:
R01: s ::= b X
R02: s ::= A C
R03: s ::= c d
R04: b ::= epsilon
R05: b ::= B
R06: c ::= d e
R07: c ::= epsilon
R08: d ::= D
R09: d ::= epsilon
R10: e ::= A e e
R10: e ::= E
Reminder: Predictive parsing is top-down, left-to-right
Expansion of leftmost nonterminal is done completely through table lookup
How do we build the table?
* Easy thing:
* For each production, p_i, of the form (n ::= rhs)
For each terminal, T, in first(rhs)
TAB[n,T] = p_i
* Background question: What if exists a terminal, S, s.t. S is in two rhs's.
* The grammar is not parsable using this technique
* You can "left factor" the grammar
* Hard thing: What if rhs (in p_i: n ::= rhs) is nullable?
* Background question: What if multiple RHS's are nullable?
* Does it really matter which one you choose? Yes, if you assign meaning to the parse tree
* Such grammars are ambiguous
* We'll require that grammars not have two nullable rhs's for the same lhs.
* One naive strategy: For any cells not filled in, fill in the "nullable" rhs.
* A "better" strategy: For any terminals, T, that can follow n, TAB[n,T] = p_i
* First: Catches errors earlier
* Sometimes necessary to report errors in the grammar
Here's grammar two
R1: s ::= A s D
R2: s ::= b c
R3: b ::= epsilon
R4: b ::= B
R5: c ::= epsilon
R6: c ::= C
First[s] = { A, B, C } Nullable[s] = true
First[b] = { B } Nullable[b] = true
First[c] = { C } Nullable[c] = true
First[s] includes first(bc)
first(bc) = first(b) union (if Nullable[b] then first(c))
When do we apply rule R2? When we're trying to match an s and
(1) we see B or C. (By the "first(bc)" guidelines)
or (2) we're at the end of the input.
or (3) when we see a D
Let's watch this in action
Suppose input is A D ...
s =R1> A s D =matchA> s D =R2> b c D =R3> c D =R5> D =MatchD> END
Suppose input is A E ... (NOT IN OUR LANGUAGE)
s =>R1> A s D =matchA> s D
Crash! E is not in Follow[s], so there's a problem
A slight variant
R1: s ::= A s D
R2: s ::= b c
R3: b ::= epsilon
R4: b ::= B
R5: c ::= epsilon
R6: c ::= C
R7: c ::= D
First[s] = { A,B,C,D } Nullable[s] = true
First[b] = { B } Nullable[b] = true
First[c] = { C,D } Nullable[c] = true
TAB[s,A] = R1 TAB[s,B] = R2 TAB[s,C] = R2 TAB[s,D] = R2
TAB[s,EOI] = R2 (for nullification)
TAB[s,D] = R2 (for nullification)
TAB[c,A] = CRASH TAB[c,B] = CRASH TAB[c,C] = R6 TAB[c,D] = R7
TAB[c,EOI] = R5
TAB[c,D] = R5
Can EOI every follow c in a valid string? Yes, by R2
Why would you ever need to apply R5 upon seeing a D?
Input: AD$
Why would you ever need to apply R7 upon seeing a D?
Input: ADD$
How do you compute the Follow table?
Initialize:
* Follow[n] = empty_set for all nonterminals n
* Follow[start_symbol] = end-of-input ($)
REPEAT
For each rule n ::= rhs
For each "splitting" of the rhs into alpha m beta (m is a nonterminal)
Follow[m] := Follow[m] union first(beta)
if nullable[beta]
Follow[m] := Follow[m] union Follow[n]
UNTIL YOU STOP MAKING CHANGES
Some grammars are bad for predictive parsing:
* Left-overlap (Solution: Left factor)
s ::= A s
s ::= A b
Rewrite as
s ::= A extras
extras ::= s
extras ::= b
* Left-recursive
s ::= A x
s ::= s B
x ::= ...
Make right recursive
s => s s s s s s s s s s s s B
s => A x A x A x A x A x A x A x A x A x A x A x A x B
s is supposed to represent (A x)+ B
s ::= axstar A x B
axstar ::= A x axstar
axstar ::= epsilon