CSC362 2011F, Class 18: Predictive Parsing (2)
Overview:
* Predictive Parsing, Revisited.
* The Epislon Problem.
* Helpful tables: First, Follow, and Nullable
* Building the First table.
* Building the Nullable table.
* Building the Follow table.
* Generalizing the Parser.
Admin:
* There has been a request to move the exam to the Wednesday after fall break.
* Done.
* I'm still working on the next phase of the project and anticipate releasing it tonight. Remember! The more code I write, the less code you have to write.
* Today is an early out for G-N Schools. I will have abbreviated office hours.
* EC: DK's talk tomorrow at 4:30 pm in 3821.
How might one do the identifier thing on the last lab?
* Step zero: Add the table
int values[10];
* Step one: Associate an integer with each identifier.
id : _ID { $$ = atoi (yytext+1); }
* Step two: Add a rule for identifiers as expressiosn
factor : id { $$ = values[$1]; }
* Step three: Add a rule for assignment
line : id _BECOMES exp { values[$1] = $3; printf ("%d\n", $3); }
Reminder: We are doing predictive parsing -
* How to build a simple parser from BNF rules
n1 : RHS1.1
| RHS1.2
| ...
| RHS1.m
* Repeatedly expand the next unexpanded nonterminal
* Given the next terminal, S
if S can begin RHS1.1, build n1 from RHS1.1
if S can begin RHS1.2, build n1 from RHS1.2
...
* Two problems:
* What if S begins more than one RHS?
* Pick the best one? (Which is the best)
* Give up. This grammar is not amenable to this technique.
* Rewrite the grammar so that S cannot begin more than one RHS
* Look more than one symbol ahead (which may just delay the problem)
* Arbitrary choice: Pick the first one
* What if the RHS can derive epsilon?
* Or part of the RHS can derive epsiolon?
Sample grammar
e : b X // rule 1
| C e // rule 2
| D // rule 3
;
b : // epsilon // rule 4
| B b // rule 5
;
Note: b is B*
Matching e, see
B: rule 1
C: rule 2
D: rule 3
X: rule 1
1
How do we do this automatically, so we don't ahve to do so much
mental work.
Build functions and tables
* Two functions:
* first(rhs: string-of-nonterminals-and-terminals) - what can begin this rhs
* nullable(s: string-of-nonterinals-and-terminals)
* Three tables
* First[Nonterminal] - What symbols can begin strings derived from this nonterminal
* Nullable[Nonterminal] - Can this nonterminal derive the empty string?
* Follows[Nonterminal] - What terminal can follow this nonterminal in a derivation
The follows table lets us deal with the epsilon problem
Given a nonterminal n and input of X,
if n is nullable
and X is in Follows[n]
then we can apply the rule that nullifies n
first (str)
if (empty (str))
return empty-set;
else if is_terminal (car (str))
return singleton-set (car (str))
else // is_nonterminal (car (str))
result = First[car (str)]
if (Nullable[car (str)])
result = first (cdr (str));
return result;
nullable (str)
if (empty (str))
return TRUE
else if is_terminal (car (str))
return FALSE
else // is_nonterminal (car (str))
return Nullable[car (str)) && nullable (cdr (str))
How do we build the First and Nullable tables?
build_first ()
For all nonterminals, n
First[n] = { }
repeat
changed = FALSE;
for each rule n : rhs
next = First[n] union first(rhs)
if (next != First[n])
First[n] = next;
changed = TRUE;
until (!changed)
build_nullable ()
For all nonterminals n
Nullable[n] = FALSE
repeat
for each rule n : rhs
if nullable(rhs)
Nullable[n] = true;
until no more changes are made