CSC362 2011F, Class 18: Predictive Parsing (2) Overview: * Predictive Parsing, Revisited. * The Epislon Problem. * Helpful tables: First, Follow, and Nullable * Building the First table. * Building the Nullable table. * Building the Follow table. * Generalizing the Parser. Admin: * There has been a request to move the exam to the Wednesday after fall break. * Done. * I'm still working on the next phase of the project and anticipate releasing it tonight. Remember! The more code I write, the less code you have to write. * Today is an early out for G-N Schools. I will have abbreviated office hours. * EC: DK's talk tomorrow at 4:30 pm in 3821. How might one do the identifier thing on the last lab? * Step zero: Add the table int values[10]; * Step one: Associate an integer with each identifier. id : _ID { \$\$ = atoi (yytext+1); } * Step two: Add a rule for identifiers as expressiosn factor : id { \$\$ = values[\$1]; } * Step three: Add a rule for assignment line : id _BECOMES exp { values[\$1] = \$3; printf ("%d\n", \$3); } Reminder: We are doing predictive parsing - * How to build a simple parser from BNF rules n1 : RHS1.1 | RHS1.2 | ... | RHS1.m * Repeatedly expand the next unexpanded nonterminal * Given the next terminal, S if S can begin RHS1.1, build n1 from RHS1.1 if S can begin RHS1.2, build n1 from RHS1.2 ... * Two problems: * What if S begins more than one RHS? * Pick the best one? (Which is the best) * Give up. This grammar is not amenable to this technique. * Rewrite the grammar so that S cannot begin more than one RHS * Look more than one symbol ahead (which may just delay the problem) * Arbitrary choice: Pick the first one * What if the RHS can derive epsilon? * Or part of the RHS can derive epsiolon? Sample grammar e : b X // rule 1 | C e // rule 2 | D // rule 3 ; b : // epsilon // rule 4 | B b // rule 5 ; Note: b is B* Matching e, see B: rule 1 C: rule 2 D: rule 3 X: rule 1 1 How do we do this automatically, so we don't ahve to do so much mental work. Build functions and tables * Two functions: * first(rhs: string-of-nonterminals-and-terminals) - what can begin this rhs * nullable(s: string-of-nonterinals-and-terminals) * Three tables * First[Nonterminal] - What symbols can begin strings derived from this nonterminal * Nullable[Nonterminal] - Can this nonterminal derive the empty string? * Follows[Nonterminal] - What terminal can follow this nonterminal in a derivation The follows table lets us deal with the epsilon problem Given a nonterminal n and input of X, if n is nullable and X is in Follows[n] then we can apply the rule that nullifies n first (str) if (empty (str)) return empty-set; else if is_terminal (car (str)) return singleton-set (car (str)) else // is_nonterminal (car (str)) result = First[car (str)] if (Nullable[car (str)]) result = first (cdr (str)); return result; nullable (str) if (empty (str)) return TRUE else if is_terminal (car (str)) return FALSE else // is_nonterminal (car (str)) return Nullable[car (str)) && nullable (cdr (str)) How do we build the First and Nullable tables? build_first () For all nonterminals, n First[n] = { } repeat changed = FALSE; for each rule n : rhs next = First[n] union first(rhs) if (next != First[n]) First[n] = next; changed = TRUE; until (!changed) build_nullable () For all nonterminals n Nullable[n] = FALSE repeat for each rule n : rhs if nullable(rhs) Nullable[n] = true; until no more changes are made