CSC362 2004S, Class 22: Shift-Reduce Parsing (3): Conclusion

Admin:
* Lab tomorrow
* No class Friday

Overview:
* Expression grammar, concluded.
* Problems with shift-reduce parsing.
* SLR parsing.
* LR(k) parsing.

State S16
* Rules:
  o exp ::= exp + term .
  o term ::= term . * factor
  o term ::= term . / factor
* Edges:
  o * -> S10
  o / -> S11

How do we decide whether or not to reduce "exp ::= exp + term" in this situation?
* Pure LR(0) parsers: "Error: Shift-Reduce conflict"
* Hacked LR(0) parsers: If we see neither * nor /, reduce. [Bad idea]
* Recall: We thought about doing something similar for the "reduce to epsilon" case in predictive parsers.
* If the next symbol is in Follow[lhs] (in this case, Follow[exp]), reduce.

SLR parsers ("Simple LR parsers")
* Almost all parser generators generate SLR parsers.

QUIZ: Suppose
  a ::= alpha
  a ::= beta
  a ::= gamma
and alpha is nullable. When do you apply/expand a ::= alpha in a predictive parser?
* When the next symbol is in First(alpha)
* When you see end-of-input [a partial solution]
* When you see something that's in neither First(alpha) nor First(beta) nor First(gamma) [HACK]
* When you see something that's in Follow[a]

Follow[n] = { T in Tokens | s =>* ... n T ... }

Why do we bother with this careful answer?
* Robust error checking during parsing: If we see something not in Follow[a] (in this case), then we'll have an error later, so report NOW rather than later.
* Robust error checking of the grammar: Some grammars will not parse correctly unless we use this rule.

The "power" of a parser generator reflects the number of grammars/languages for which it can generate parsers.
* LL(1) is more powerful than DFA
* LR(0) is more powerful (I think) than LL(1)
* SLR is more powerful than LR(0)
* Our expression grammar requires an SLR parser generator

Ambiguity is evil!

Are there other kinds of conflicts we might encounter in an LR parser?
* Reduce-reduce conflicts:
    a ::= A .
    b ::= A .
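The SLR rule above hinges on Follow[exp]. Here is a minimal sketch (not the course's code; the grammar encoding and function names are my own) of computing FOLLOW sets for the expression grammar by fixpoint iteration. Since no production here is nullable, FIRST of a sequence is just FIRST of its first symbol.

```python
# Sketch: FIRST/FOLLOW computation for the expression grammar.
# SLR uses Follow[exp] to decide when to reduce "exp ::= exp + term" in S16.
GRAMMAR = {
    "exp":    [["exp", "+", "term"], ["term"]],
    "term":   [["term", "*", "factor"], ["term", "/", "factor"], ["factor"]],
    "factor": [["(", "exp", ")"], ["num"]],
}
NONTERMS = set(GRAMMAR)

def first_sets():
    # No production is nullable, so FIRST(rhs) = FIRST(rhs[0]); iterate to fixpoint.
    first = {n: set() for n in NONTERMS}
    changed = True
    while changed:
        changed = False
        for lhs, rhss in GRAMMAR.items():
            for rhs in rhss:
                x = rhs[0]
                new = first[x] if x in NONTERMS else {x}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def follow_sets(start="exp"):
    first = first_sets()
    follow = {n: set() for n in NONTERMS}
    follow[start].add("$")  # end-of-input follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhss in GRAMMAR.items():
            for rhs in rhss:
                for i, x in enumerate(rhs):
                    if x not in NONTERMS:
                        continue
                    if i + 1 < len(rhs):
                        nxt = rhs[i + 1]
                        new = first[nxt] if nxt in NONTERMS else {nxt}
                    else:
                        new = follow[lhs]  # x is rightmost: inherit FOLLOW(lhs)
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow

print(sorted(follow_sets()["exp"]))  # ['$', ')', '+']
```

So Follow[exp] = { +, ), $ }: in state S16 the SLR parser shifts on * or /, reduces "exp ::= exp + term" on +, ), or $, and reports an error on anything else.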
Once again, a Follow table can help. Here's the grammar:
  s ::= a C
  s ::= b D
  a ::= A
  b ::= A

Start state S0:
  s' ::= . s $
  s ::= . a C
  s ::= . b D
  a ::= . A
  b ::= . A
Edge (S0,A) -> S1

State S1:
  a ::= A .
  b ::= A .
* If the next token is C, reduce "a ::= A"
* If the next token is D, reduce "b ::= A"
* "Follow" helps!

Can we have shift-shift conflicts?
* No: the construction of the parser prevents such conflicts.

Are there grammars that are not parsable by SLR parsers?
  s ::= A C
  s ::= a B
  s ::= C a C
  a ::= A

State S0:
  s' ::= . s $
  s ::= . A C
  s ::= . a B
  s ::= . C a C
  a ::= . A
Edge (S0,s) -> S1
Edge (S0,A) -> S2

State S1:
  s' ::= s . $

State S2:
  s ::= A . C
  a ::= A .
* Warning! Potential shift-reduce conflict:
  If the next symbol is C, we should shift.
  If the next symbol is in Follow[a] = { B, C }, we should reduce.

The trick for LR(1) parsers: Keep track of the portion of Follow of interest.

State S0:
  s' ::= . s $
  s ::= . A C    (when the s is followed by $)
  s ::= . a B    (when the s is followed by $)
  s ::= . C a C  (when the s is followed by $)
  a ::= . A      (when the a is followed by B)

State S2:
  s ::= A . C    (when the s is followed by $)
  a ::= A .      (when the a is followed by B)

Morals:
* SLR is insufficient for some grammars.
* You can do better by building "contextual follow tables".
* These automata are often HUGE.
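The LR(1) trick can be made concrete with a small sketch (my own encoding, not from the course). Each item in state S2 carries its own lookahead, so the reduce action for "a ::= A ." fires only on B, while SLR's global Follow[a] = { B, C } would also (wrongly) allow a reduce on C and create the conflict:

```python
# Hypothetical encoding of LR(1) items as (lhs, rhs-with-dot, lookahead).
# State S2 from the notes: shift item on C, reduce item with lookahead B.
S2_ITEMS = [
    ("s", ("A", ".", "C"), "$"),  # dot before terminal C: shift candidate
    ("a", ("A", "."),      "B"),  # dot at end: reduce only on lookahead B
]

def lr1_action(items, lookahead):
    """Compute the LR(1) action for a state on the given lookahead token."""
    actions = set()
    for lhs, rhs, la in items:
        dot = rhs.index(".")
        if dot + 1 < len(rhs) and rhs[dot + 1] == lookahead:
            actions.add(("shift", lookahead))       # terminal after the dot
        elif dot == len(rhs) - 1 and la == lookahead:
            actions.add(("reduce", lhs))            # completed item, matching lookahead
    if len(actions) > 1:
        return ("conflict", frozenset(actions))
    return actions.pop() if actions else ("error", lookahead)

print(lr1_action(S2_ITEMS, "C"))  # ('shift', 'C') -- no conflict, unlike SLR
print(lr1_action(S2_ITEMS, "B"))  # ('reduce', 'a')
```

On lookahead C only the shift item applies, because the reduce item's contextual lookahead is B alone; this is exactly why the per-item lookaheads resolve the conflict that Follow[a] = { B, C } cannot.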