CSC362 2004S, Class 22: Shift-Reduce Parsing (3): Conclusion

Admin:
* Lab tomorrow
* No class Friday

Overview:
* Expression grammar, concluded.
* Problems with shift-reduce parsing.
* SLR parsing.
* LR(k) parsing.

State S16
* Rules:
  o exp ::= exp + term .
  o term ::= term . * factor
  o term ::= term . / factor
* Edges:
  o * -> S10
  o / -> S11

How do we decide whether or not to reduce "exp ::= exp + term" in this situation?
* Pure LR(0) parsers: "Error: Shift-Reduce conflict"
* Hacked LR(0) parsers: If we see neither * nor /, reduce. [Bad idea]
* Recall: We thought about doing something similar for the "reduce to epsilon" case in predictive parsers.
* If the next symbol is in Follow[lhs] (in this case, Follow[exp]), reduce.

SLR parsers ("Simple LR parsers")
* Almost all parser generators generate SLR parsers.

QUIZ: Suppose
  a ::= alpha
  a ::= beta
  a ::= gamma
and alpha is nullable. When do you apply/expand a ::= alpha in a predictive parser?
* When the next symbol is in First(alpha)
* When you see end-of-input [a partial solution]
* When you see something that's in neither First(alpha) nor First(beta) nor First(gamma) [HACK]
* When you see something that's in Follow[a]

Follow[n] = { T in Tokens | s =>* ... n T ... }

Why do we bother with this careful answer?
* Robust error checking during parsing: If we see something not in Follow[a] (in this case), then we'll have an error later, so report NOW rather than later.
* Robust error checking of the grammar: Some grammars will not parse correctly unless we use this rule.

The "power" of a parser generator reflects the number of grammars/languages for which it can generate parsers.
* LL(1) is more powerful than DFA
* LR(0) is more powerful (I think) than LL(1)
* SLR is more powerful than LR(0)
* Our expression grammar requires an SLR parser generator

Ambiguity is evil!

Are there other kinds of conflicts we might encounter in an LR parser?
* Reduce-reduce conflicts:
    a ::= A .
    b ::= A .
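The SLR rule above hinges on Follow[exp]. Here is a minimal sketch (not the course's code; the grammar encoding and function names are my own) of computing FOLLOW sets for the expression grammar by fixpoint iteration. Since no production here is nullable, FIRST of a sequence is just FIRST of its first symbol.

```python
# Sketch: FIRST/FOLLOW computation for the expression grammar.
# SLR uses Follow[exp] to decide when to reduce "exp ::= exp + term" in S16.
GRAMMAR = {
    "exp":    [["exp", "+", "term"], ["term"]],
    "term":   [["term", "*", "factor"], ["term", "/", "factor"], ["factor"]],
    "factor": [["(", "exp", ")"], ["num"]],
}
NONTERMS = set(GRAMMAR)

def first_sets():
    # No production is nullable, so FIRST(rhs) = FIRST(rhs[0]); iterate to fixpoint.
    first = {n: set() for n in NONTERMS}
    changed = True
    while changed:
        changed = False
        for lhs, rhss in GRAMMAR.items():
            for rhs in rhss:
                x = rhs[0]
                new = first[x] if x in NONTERMS else {x}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def follow_sets(start="exp"):
    first = first_sets()
    follow = {n: set() for n in NONTERMS}
    follow[start].add("$")  # end-of-input follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhss in GRAMMAR.items():
            for rhs in rhss:
                for i, x in enumerate(rhs):
                    if x not in NONTERMS:
                        continue
                    if i + 1 < len(rhs):
                        nxt = rhs[i + 1]
                        new = first[nxt] if nxt in NONTERMS else {nxt}
                    else:
                        new = follow[lhs]  # x is rightmost: inherit FOLLOW(lhs)
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow

print(sorted(follow_sets()["exp"]))  # ['$', ')', '+']
```

So Follow[exp] = { +, ), $ }: in state S16 the SLR parser shifts on * or /, reduces "exp ::= exp + term" on +, ), or $, and reports an error on anything else.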
Once again, a Follow table can help. Here's the grammar:
  s ::= a C
  s ::= b D
  a ::= A
  b ::= A

Start state S0:
  s' ::= . s $
  s ::= . a C
  s ::= . b D
  a ::= . A
  b ::= . A
Edge (S0,A) -> S1

State S1:
  a ::= A .
  b ::= A .
* If the next token is C, reduce "a ::= A"
* If the next token is D, reduce "b ::= A"
* "Follow" helps!

Can we have shift-shift conflicts?
* No: the construction of the parser prevents such conflicts.

Are there grammars that are not parsable by SLR parsers?
  s ::= A C
  s ::= a B
  s ::= C a C
  a ::= A

State S0:
  s' ::= . s $
  s ::= . A C
  s ::= . a B
  s ::= . C a C
  a ::= . A
Edge (S0,s) -> S1
Edge (S0,A) -> S2

State S1:
  s' ::= s . $

State S2:
  s ::= A . C
  a ::= A .
* Warning! Potential shift-reduce conflict:
  If the next symbol is C, we should shift.
  If the next symbol is in Follow[a] = { B, C }, we should reduce.

The trick for LR(1) parsers: Keep track of the portion of Follow of interest.

State S0:
  s' ::= . s $
  s ::= . A C    (when the s is followed by $)
  s ::= . a B    (when the s is followed by $)
  s ::= . C a C  (when the s is followed by $)
  a ::= . A      (when the a is followed by B)

State S2:
  s ::= A . C    (when the s is followed by $)
  a ::= A .      (when the a is followed by B)

Morals:
* SLR is insufficient for some grammars.
* You can do better by building "contextual follow tables".
* These automata are often HUGE.
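The LR(1) trick can be made concrete with a small sketch (my own encoding, not from the course). Each item in state S2 carries its own lookahead, so the reduce action for "a ::= A ." fires only on B, while SLR's global Follow[a] = { B, C } would also (wrongly) allow a reduce on C and create the conflict:

```python
# Hypothetical encoding of LR(1) items as (lhs, rhs-with-dot, lookahead).
# State S2 from the notes: shift item on C, reduce item with lookahead B.
S2_ITEMS = [
    ("s", ("A", ".", "C"), "$"),  # dot before terminal C: shift candidate
    ("a", ("A", "."),      "B"),  # dot at end: reduce only on lookahead B
]

def lr1_action(items, lookahead):
    """Compute the LR(1) action for a state on the given lookahead token."""
    actions = set()
    for lhs, rhs, la in items:
        dot = rhs.index(".")
        if dot + 1 < len(rhs) and rhs[dot + 1] == lookahead:
            actions.add(("shift", lookahead))       # terminal after the dot
        elif dot == len(rhs) - 1 and la == lookahead:
            actions.add(("reduce", lhs))            # completed item, matching lookahead
    if len(actions) > 1:
        return ("conflict", frozenset(actions))
    return actions.pop() if actions else ("error", lookahead)

print(lr1_action(S2_ITEMS, "C"))  # ('shift', 'C') -- no conflict, unlike SLR
print(lr1_action(S2_ITEMS, "B"))  # ('reduce', 'a')
```

On lookahead C only the shift item applies, because the reduce item's contextual lookahead is B alone; this is exactly why the per-item lookaheads resolve the conflict that Follow[a] = { B, C } cannot.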