CSC362 2004S, Class 20: Shift-Reduce Parsing (1): Introduction
Admin:
* Thoughts on yesterday's lab?
* Questions on phase 2?
* Come to tonight's show with "Really Bland Lighting"
* Today: Move ahead to shift-reduce parsing. Please read the textbook.
Overview:
* The basic concept
* An example: an(bn|cn)
* An example: Expressions
* Building shift-reduce parsers
Some languages are not predictively parsable, such as
A^n(B^n|C^n) n>=0
a ::= b
a ::= c
b ::= A b B
b ::= epsilon
c ::= A c C
c ::= epsilon
Left factor!
a ::= A more
a ::= epsilon
more ::= b B
more ::= c C
b ::= A b B
b ::= epsilon
c ::= A c C
c ::= epsilon
Left factor!
a ::= A more
a ::= epsilon
more ::= A moremore
more ::= B
more ::= C
moremore ::= b B B
moremore ::= c C C
b ::= A b B
b ::= epsilon
c ::= A c C
c ::= epsilon
Broad picture: Distinguising betweek A^k B^k and A^k C^k requires k+1 lookahead
* No matter what lookahead you choose for your predictive parser, there exist pairs of strings you cannot predictively parse
Ideas:
* Build trees bottom-up instead of top-down
* Convert right-hand-sides to left-hand-sides (rather than attempting to expand a left-hand side)
* Use a stack to keep track of the things not yet (partially converted)
* Magically build a table that, given a next input symbol and knowledge of what's on the stack, picks "something" to do next
* "shift" the next input token onto the stack
* "reduce" the top few elements of the stack (a rhs) to the corresponding lhs
* If we exhaust the input and end up with the start symbol on the stack, we win
a ::= b
a ::= c
b ::= A b B
b ::= epsilon
c ::= A c C
c ::= epsilon
(input: A, top: anything): shift
(input: B, top: A): reduce b ::= epsilon (push 'b')
(input: B, top: A b B): reduce b ::= A b B (pop A b B, push b)
(input: EOI, top: A b B): reduce b ::= A b B (pop A b B, push b)
(input, B, top: b): shift B
(input: C, top: A): reduce c ::= epsilon (push 'c')
(input: C, top: A c C): reduce c ::= A c C (pop A c C, push c)
(input: EOI, top: A c C): reduce c ::= A c C (pop A c C, push c)
(input: C, top: c): shift C
(input: EOI, top: b): reduce a ::= b
(input: EOI, top: c): reduce a ::= c
(input: NUM, stack: anything) -> shift
(input: ADDOP, stack: NUM) -> reduce NUM to factor
(input: ADDOP, stack: term MULOP factor) -> reduce term MULOP factor to term
(input: ADDOP, stack: factor) -> reduce factor to term
(input: ADDOP, stack: expression ADDOP term) -> reduce expression ADDOP term to expression
(input: ADDOP, stack: term) -> reduce term to expression
(input: ADDOP, stack: expression) -> shift
(input: MULOP, stack: NUM) -> reduce NUM to factor
(input: MULOP, stack: term MULOP factor) -> reduce term mulop factor to term
(input: MULOP, stack: factor) -> reduce factor to term
(input: MULOP, stack: term) -> shift
There's some related stuff for end-of-input not shown.
Goal of shift-reduce parsing: Write an algorithm that converts a
grammar to a shift-reduce table.
Key idea similar to NFA to DFA conversion:
* States of the DFA are power sets of states of the NFA
* Here, we'll build an automaton in which states of the automaton are sets of "annotated productions" - "You could have seen this much of the rhs at this point"
* Edges are both terminals and nonterminal
Hackzored