CSC362 2004S, Class 20: Shift-Reduce Parsing (1): Introduction Admin: * Thoughts on yesterday's lab? * Questions on phase 2? * Come to tonight's show with "Really Bland Lighting" * Today: Move ahead to shift-reduce parsing. Please read the textbook. Overview: * The basic concept * An example: an(bn|cn) * An example: Expressions * Building shift-reduce parsers Some languages are not predictively parsable, such as A^n(B^n|C^n) n>=0 a ::= b a ::= c b ::= A b B b ::= epsilon c ::= A c C c ::= epsilon Left factor! a ::= A more a ::= epsilon more ::= b B more ::= c C b ::= A b B b ::= epsilon c ::= A c C c ::= epsilon Left factor! a ::= A more a ::= epsilon more ::= A moremore more ::= B more ::= C moremore ::= b B B moremore ::= c C C b ::= A b B b ::= epsilon c ::= A c C c ::= epsilon Broad picture: Distinguising betweek A^k B^k and A^k C^k requires k+1 lookahead * No matter what lookahead you choose for your predictive parser, there exist pairs of strings you cannot predictively parse Ideas: * Build trees bottom-up instead of top-down * Convert right-hand-sides to left-hand-sides (rather than attempting to expand a left-hand side) * Use a stack to keep track of the things not yet (partially converted) * Magically build a table that, given a next input symbol and knowledge of what's on the stack, picks "something" to do next * "shift" the next input token onto the stack * "reduce" the top few elements of the stack (a rhs) to the corresponding lhs * If we exhaust the input and end up with the start symbol on the stack, we win a ::= b a ::= c b ::= A b B b ::= epsilon c ::= A c C c ::= epsilon (input: A, top: anything): shift (input: B, top: A): reduce b ::= epsilon (push 'b') (input: B, top: A b B): reduce b ::= A b B (pop A b B, push b) (input: EOI, top: A b B): reduce b ::= A b B (pop A b B, push b) (input, B, top: b): shift B (input: C, top: A): reduce c ::= epsilon (push 'c') (input: C, top: A c C): reduce c ::= A c C (pop A c C, push c) (input: EOI, top: A c C): reduce c ::= A c C (pop A c C, push c) (input: C, top: c): shift C (input: EOI, top: b): reduce a ::= b (input: EOI, top: c): reduce a ::= c (input: NUM, stack: anything) -> shift (input: ADDOP, stack: NUM) -> reduce NUM to factor (input: ADDOP, stack: term MULOP factor) -> reduce term MULOP factor to term (input: ADDOP, stack: factor) -> reduce factor to term (input: ADDOP, stack: expression ADDOP term) -> reduce expression ADDOP term to expression (input: ADDOP, stack: term) -> reduce term to expression (input: ADDOP, stack: expression) -> shift (input: MULOP, stack: NUM) -> reduce NUM to factor (input: MULOP, stack: term MULOP factor) -> reduce term mulop factor to term (input: MULOP, stack: factor) -> reduce factor to term (input: MULOP, stack: term) -> shift There's some related stuff for end-of-input not shown. Goal of shift-reduce parsing: Write an algorithm that converts a grammar to a shift-reduce table. Key idea similar to NFA to DFA conversion: * States of the DFA are power sets of states of the NFA * Here, we'll build an automaton in which states of the automaton are sets of "annotated productions" - "You could have seen this much of the rhs at this point" * Edges are both terminals and nonterminal Hackzored