CSC362 2004S, Class 21: Shift-Reduce Parsing (2) Admin: * Concluded Wednesday * Labs Thursday * No class Friday * Projects due * Time spent last week? * See me *early* * Sam's idea: Turn in what you have tonight by 10 p.m. * Include a short essay that points me to the best part of your code * Preferably with alternate parser that handles that (and input) * Include a summary of what you already know is wrong * Explain what made this part so hard * Sample sol'n to be written over break (I hope) * Suggestion for Sam: Violate copyright and put that grammar online * At the end of the semester, we'll look at speeding up the process * More smaller parts * Disambiguate and left-factor the grammar early Overview: * Shift-reduce automata * The expression grammar and its automaton * Building LR(0) automata Key idea in shift-reduce automata: * Automata with a stack * At each step, look at state of automaton (on top of the stack) and next input and do one of the following * shift a symbol onto the stack * reduce the stuff on top of the stack The states of the automaton represent the potential state of parsing of a set of rules. * When we reach a state that has the4 end of a RHS, we sometimes reduce. * In every other case, we try to follow an edge that corresponds to the next input symbol. Question: What if different RHS's are completed? * It depends on the construction algorithm. Shift-reduce automata are traditionally called LR automata * L: Scan input from left to right * R: Simulate a top-down right-to-left derivation Predictive parsers are LL * L: Scan input from left to right * L: Always derive the leftmost nonterminal Simplest LR automata (smallest size, easiest to build): LR(0) * Add rule 'start ::= exp EOI' * No recursion for the start symbol! * Explicitly mention end of input * Start state of the automaton begins as S00: { start ::= . exp EOI exp ::= . exp + term exp ::= . exp - term exp ::= . term term ::= . term * factor term ::= . term / factor term ::= . factor factor ::= . ID factor ::= . NUM factor ::= . ( exp ) } * AN expansion principle: If you might need an exp, add all the rules for exp * That principle, generalized: If a RHS contains . nonterm, add all the rules with nonterm on the left. Edge (S00,exp) -> S01 S01: { start ::= exp . EOI exp ::= exp . + term exp ::= exp . - term } Edge (S01, EOI) -> S07 Edge (S01, +) -> S08 Edge (S01, -) -> S09 Edge (S00,term) -> S02 S02: { exp ::= term . term ::= term . * factor term ::= term . / factor } Edge (S02,*) -> S10 Edge (S02,/) -> S11 Edge (S00,factor) -> S03 S03: { term ::= factor . } Edge (S00,ID) -> S04 S04: { factor ::= ID . } Edge (S00,NUM) -> S05 S05: { factor ::= NUM . } Edge (S00,OPEN) -> S06 S06: { factor ::= ( . exp ) exp ::= . exp + term exp ::= . exp - term exp ::= . term term ::= . term * factor term ::= . term / factor term ::= . factor factor ::= . ID factor ::= . NUM factor ::= . ( exp ) } Edge (S06, term): S02 Edge (S06, factor): S03 Edge (S06, ID): S04 Edge (S06, NUM): S05 Edge (S06, OPEN): S06 S07: { ACCEPT start ::= exp EOI . } S08: { exp ::= exp + . term term ::= . term * factor term ::= . term / factor term ::= . factor factor ::= . ID factor ::= . NUM factor ::= . ( exp ) } S09: { exp ::= exp - . term term ::= . term * factor term ::= . term / factor term ::= . factor factor ::= . ID factor ::= . NUM factor ::= . ( exp ) } S10: { term ::= term * . factor factor ::= . ID factor ::= . NUM factor ::= . ( exp ) } Edge(S10,factor) -> S12 Edge(S10,ID) -> S04 Edge(S10,NUM) -> S05 Edge(S10,OPEN) -> S06 S11 { term ::= term / . factor factor ::= . ID factor ::= . NUM factor ::= . ( exp ) } Edge(S11,factor) -> S13 Edge(S11,ID) -> S04 Edge(S11,NUM) -> S05 Edge(S11,OPEN) -> S06 S12 { term ::= term * factor . } S13 { term ::= term / factor . }