CSC362 2004S, Class 14: An Expression Grammar Admin: * Late: Sam (predicted in advance) * Note: I'll be late. I have permission from the instructor. * Are there questions on the project? * Are there comments on the lab? * We should strive to write better documentation and design better interfaces than those Sunny guys. Overview: * The expression grammar, revisited * Adding precedence * Adding associativity * Checking the grammar e ::= e binop e e ::= unop e e ::= ( e ) e ::= ID e ::= NUM binop ::= PLUS binop ::= MINUS binop ::= TIMES binop ::= DIVIDE CLAIM: This grammar is ambiguous Proof: The string NUM PLUS NUM TIMES NUM has at least two parse trees Observation: The order can matter, even for multiplication 0.5 * 2 * MAXREAL Lots of similar problems (1) Find some useful test cases (2) Define a priority "Multiplication and division take precedence over lowition and subtraction" (3) Explicate the problem in terms of the rules The first rule (e ::= e binop e) does not differentiate e ::= e highop e e ::= e lowop e e ::= unop e e ::= ( e ) e ::= ID e ::= NUM lowop ::= PLUS lowop ::= MINUS highop ::= TIMES highop ::= DIVIDE Now we need to work on those rules Look at the "highop" rule: e ::= e highop e Neither of the subexpressions should be able to contain an unparenthesized plus or minus e ::= e lowop e e ::= t t ::= t highop t t ::= unop e t ::= ( e ) t ::= ID t ::= NUM lowop ::= PLUS lowop ::= MINUS highop ::= TIMES highop ::= DIVIDE We seem to have resolved the prioritizing issue * Is this grammar still ambiguous? (Yes) NUM MINUS NUM MINUS NUM * Does this grammar define the same language as the original grammar? * Consider the sequence of productions from e to a string in the original language * If, in the original 'alpha e beta' => 'alpha ( e ) beta' In the new 'alpha e beta' => 'alpha t beta' => 'alpha ( e ) beta' * Similarly for every other production * This proves that the new language contains every string in the original language * The same proof techinque should work (show that you can simulate any derivation sequence) (2) Define a priority "Left associative" (3) Explicate the problem in terms of the rules e ::= e lowop e shouldn't have a low priority operator allowed on the right Add "f" ; things with neither high nor low priority operators e ::= e lowop t e ::= t t ::= t highop f t ::= f f ::= unop e f ::= ( e ) f ::= ID f ::= NUM lowop ::= PLUS lowop ::= MINUS highop ::= TIMES highop ::= DIVIDE e = "expression" - most general t = "term" - do not permit unparenthesized addition f = "factor" - do not permit unparenthesized addition or multiplication This grammar is unambiguous (yay) However, ... (1) It's a pain to extend (2) It was a good intellectual exercise to write, but we need to limit our intellectual exercise (3) Takes more steps to parse (4) Creates bigger trees Ad-hoc solution: Return a different tree than you parsed (in particular, use the original grammar for the returned tree)