CSC362 2004S, Class 14: An Expression Grammar
Admin:
* Late: Sam (predicted in advance)
* Note: I'll be late. I have permission from the instructor.
* Are there questions on the project?
* Are there comments on the lab?
* We should strive to write better documentation and design better interfaces than those Sunny guys.
Overview:
* The expression grammar, revisited
* Adding precedence
* Adding associativity
* Checking the grammar
e ::= e binop e
e ::= unop e
e ::= ( e )
e ::= ID
e ::= NUM
binop ::= PLUS
binop ::= MINUS
binop ::= TIMES
binop ::= DIVIDE
CLAIM: This grammar is ambiguous
Proof: The string NUM PLUS NUM TIMES NUM has at least two parse trees
Observation: The order can matter, even for multiplication
0.5 * 2 * MAXREAL
Lots of similar problems
(1) Find some useful test cases
(2) Define a priority
"Multiplication and division take precedence over lowition and subtraction"
(3) Explicate the problem in terms of the rules
The first rule (e ::= e binop e) does not differentiate
e ::= e highop e
e ::= e lowop e
e ::= unop e
e ::= ( e )
e ::= ID
e ::= NUM
lowop ::= PLUS
lowop ::= MINUS
highop ::= TIMES
highop ::= DIVIDE
Now we need to work on those rules
Look at the "highop" rule: e ::= e highop e
Neither of the subexpressions should be able to contain an unparenthesized plus or minus
e ::= e lowop e
e ::= t
t ::= t highop t
t ::= unop e
t ::= ( e )
t ::= ID
t ::= NUM
lowop ::= PLUS
lowop ::= MINUS
highop ::= TIMES
highop ::= DIVIDE
We seem to have resolved the prioritizing issue
* Is this grammar still ambiguous? (Yes)
NUM MINUS NUM MINUS NUM
* Does this grammar define the same language as the original grammar?
* Consider the sequence of productions from e to a string in the original language
* If, in the original 'alpha e beta' => 'alpha ( e ) beta'
In the new 'alpha e beta' => 'alpha t beta' => 'alpha ( e ) beta'
* Similarly for every other production
* This proves that the new language contains every string in the original language
* The same proof techinque should work (show that you can simulate any derivation sequence)
(2) Define a priority
"Left associative"
(3) Explicate the problem in terms of the rules
e ::= e lowop e
shouldn't have a low priority operator allowed on the right
Add "f" ; things with neither high nor low priority operators
e ::= e lowop t
e ::= t
t ::= t highop f
t ::= f
f ::= unop e
f ::= ( e )
f ::= ID
f ::= NUM
lowop ::= PLUS
lowop ::= MINUS
highop ::= TIMES
highop ::= DIVIDE
e = "expression" - most general
t = "term" - do not permit unparenthesized addition
f = "factor" - do not permit unparenthesized addition or multiplication
This grammar is unambiguous (yay)
However, ...
(1) It's a pain to extend
(2) It was a good intellectual exercise to write, but we need to limit our intellectual exercise
(3) Takes more steps to parse
(4) Creates bigger trees
Ad-hoc solution: Return a different tree than you parsed (in particular, use the original grammar for the returned tree)