CSC362 2011F, Class 15: Parsing Expressions
Overview:
* Comments in C
* BNF Grammars, Formalized.
* Looking Ahead.
* A Grammar for Expressions.
* Disambiguating the Expression Grammar.
Admin:
* EC for Thursday's CS extra.
* CS picnic, Friday, 7 October 2011
* Questions
Question: What's a regular expression for comments in C?
Note: It's a PITN to distinguish the star character from
Kleene star, so I'll use 'S' for slash and 's' for star.
So we want "Strings that start with Ss, end with sS, and have
no internal sS"
* Doesn't work: Ss.*sS
* Accepts SssSsS
* Doesn't work: Ss(([^s]*)(s[^S])?)*sS
* Does not accept SsssS
* Sam thinks will work: Ss([^s]s[^S])*s*sS
* And you wonder why some people describe regular expressions as
"read only"
* Online version
http://ostermiller.org/findcomment.html
* Similar enough
BNF
nonterminal ::= pattern
Informal
* Lots of rules
Formal:
* A BNF grammar is a fourtuple
N, a set of symbols (the nonterminals)
S, a set of symbols (the tokens or terminals)
Note: Describe sets of strings over S
Start in N, a designated start symbol
R, a set of rules of the form
nonterminal ::= (SN)*
* A string s0 s1 ... sn is in the language described by this grammar
iff
Starting with Start and repeatedly applying productions, you end
up with s0 ... sn
Simple grammar: Palindromes over A and B
N = { pal }
Start = pal
S = { A, B }
pal : empty string
 A pal A
 B pal B
 A
 B
;
Is ABABA a palindrome?
pal > A pal A
> A B pal B A
> A B A B A
One way to use BNF grammars: Try to derive the string (top down)
Another way: Bottom up
A B A B A
    
 pal 
   
 pal 

pal
Looking ahead:
* Try to find a way to simulate the nondeterminism of BNF grammars
byu translating them to deterministic code
* Won't be as easy
* We'll need to limit the grammars
* We'll also do the "same" thing we did with regular expression
* Each rule has code that gets triggered when it is applied
An interesting example: Infix expressions over integers and identifiers
* Base cases
exp : _INTEGER
 _IDENTIFIER
* Standard binary operations
exp : exp binop exp
* Standard unary operations
 unop exp
* Hmm ++3 Maybe not
integer : _UINT
 unop _UINT
* We'll pretend that there are no unary operations
* Parenthesized expression
exp : _LPAREN exp _RPAREN
Putting it together
exp : _INTEGER
 _IDENTIFIER
 exp binop exp
 _LPAREN exp _RPAREN
;
Describes expressions correctly
But *ambiguious* multiple parse trees
Ambiguity is dangerous
How do we fix? A question for next class