Today in CS362: Shift-Reduce Parsing

Missing
  Adam, Erik

Admin
  HW2 Due!
  Project, Phase 2 coming soon
  Read the Red Dragon
  Putnam Exam
  Grading conference papers

Overview
  Ideas behind shift-reduce parsing
  Our fun expression example, revisited
  How to build a shift-reduce parser, started

What is predictive-parsing?
  To build a predictive parser, you build three helper
    tables.
  Left-to-right parsing technique
  Involves looking at terminals
  Not all grammars are amenable
  You can tell from context and the next terminal what
    production to apply

  Top-down left-right parsing procedure
    Start with a nonterminal and repeatedly rpelace
    the leftmost nonterminal
  You can make the decision as to which production to
    use in replacing the leftmost nonterminal by looking
    at that nonterminal and a the next input symbol or
    symbols.

What is shift reduce parsing?
  Bottom-up parsing technique
  Repeatedly replaces substrings with nonterminals
    until we hit start symbol or crash-and-burn
  Which substring do we replace?  The leftmost one
    which matches a right-hand-side "appropriately"
  We call this a LR parsing
    (1) Looks at input from left to right
    (2) Simulates *rightmost* derivation
  What about "shifting" and "reducing"?
  key implementation ideas: 
    * Use a stack
    * Two options upon seeing an input symbol:
      + SHIFT it onto the stack
      + REDUCE a RHS on the stack
    * Behind the scenes: The stack also contains some
      extra values that describe what's on the stack
    * Our decision is made by a table (that you'll soon
      learn how to build)

Stack | Input
Bottom Top | Front Back
$ | 3 + 4 * 5 + 6 $
SHIFT
  $ 3 | + 4 * 5 + 6 $
REDUCE F -> num
  $ F | + 4 * 5 + 6 $
REDUCE T -> F
  $ T | + 4 * 5 + 6 $
REDUCE E -> T
  $ E | + 4 * 5 + 6 $
SHIFT 
  $ E + | 4 * 5 + 6 $
SHIFT 
  $ E + 4 | * 5 + 6 $
REDUCE F -> num
  $ E + F | * 5 + 6 $
REDUCE T -> F
  $ E + T | * 5 + 6 $
SHIFT 
  $ E + T * | 5 + 6 $
SHIFT 
  $ E + T * 5 | + 6 $
REDUCE F -> num
  $ E + T * F | + 6 $
REDUCE T -> T * F
  $ E + T | + 6 $
REDUCE E -> E + T
  $ E | + 6 $

How do we build the table (other than relying on intuition?)
* There are a lot of similar different strategies for building
  shift-reduce tables
* LR(k): Build the table using k tokens of lookahead
* LR(0): No tokens of lookahead

Idea: Build a automaton
* States of that automaton are sets of "potential partial 
  matches of RHS's"
* Transitions are labeled by nonterminals and terminals
* Need to add new special start production,
  S' -> S $
* Notation: . marks "here's where I am in the RHS"

State 0:
 S' -> . E $
 E -> . E + T
 E -> . T
 T -> . T * F
 T -> . F
 F -> . num
 F -> . ( E )

State 1:
 S' -> E . $
 E -> E . + T

State 2:
 S' -> E $ .
 MATCHED

State 3:
  E -> T .
  REDUCE (sometimes)
  E -> T . * F

State 4:
  E -> E + . T

State 5:
  E -> E + T .
  REDUCE

State 6:
  T -> F .
  REDUCE (sometimes)

State 7:
  F -> num .
  REDUCE (always)

Transition table
 (0,E) -> 1
 (0,T) -> 3
 (0,F) -> 6
 (0,num) -> 7
 (1,$) -> 2
 (1,+) -> 4
 (4,T) -> 5

REALLY SIMPLE EXAMPLE

$ S0 E S1 $ S2 |