Today in CS362: Finite Automata
Warning:The keyboard has a nasty space bar,so my
writing may be even worse than normal.
Admin:
RegExps due (email)
Look for Pascal token definitions
Standard ref: Jensen & Wirth "Pascal User Manual
and Report"
Regular expression for "C Comment"
Simplified "ab" (not ba)* "ba"
Overview:
What are finite automata?
DFAs
Examples
NFAs
Looking Ahead
Finite Automata:
A computational model for lexical analysis
Simple model: The program can be in different
"states" which represent the knowledge of what's
been seen so far. The program has "transitions"
from state to state based on the next input character
Generic FA routine
state := initial_state
while (!instream.eof()) {
state = follow-transition(state,instream.nextchar());
}
return (state.isAccepting());
Some examples:
ab*a
abb*ba
A Deterministic Finite Automaton (DFA) is a five-tuple
(Sigma: The alphabet
Q: a finite set of states
q_0 in Q: a designated "start state"
delta: state x symbol -> state: transition function
F subset Q: The "final states")
Emily's abb*ba automataon
Sigma: { a, b }
Q: { s0, s1, s2, s3, s4 }
q_0 is s0
F = { s4 }
delta(s0,a) = s1
delta(s1,b) = s2
delta(s2,b) = s3
delta(s3,a) = s4
delta(s3,b) = s3
Observations:
(1) Implicit failure state: No transition given, fail
(2) q_0, F, and delta are all that is really necc
(since everything else can be determined)
Challenge: Comments in C
Alternatively: "Strings over a-e that start with ab,
end with ba, and cannot have ba in the middle".
ab([^b]*|b[^a])*ba [WRONG]
ab([^b]*|b[^a])*b*ba [WHO KNOWS?]
Finite Automataon is "much" easier to build
Making life more fun: Nondeterminism
(1) There can be more than one transition from a
state that uses the same character.
(2) There can be "free" transitions that consume
no input characters
A string is accepted by a NFA if there is *some*
legal finite path to a final state (using that string
as input).
Theorem: Any language that can be expressed by an NFA
can be expressed by a DFA.
Proof: Instructions for converting any NFA to a DFA.
(Coming soon)
Theorem: Any language that can be expressed by a regexp
can be expressed by a NFA.
Proof: Instructions (coming soon)
Theorem: Any NFA or DFA can be expressed by a regular
expression
Proof: Instructions (take 341)
Looking ahead: Wednesday NFA->DFA
(+ extra)
Tuesday: RegExp lab
Starting on RegExp to NFA
Five basic kinds of regular expressions
Rules for converting each kind to an NFA
Key design idea: Every created NFA will have
exactly one start state and exactly one final
state.
Epsilon: