CSC362 2004S, Class 8: Finite Automata
Admin:
* Talk Monday 2:15: ACM Code of Ethics
* Talk Monday 4:30: GUIs and HCI
* Questions on my code?
* Quiz!
Overview:
* Imperative vs. declarative lexical analysis
* A simple model: The finite automaton
* Formalizing the model
* Drawing the model
* Some examples
* A string of a's and b's that starts and ends with a
* Comments in C (or a variant thereof)
* Nondeterministic finite automata
* Limitations
* Looking ahead
About Sam's code:
* Why are there these numeric constants if we never use them?
* Because I wrote the interface before the code, and I try to never delete things from my interfaces
QUIZ
Write a regular expression for "strings that start with ab, end with ba, and have no intervening ba in the middle" Do not accept "aba" Alphabet is a,b,c,d
* ababa is in the language
* abba is the shortest string in the language
* abaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaba is also in the language
* abacba is also in the language
* abacadaba is also in the language
/Followup to quiz/
* Why is this an interesting language?
/* .... */
* Why do I give you quizzes?
* Causes students to read the material in advance
* Keep attention focused
* We have to give you grades, this provides some input for those grades
* Torture is fun
* Reminds students to review material that they did not master
* Helps professor gauge understanding
* Insight into future stuff
* Takes up class time
* What answers did you give and what is wrong with them?
* ^ab[a-d]*?ba$ NO FAIR ASSUMING "non-greedy matching"
abbaba
* ^ab[a-d]*+ba$
* ab([^b][^a]|[a-d])*ba
* Includes abbaba
* ab(aa|ab|ac|ad|bb|bc|bd|ca|cb|cc|cd|da|db|dc|dd)*ba
* Only matches strings of even length (not ababa)
* Admits ab ab ab ba
* ab(b[^a]*|[^b]*)?ba
* Probably fails to accept some strings in the language: abbcaba
Moral of all this: Sometimes it's easier to write something imperatively
than declaratively
We need a computational alternative to regular expressions:
* Simple and fast
* Easy to translate to assembly
* Looks enough like math that we puzzle outsiders
* No more powerful than regular expressions
Resulting model: Finite Automata
Deterministic Finite Automata (DFA)
* Computers have "state"; we can represent this state symbolically or numerically
* Primary operation: Read a character and switch state
* Termination: When you run out of characters, you're done
* If the state you end up in says "it's in the language", it's in the language
Formally, every DFA is a five-tuple:
* Q is the set of states
* q0 in Q is the designated "start state"
* delta, a transition function from QxSigma to Q
* F subset Q is a set of "final states"
We design DFA's graphically