CSC362 2004S, Class 8: Finite Automata Admin: * Talk Monday 2:15: ACM Code of Ethics * Talk Monday 4:30: GUIs and HCI * Questions on my code? * Quiz! Overview: * Imperative vs. declarative lexical analysis * A simple model: The finite automaton * Formalizing the model * Drawing the model * Some examples * A string of a's and b's that starts and ends with a * Comments in C (or a variant thereof) * Nondeterministic finite automata * Limitations * Looking ahead About Sam's code: * Why are there these numeric constants if we never use them? * Because I wrote the interface before the code, and I try to never delete things from my interfaces QUIZ Write a regular expression for "strings that start with ab, end with ba, and have no intervening ba in the middle" Do not accept "aba" Alphabet is a,b,c,d * ababa is in the language * abba is the shortest string in the language * abaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaba is also in the language * abacba is also in the language * abacadaba is also in the language /Followup to quiz/ * Why is this an interesting language? /* .... */ * Why do I give you quizzes? * Causes students to read the material in advance * Keep attention focused * We have to give you grades, this provides some input for those grades * Torture is fun * Reminds students to review material that they did not master * Helps professor gauge understanding * Insight into future stuff * Takes up class time * What answers did you give and what is wrong with them? * ^ab[a-d]*?ba$ NO FAIR ASSUMING "non-greedy matching" abbaba * ^ab[a-d]*+ba$ * ab([^b][^a]|[a-d])*ba * Includes abbaba * ab(aa|ab|ac|ad|bb|bc|bd|ca|cb|cc|cd|da|db|dc|dd)*ba * Only matches strings of even length (not ababa) * Admits ab ab ab ba * ab(b[^a]*|[^b]*)?ba * Probably fails to accept some strings in the language: abbcaba Moral of all this: Sometimes it's easier to write something imperatively than declaratively We need a computational alternative to regular expressions: * Simple and fast * Easy to translate to assembly * Looks enough like math that we puzzle outsiders * No more powerful than regular expressions Resulting model: Finite Automata Deterministic Finite Automata (DFA) * Computers have "state"; we can represent this state symbolically or numerically * Primary operation: Read a character and switch state * Termination: When you run out of characters, you're done * If the state you end up in says "it's in the language", it's in the language Formally, every DFA is a five-tuple: * Q is the set of states * q0 in Q is the designated "start state" * delta, a transition function from QxSigma to Q * F subset Q is a set of "final states" We design DFA's graphically