**Primary:**
[Skip To Body]
[Front Door]
[Current]
[Glance]
[Honesty]
[Instructions]
[Links]
[Search]
[Syllabus]

**Sets:**
[EBoards]
[Examples]
[Exams]
[Handouts]
[Homework]
[Labs]
[Outlines]
[Project]
[Readings]
[Reference]

**ECA:**
[About]
[Grades]
[Quizzes]
[Submit Work]
[Change Password]
[Reset Password]

**Miscellaneous:**
[2001S]
[98F]
[SamR]
[Glimmer Labs]

Back to Finite Automata. On to From Specification to Optimal DFA, Continued.

**Held** Wednesday, September 11, 2002

**Summary**

Today we start to consider how to move from the readable but declarative regular expression notation to the executable but obtuse finite automaton notation.

**Notes**

- A few of you didn't show up to lab yesterday and didn't warn me in advance. That's a no no.
- The CS picnic is next Friday, Sept. 20th. Please sign up for the picnic. (And yes, you can bring your parents, but include a note to that effect).
- I plan to assign the lexical analyzer on Friday or Monday.

**Assignments**

- Do Homework 1 (Due Wednesday, September 17, 2002)

**Overview**

- From regular expressions to NFAs
- An example
- From NFAs to DFAs
- Example continued

- We've now seen two mechanisms for describing tokens (and, in effect, for tokenizing): regular expressions and finite automata.
- How do we go from one to another?
- We convert regular expressions to nondeterministic finite automata using a surprisingly naive translation.
- We convert NFAs to DFAs.
- We then
optimize

the DFAs.

- Our regular expression to NFA converter builds a finite automaton with one final state (and one start state). It uses the recursive definitions of regular expressions to build a finite automaton for a compound expression from the finite automata for the base expressions.
- Each created NFA has a unique start state and a unique final state.
- The NFA for the empty string has two states with an epsilon transition between the states. The first state is the start state. The second state is the final state.
- The NFA for a single symbol is similar, except that the transition is labeled with the symbol.
- The NFA for alternation has a new start state with epsilon edges to the start states of the NFA for the subexpressions and a new final state with epsilon edges from the final states of the NFA for the subexpressions.
- The finite automaton for concatenation uses the start state of the
NFA for the first expression and the final state of the
NFA for the second expression. There is an epislon transition
from the final state of the first NFA to the start state of
the second NFA.
- Alternatively, we can merge the final state of the first NFA with the start state of the second NFA.

- The NFA for repetition has a new start state and a
new final state. There are also epsilon transitions
- from the start state to the final state,
- from the start state to the start state of the sub-NFA,
- from the final state of the sub-NFA to the final state, and
- from the final state of the sub-NFA to the start state of the sub-NFA.

- If it seems to you that some of these extra states and epsilon transitions
are unnecessary, you're probably right. However, you can't get rid of all
of them.
- For example, consider what happens if you don't create any extra states for concatenation or repetition.

- We'll develop some examples
on the fly

. - I'll suggest that we do
integers in C

- (([1-9][0-9]*)|(0[1-9]*)|(0x[1-9A-Fa-f]+))[sSuU]?[lL]?

- How do we turn this lovely NFA into a deterministic computing machine?
- Basically, we build a DFA that simultaneously follows all the paths that
we can go through in the NFA.
- Each state in the DFA is a set of states in the corresponding NFA.

- The algorithm is again fairly simple
- Terminology :
- qx is a state in the NFA and q0 is its start state
- QX is a state in the DFA and Q0 is its start state
- The eqsilon-closure of a DFA state consists of all the states
in the NFA that we can get to with epsilon moves from the states
in the DFA state.
Q0 = { q0 }

*// but there are some states we can reach from q0 at no cost*Q0 = epsilon-closure(Q0) while there are states we haven't processed pick one such state, Qn for each symbol s let tmp be a new, empty, set for each q in Qn add delta(q,s) to tmp end for tmp = epsilon-closure(tmp) if tmp is not in the DFA then let Qi be a new state Qi = tmp add Qi to the DFA else let Qi be the state equivalent to tmp end if add an edge from Qn to Qi in the automaton end for end while for each Qi if there is a q in Qi that is a final state then Qi is a final state end if end for

Back to Finite Automata. On to From Specification to Optimal DFA, Continued.

**Primary:**
[Skip To Body]
[Front Door]
[Current]
[Glance]
[Honesty]
[Instructions]
[Links]
[Search]
[Syllabus]

**Sets:**
[EBoards]
[Examples]
[Exams]
[Handouts]
[Homework]
[Labs]
[Outlines]
[Project]
[Readings]
[Reference]

**ECA:**
[About]
[Grades]
[Quizzes]
[Submit Work]
[Change Password]
[Reset Password]

**Miscellaneous:**
[2001S]
[98F]
[SamR]
[Glimmer Labs]

**Disclaimer**:
I usually create these pages on the fly

, which means that I rarely
proofread them and they may contain bad grammar and incorrect details.
It also means that I tend to update them regularly (see the history for
more details). Feel free to contact me with any suggestions for changes.

This document was generated by
Siteweaver on Fri Dec 6 10:37:57 2002.

The source to the document was last modified on Wed Sep 4 10:08:33 2002.

This document may be found at `http://www.cs.grinnell.edu/~rebelsky/Courses/CS362/2002F/Outlines/outline.06.html`

.

You may wish to validate this document's HTML ; ; Check with Bobby

Glimmer Labs: The Grinnell Laboratory for Interactive Multimedia Experimentation & Researchglimmer@grinnell.edu