[Skip to Body]
Primary:
[Front Door]
[Current]
[Glance]
-
[Honesty]
[Instructions]
[Links]
[Search]
Groupings:
[EBoards]
[Examples]
[Exams]
[Handouts]
[Homework]
[Labs]
[Outlines]
[Project]
[Readings]
[Reference]
Misc:
[2001S]
[2002F]
[SamR]
Summary:
In today's lab, you will explore the standard Unix lexical analyzer generator,
lex
.
Collaboration: Feel free to work on this lab in pairs or trios.
Turning It In: Don't bother turning this lab in. Do, however, ask me questions when you find that you are confused.
Background:
lex
(or flex
, in its GNU implementation) is one
of the standard lexical analyzer generators. You can use lex
to build lexers which interact with parsers. You can also use
lex
to build programs that react in other ways to regular
expressions they encounter. We will focus on the second use today,
and write lex programs that print output that depends on the input
they receive.
You can get more information on flex
from the man page.
Please don't print the man page; it's over sixty pages long.
Structure of a Lex File:
c-code-prefixed-by-tabs-and-named-regular-expressions %% pattern1 c-code-to-execute-upon-matching-pattern pattern2 c-code-to-execute-upon-matching-pattern ... patternn c-code-to-execute-upon-matching-pattern %% more-c-code
Sample Lex File:
%% " " { } \n { printf("Howdy!\n"); } begin { printf("Keyword-Begin\n"); } [a-z][a-z0-9_]* { printf("Look, an identifier: '%s'\n",yytext); } %% int yywrap() { return 1; }
The yywrap
function gets called by the lexer whenever it
hits the end of file. If yywrap
returns a nonzero value,
the lexer stops.
Using Lex Files:
Suppose we saved this as example.yy
(yes, .yy
is the normal suffix for lex
files). To build an
executable from it, you would do the following:
% flex example.yy # Generates file lex.yy.c % gcc -lfl lex.yy.c # Generates file a.out % a.out < file
Problems:
1. Figure out what the example file does with things not described in the regular expressions.
2.a. Determine whether flex follows the longest token rule (e.g., whether, in the example above, given input of begine
it matches the keyword begin
or the identifier begine
).
2.b. If flex does follow the longest token rule, determine what flex does if it has to backtrack. For example, given keywords of begin and beginning (and no identifiers), what does it do if it sees beginn
. (You'll need to write your own flex proram to test this issue.)
3. Write separate flex programs to separate input in each of the following ways. (I'd prefer that you print out things with their type.) The pattern/action rules in your program will generally look like
pattern { printf("DESCRIPTIVE-WORD[%s]\n", yytext); }
For example, the output of the first program might be something like the following:
WHITESPACE[ ] NORMAL[11February2004] WHITESPACE[ ] NORMAL[Dear] WHITESPACE[ ] NORMAL[Fred,]
4. Look at one of the copies of lex.yy.c
and see if you can figure out
where and how the finite automaton is coded. You might want to try
variants of the lex
program to see what difference it makes.
Monday, 16 September 2002 [Samuel A. Rebelsky]
Tuesday, 17 September 2002 [Samuel A. Rebelsky]
Wednesday, 11 February 2004 [Samuel A. Rebelsky]
Thursday, 12 February 2004 [Samuel A. Rebelsky]
[Skip to Body]
Primary:
[Front Door]
[Current]
[Glance]
-
[Honesty]
[Instructions]
[Links]
[Search]
Groupings:
[EBoards]
[Examples]
[Exams]
[Handouts]
[Homework]
[Labs]
[Outlines]
[Project]
[Readings]
[Reference]
Misc:
[2001S]
[2002F]
[SamR]
Disclaimer:
I usually create these pages on the fly
, which means that I rarely
proofread them and they may contain bad grammar and incorrect details.
It also means that I tend to update them regularly (see the history for
more details). Feel free to contact me with any suggestions for changes.
This document was generated by
Siteweaver on Wed May 5 11:46:48 2004.
The source to the document was last modified on Thu Feb 12 09:44:10 2004.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS362/2004S/Labs/lab.04.html
.
You may wish to validate this document's HTML ; ; Check with Bobby
Samuel A. Rebelsky, rebelsky@grinnell.edu