CS362 2011F Compilers

Project, Phase 5: Simple Code Generation

We have now completed the front end of the compiler. We know how tokenize, parse, and analyze programs. We have even written programs to do these key aspects.

It is now time to generate intermediate code.

This stage of the compiler is quite important. Since Pascal is a moderate sized language, it is also a stage that we will break into multiple phases.

There are a few aspects to this stage of the program.

If our programs have no procedures and control, what statements do we have to worry about?

We also need to consider the allocation of variables. As you may recall from a previous discussion in class, you will generally allocate variables on the stack. To keep track of the variables allocated for a particular procedure (or for the program as a whole), you need to develop an activation record.

In the previous assignment, you assigned a type to every expression. In this assignment, you will have to assign a location to every expression. That is, where can we find the value of that expression in memory at run time. (Don't worry about dealing with Boolean expressions right now.)

Activation Records

It is up to you how to deal with activation records.

I would suggest that you keep a stack of activation records and provide the following functions:

As you will note, the only important information we store about the current activation record is the amount of memory it requires. (We do store the offset of each variable, but that gets associated with the variables and need not be part of the data structure itself.


More About Activation Records

As you may recall, when we're writing compilers, we have to keep two different times in mind - the time at which we are compiling the code and the time at which our program is running. At compile time, we make decisions that affect the code that gets executed at run time.

At run time, each executing procedure (as well as the top-level program) has its own stack frame where its parameters, local variables, temporaries, and other data are stored. The compile-time equivalent of the stack frame is the activation record. In the abstract, the activation record tells us (a) how much space the stack frame for the procedure will need at run time and (b) how that space is laid out.

In practice, the information for the current activation record is somewhat spread out. In one place, we have an indication of how much space has currently been allocated to the current activation record. In the symbol table, we keep track of the location of any named thing within the activation record. And in the parse tree, we keep track of the location of many things (e.g., expression results) within the activation record.

For example, consider the following function and suppose we have sixteen bytes of bookkeeping information in each stack frame and that each integer requires four bytes.

function f (int x, int y): integer;
 var z: integer;
begin
  z := x + y;
  f := z * z;
end;

Now, that analysis was for a function. Of course, you're not dealing with functions at this stage of your program. However, we do have a stack frame for the top-level variables, and that's your main problem for this phase of the compiler. That is, we still need to deal with the top-level activation record and will want to do so in a way that extends naturally to nested declarations.

So, we'll keep a stack of information on activation records. For the current activation record, all we need to store is how much space it requires. When we see a parameter, local, or, temporary, we increment that counter, get the information, and store it in the appropriate place (the symbol table for parameters and local variables, in the parse tree for temporaries).


So, What Do We Have To Do For This Assignment?

1. Implement a stack of activation records. For now, you'll only deal with the first activation record (that for the top-level program). You will probably put this in pascal.y.

2. Implement something that lets you build a STAC program (an array of STAC instructions). You can look at stac.y for ideas.

3. Deal with variable declarations in pascal.y. For each variable declaration, you need to allocate space in the activation record and update the symbol table to store information on the location associated with a variable.

4. Deal with expressions. Each expression should have a type and an address.

a. For expressions that are identifiers, you can look up the address in the symbol table. You'll then store that address in the parse tree for use by other rules.

b. For expressions that are constants, you can generate a new address with new_iconstant or new_fconstant. Once again, you'll store that address in the parse tree.

c. For compound expressions, you'll need to allocate space in the activation record for the result (remember, this space really only gets created at run time, and it gets created when we create the stack frame). You'll then generate an instruction to do the appropriate computation and store in that location.

5. Deal with assignment statements. Most assignment statements will get translated as a move instruction (IMOV or FMOV).

6. Deal with output statements (print and println).

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Wed Nov 30 21:38:59 2011.
The source to the document was last modified on Wed Nov 30 21:38:58 2011.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS362/2011F/Handouts/phase-5.html.

Samuel A. Rebelsky, rebelsky@grinnell.edu