Held: Thursday, 6 May 2004
Today we look at some of the issues involved in the improvement of program
code, focusing primarily on intermediate code.
- That's right: Lab today is used for lecture.
- Custom course eval forms available this afternoon. Please fill one out by tomorrow.
- Tomorrow we meet at Dairy Barn.
- Time for your final questions on the project.
- You are now at the top of my grading heap.
- Improvement basics.
- Analyzing basic blocks.
- Generating better assembly.
- Some common optimizations.
- Since the first compiler, one of the goals of compilation is to
nearly as good as that hand coded by a
- However, it is often better to generate less-good code on
the first pass.
- It may require less effort to generate that code.
- An early optimization at one point may prevent optimizations at other points.
- The original code may be more readable.
- The original code may be easier to prove equivalent to the original program.
- While it is possible for a compiler to improve its code
on the fly, as it generates code from portions of the annotated parse tree (or as it parses the program), it is often easier to improve the code in a separate pass.
- Traditionally, code improvement is called optimization,
although the code it produces is often not optimal
optimal as in
you can't be any better than this.
optimizations are guaranteed to produce unimprovable code because it may always be possible to choose a better algorithm (especially if you know more about the input data)
- Code improvement can typically be done at many stages in a program:
- Source code
- You can choose better algorithms.
- You can identify slower parts of the program and recode them.
- You can turn tail-recursion into iteration (if your compiler doesn't do it for you).
- You can choose better data types
- Intermediate code
- Many of the transformations we'll talk about today.
- Code generation
- Select registers well
- Select translations well
- Target code
Peephole improvement (look at a few lines at a time)
- Algorithmic improvements are often among the most important. For
some problems, the improvement in algorithms over the past thirty years
outpaced the improvements in processors.
- You were better off running the newer algorithm on older
equipment than running the older algorithm on newer equipment.
- It's also worthwhile thinking about the data. For example, integer calculations are typically much faster than real calculations.
- It's also worth thinking about the arrangement of arrays. In most systems, one of the following two nested loops will run much faster for very large arrays. Can you tell why?
for i := 1 to N do
for j := 1 to N do
B[i,j] := A[i,j];
for j := 1 to N do
for i := 1 to N do
B[i,j] := A[i,j]
- These improvements can be broken down into local/peephole improvements (look at a small section of code and make it better) and more global improvements (reorder the flow of control, duplicate some code rather than do a procedure call, ...).
- Many optimizations (both local and global) require an understanding of control flow and data flow in the program.
- So that's where we'll begin.
- In order to do most optimizations, it's important to get a sense of
the control and data flow in a program. We do this by breaking
the program up into so called
basic blocks and edges between
those basic blocks.
- A basic block is a sequence of consecutive statements in which
control enters only at the top and leaves only at the end. That
is, you're guaranteed to execute all the instructions in the
block if you enter the block.
- How to build basic blocks:
- Identify the possible starts of basic blocks (called leaders)
- Anything between a leader and the next leader (not including the
next leader) is a basic block.
- How to identify leaders:
- The first statement is a leader
- The target of a branch is a leader
- Anything immediately following a branch is a leader
- As we've seen in our discussion of register analysis, for each
piece of code (e.g., statement,
series of statements, block, and group of blocks),
we might consider the variables that provide values that are
used in the code (uses) and those variables that have
values set by the code (defines).
- It's fairly easy to specify uses and defines for
individual instructions (e.g., for
ADD b c a, the instruction uses b and c and
- We should then consider how uses and defines might
be defined for groups of statements. We might find that
we have to define variants of uses and defines
to deal with the interaction of types of use. In particular, what
should we do about sequences of instructions?
- What about groups of blocks?
- How should we define a group of blocks?
- How should we find groups of blocks?
- How should we define uses and defines for
groups of blocks.
- As I suggested earlier, many of the issues for building good
programs require a careful understanding of the underlying
- It also requires some understanding of how data are
- For example, if we've been using a three-parameter
add (as I like to do), there are many
ways to translate that to a two-parameter add,
depending on what you already have in registers.
- Different but similar code can require different numbers
of cycles. For example, given variables a, b, and c, all of
which are stored in memory, how
should we implement
ADD b c -> a
- We can put
b in a register, add to that
register, and move into
- We can put
c into registers,
do register addition, and move the result into
- Which is faster? It depends on the relative costs of the
different kinds of move and add operations.
- If any of the values are already in registers, we can eliminate
one or more of the moves.
- We'll leave such careful consideration as a future topic.
- Once we've broken the program down into blocks and thought about
the relationships between blocks, we can then
do some simple transformations on the program.
- Common subexpression elimination
Suppose we have a series of statements like
and there are no changes to y and z in the ellided code. Then we
can replace the latter operation with
ADD y z -> x
ADD y z -> a
mov x a.
- This kind of optimization is possible between blocks, but can
be more difficult. Consider the case in which we have three
identical assignments in three separate blocks, two of which
lead to the third. If there are no other ways to enter the
third block, we can use a copy rather than a computation.
- In practice, we may even add a new variable and assign to
In order to replace the final
ADD a b -> c
ADD a b -> d
ADD a b -> e
ADD by a
MOVE, we need to add another variable
and some extra work.
ADD a b -> t
MOVE t -> c
ADD a b -> t
MOVE t -> d
MOVE t -> e
- Copy propagation Whenever we have code for the
mov x y, we can use
y in subsequent statements (until
y is updated).
- Constant propagation. Whenever a container (memory location,
register, or temporary) contains a known value, use that known value
rather than the container,
- Dead-code elimination If no code reachable from
a statement that defines x uses x, then we don't need to execute
that statement, and can eliminate it.
- Note that careful copy propagation may lead to dead
- Statement swapping (interchange) Given two subsequent
statements, s1 and s2, if the intersection of uses(s1) and defines(s2)
is empty and the intersection of uses(s2) defines(s1) is empty, then
we can swap s1 and s2.
- Why is this helpful? It can lead to other transofrmations.
- Algebraic transformations. We can rewrite some mathematical
operations with more efficient code. For example, rather than
multiplying by 2, we can add to itself or left shift one bit (depending
on the underlying implementation of integers).
- Induction variables and reduction in strength: When two variables change together (as in the following code), use a simpler operation on the second.
ADD $1 t -> t
MUL t $4 -> u
ADD $1 t -> t
ADD $4 u -> u