Held: Monday, April 3, 2006
Summary:
Today we consider techniques for comparatively evaluating the running
time of algorithms.
Related Pages:
Assignments
Notes:
- The exam will be distributed on or before Friday and due the following Friday.
- Today we have a guest lecturer. More details in email.
Overview:
- An Example: Exponentiation.
- Comparing Algorithms.
- Asymptotic Analysis.
- Consider the problem of computing x^{y} for double x
and positive integer y.
- How do we do it? We've seen two strategies, one using a
for loop, one using a clever recursive solution.
- Simple iterative solution: a for loop.
double result = 1.0;
for (int i = 0; i < y; i++)
result *= x;
- Divide-and-conquer solution
To compute x^y
If y is 0
return 1
Else if y is odd
return x*x^(y-1)
Else if y is even
return square(x^(y/2))
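Both strategies can be sketched as complete Java methods (the class and method names here are ours, chosen for illustration):

```java
public class Exponent {
    // Simple iterative solution: y multiplications, so the work
    // grows linearly with y.
    public static double iterPow(double x, int y) {
        double result = 1.0;
        for (int i = 0; i < y; i++) {
            result *= x;
        }
        return result;
    }

    // Divide-and-conquer solution: an odd exponent becomes even after
    // one step, so y is at least halved every two calls -- roughly
    // logarithmically many multiplications.
    public static double fastPow(double x, int y) {
        if (y == 0) {
            return 1.0;
        } else if (y % 2 == 1) {            // y is odd
            return x * fastPow(x, y - 1);
        } else {                            // y is even
            double half = fastPow(x, y / 2);
            return half * half;             // square(x^(y/2))
        }
    }
}
```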
- How much better is the second algorithm (or is it better)?
- Are there other, perhaps more efficient, algorithms we can use?
- As you may have noted, there are often multiple algorithms one can
use to solve the same problem.
- In finding the minimum element of a list, you can step through
the list, keeping track of the current minimum. You could also
sort the list and grab the first element.
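The step-through strategy might look like this in Java (the class and method names are ours; the sketch assumes a nonempty vector):

```java
import java.util.Vector;

public class FindMin {
    // Step through the vector, tracking the smallest value seen so far.
    // One comparison per element.
    public static int min(Vector<Integer> values) {
        int smallest = values.get(0);      // assumes values is nonempty
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i) < smallest) {
                smallest = values.get(i);
            }
        }
        return smallest;
    }
}
```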
- In finding x^{y}, one might use repeated multiplication,
divide and conquer, or even the built-in e^{n} and
natural log procedures.
- You can come up with your own variants.
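For instance, the variant using the built-in exponential and natural-log procedures is a one-liner with Java's Math library (the class name is ours); note that the answer is only as accurate as floating-point rounding allows:

```java
public class ExpLogPow {
    // x^y = e^(y * ln x), for positive x.  A constant number of
    // operations, but subject to floating-point rounding error.
    public static double pow(double x, int y) {
        return Math.exp(y * Math.log(x));
    }
}
```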
- How do we choose which algorithm is the best?
- The fastest/most efficient algorithm.
- The one that uses the fewest resources.
- The clearest.
- The shortest.
- The easiest to write.
- The most general.
- ...
- Frequently, we look at the speed. That is, we consider how
long the algorithm takes to run.
- It is therefore important for us to be able to analyze the running
time of our algorithms.
- Is there an exact number we can provide for the running
time of an algorithm?
- Different inputs lead to different running times. For example,
if there are conditionals in the algorithm (as there are in many
algorithms), different instructions will be executed depending
on the input.
- Not all operations take the same time. For example, addition is
typically quicker than multiplication, and integer addition is
typically quicker than floating point addition.
- The same operation may take different times on different machines.
- The same operation may appear to take different times on the same machine,
particularly if other things are happening on the same machine.
- Many things that affect running time happen behind the scenes and
cannot be easily predicted. For example, the computer might move
some frequently-used data to cache memory.
- Given these obstacles to providing a precise analysis of the running
time of programs, computer scientists developed a technique often
called asymptotic analysis. In the asymptotic analysis of algorithms,
one describes the general behavior of an algorithm in terms of
the size of its input, but without delving into precise details.
- The analysis is asymptotic in that we look at the behavior
as the input gets larger.
- There are many issues to consider in analyzing the asymptotic behavior
of a program. One particularly useful metric is an upper bound
on the running time of an algorithm. We call this the Big-O of
an algorithm.
- Informally, Big-O gives the general shape of the curve of the graph of running time vs. input size.
- That is, is it linear, quadratic, logarithmic, ...
- Big-O is defined somewhat mathematically, as a relationship between
functions.
- f(n) is in O(g(n)) iff
- there exists a number n_{0}
- there exists a number d > 0
- for all n > n_{0},
abs(f(n)) <= d * abs(g(n))
- What does this say? It says that after a certain value,
n_{0},
f(n) is bounded above by a constant (that is, d)
times g(n).
- The constant, d, absorbs constant-factor differences, such as the
varying costs of individual operations on different machines.
- We don't usually identify the d precisely.
- The n_{0} says "for big enough n".
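As a concrete check of the definition, consider f(n) = 3n + 5 and g(n) = n: choosing d = 4 and n_{0} = 5 makes the inequality hold for every larger n. A small numeric sketch (the class and method names are ours):

```java
public class BigOCheck {
    // Check the Big-O inequality for f(n) = 3n + 5, g(n) = n,
    // with d = 4: does 3n + 5 <= 4n hold?  (It does exactly
    // when n >= 5, so n_{0} = 5 works.)
    public static boolean boundHolds(long n) {
        long f = 3 * n + 5;      // f(n)
        long dg = 4 * n;         // d * g(n)
        return f <= dg;
    }
}
```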
- We can apply big-O to algorithms.
- n is the size of the input (e.g., the number of items in
a list or vector to be manipulated).
- f(n) is the running time of the algorithm.
- Some common Big-O bounds
- An algorithm that is in O(1) takes constant time. That is, the
running time is independent of the input. Getting the size of a
vector should be an O(1) algorithm.
- An algorithm that is in O(n) takes time linear in the size of
the input. That is, we basically do constant work for each element
of the input. Finding the smallest element in a list is often an
O(n) algorithm.
- An algorithm that is in O(log_{2}(n)) takes logarithmic time.
While the running time is dependent on the size of the input,
it is clear that not every element of the input is processed.
Many such algorithms involve the strategy of divide and conquer.
- We now have a theoretical grounding for asymptotic analysis. How
do we do it in practice?
- For iterative algorithms, it's often best to count the steps
in an algorithm and then add them up.
- Assume most things take one step.
- If you call a function, you'll need to analyze the running time of that function.
- For loops, analyze the body of the loop and then multiply by the number of times the loop repeats.
- For conditionals, analyze each branch of the conditional and take the larger.
- We may find other ways to count, too.
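To illustrate this style of counting, we can instrument the repeated-multiplication loop from earlier with a step counter (exactly what counts as "one step" is a modeling choice, not gospel):

```java
public class StepCount {
    // Model the steps in the iterative exponentiation loop:
    //   1 step for the initialization (result = 1.0),
    //   3 steps per iteration (test i < y, result *= x, i++),
    //   1 step for the final failed loop test.
    // Total: 3y + 2 steps, which is in O(y).
    public static long stepsForPow(int y) {
        long steps = 1;              // result = 1.0
        for (int i = 0; i < y; i++) {
            steps += 3;              // test, multiply, increment
        }
        steps += 1;                  // final failed test i < y
        return steps;
    }
}
```

The counter grows linearly with y, matching the O(y) bound we'd assign by inspection.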
- After you've taken combinatorics,
you can use recurrence relations for recursive functions.
- We may do some informal recurrence relations.
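As a preview, here is an informal recurrence for the divide-and-conquer exponentiation above, where c stands for the constant work done in a single call:

```
T(0) = c
T(y) = T(y-1) + c,   if y is odd
T(y) = T(y/2) + c,   if y is even
```

Since an odd exponent becomes even after one step, y is at least halved every two calls, so T(y) is in O(log_{2}(y)).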
- Over the next few days, we'll look at a number of examples. Some
starting ones:
- Finding the smallest/largest element in a vector of integers.
- Finding the average of all the elements in a vector of integers.
- Putting the largest element in an array at the end of the array,
if we're only allowed to swap adjacent elements.
- Binary search
- ...
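As a preview of the binary search example, a sketch in Java over a sorted array (the class and method names are ours):

```java
public class BinSearch {
    // Binary search in a sorted array: each comparison halves the
    // portion of the array still under consideration, so the running
    // time is in O(log_{2}(n)).  Returns the index of target, or -1.
    public static int search(int[] sorted, int target) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // avoids int overflow
            if (sorted[mid] == target) {
                return mid;
            } else if (sorted[mid] < target) {
                lo = mid + 1;               // discard the left half
            } else {
                hi = mid - 1;               // discard the right half
            }
        }
        return -1;                          // not found
    }
}
```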