# Class 23: Asymptotic Analysis

Back to Program Documentation. On to Pointers and Functions.

Held: Wednesday, 26 February 2003

Summary: Today we consider mechanisms for analyzing the running time of algorithms.

Assignments:

Notes:

• Sorry about neglecting the outline for yesterday's class.
• This outline was created post-course after an impromptu discussion of asymptotic analysis.

## Comparing Algorithms

• As you may have noted, there are often multiple algorithms one can use to solve the same problem.
• In finding the minimum element of a list, you can step through the list, keeping track of the current minimum. You could also sort the list and grab the first element.
• In finding xy, one might use repeated multiplication, divide and conquer, or even the built-in en and natural log procedures.
• You can come up with your own variants.
• How do we choose which algorithm is the best?
• The fastest/most efficient algorithm.
• The one that uses the fewest resources.
• The clearest.
• The shortest.
• The easiest to write.
• The most general.
• ...
• Frequently, we look at the speed. That is, we consider how long the algorithm takes to run.
• It is therefore important for us to be able to analyze the running time of our algorithms.

## Difficulties Analyzing Running Times

• Is there an exact number we can provide for the running time of an algorithm?
• Surprisingly, no.
• Different inputs lead to different running times. For example, if there are conditionals in the algorithm (as there are in many algorithms), different instructions will be executed depending on the input.
• Not all operations take the same time. For example, addition is typically quicker than multiplication, and integer addition is typically quicker than floating point addition.
• The same operation make take different times on different machines.
• The same operation may appear to take different times on the same machine, particularly if other things are happening on the same machine.
• Many things that affect running time happen behind the scenes and cannot be easily predicted. For example, the computer might move some frequently-used data to cache memory.

## Asymptotic Analysis

• Noting problems in providing a precise analysis of the running time of programs, computer scientists developed a technique which is often called asymptotic analysis. In asymptotic analysis of algorithms, one describes the general behavior of algorithms in terms of the size of input, but without delving into precise details.
• The analysis is asymptotic in that we look at the behavior as the input gets larger.
• There are many issues to consider in analyzing the asymptotic behavior of a program. One particularly useful metric is an upper bound on the running time of an algorithm. We call this the Big-O of an algorithm.
• Big-O is defined somewhat mathematically, as a relationship between functions.
• f(n) is in O(g(n)) iff
• there exists a number n0
• there exists a number d > 0
• for all n > n0, abs(f(n)) <= abs(d*g(n))
• What does this say? It says that after a certain value, n0, f(n) is bounded above by a constant (that is, d) times g(n).
• The constant, d, helps accommodate the variation in the algorithm.
• We don't usually identify the d precisely.
• The n0 says for big enough n.
• We can apply big-O to algorithms.
• n is the size of the input (e.g., the number of items in a list or vector to be manipulated).
• f(n) is the running time of the algorithm.
• Some common Big-O bounds
• An algorithm that is in O(1) takes constant time. That is, the running time is independent of the input. Getting the size of a vector should be an O(1) algorithm.
• An algorithm that is in O(n) takes time linear in the size of the input. That is, we basically do constant work for each element of the input. Finding the smallest element in a list is often an O(n) algorithm.
• An algorithm that is in O(log_2(n)) takes logarithmic time. While the running time is dependent on the size of the input, it is clear that not every element of the input is processed. Many such algorithms involve the strategy of divide and conquer.

## Eliminating Constants

• One of the nice things about asymptotic analysis is that it makes constants unimportant because they can be hidden in the d.
• If f(n) is 100*n seconds and g(n) is 0.5*n seconds, then
• f(n) is in O(g(n)) [let d be 200]
• g(n) is in f(n) [let d be 1]
• If f(n) is 100*n seconds and g(n) is n*n seconds, then f(n) is in O(g(n)) [let n0 be 100 and d be 1]
• However, g(n) is not in O(f(n)). Why not?
• Suppose there were an n0 and a d.
• Consider what happens for n = 101*d.
• d*f(n) = d*100*101*d = d*d*100*101.
• However, g(n) = d*d*101*101, which is even larger.
• If n0 is greater than 101d, we'll still have this problem [proof left to reader].
• Since constants can be eliminated, we normally don't write them.
• That is, we say that the running time of an algorithm is O(n) or O(n2) or ....

## Asymptotic Analysis in Practice

• We now have a theoretical grounding for asymptotic analysis. How do we do it in practice?
• At this point in your career, it's often best to count the steps in an algorithm and then add them up. After you've taken combinatorics, you can use recurrence relations.
• Over the next few days, we'll look at a number of examples. Some starting ones.
• Finding the smallest/largest element in a vector of integers.
• Finding the average of all the elements in a vector of integers.
• Putting the largest element in an array at the end of the array. if we're only allowed to swap subsequent elements.
• Computing the nth Fibonacci number.
• ...

## The Role of Details

• Although the previous discussion may make it seem like precise details of algorithms aren't important, some details are particularly important.

## Eliminating Constants

• One of the nice things about asymptotic analysis is that it makes constants unimportant because they can be hidden in the d.
• If f(n) is 100n seconds and g(n) is 0.5n seconds, then
• f(n) is in O(g(n)) [let d be 200]
• g(n) is in f(n) [let d be 1]
• If f(n) is 100n seconds and g(n) is n2 seconds, then f(n) is in O(g(n)) [let n0 be 100 and d be 1]
• However, g(n) is not in O(f(n)). Why not?
• Suppose there were an n0 and a d.
• Consider what happens for n = 101d.
• d*f(n) = d*100*101*d = d*d*100*101.
• However, g(n) = d*d*101*101, which is even larger.
• If n0 is greater than 101d, we'll still have this problem [proof left to reader].
• Since constants can be eliminated, we normally don't write them.
• That is, we say that the running time of an algorithm is O(n) or O(n2) or ....

## Doing Big-O Analysis

• Most computer scientists begin their study of Big-O by analyzing simple iterative functions.
• Here's how you start:
• Count each step as one.
• Count each conditional as taking one plus the maximum of the number of steps in each branch.
• Count each loop as taking the number of repetitions times the cost of the body (perhaps plus one).
• Count each procedure call as taking one plus the number of steps in the procedure call.
• You can then simplify the result by dropping lower-order terms and eliminating constants.

## Relating Functions

• Note that some of the counting can be hard, so we may also need to estimate throughout the process.
• We can often express the running time of an algorithm as a composition of other functions.
• For example, we might have a primary control structure which repeats its contents O(h(n)) times and within that control structure, we have a subalgorithm that takes O(g(n)) time.
• Similarly, we might break our algorithm up into two independent algorithms, one of which takes O(h(n)) time and one of which takes O(g(n)) time.
• We might also look at other relationships.
• Here are some questions we might ask ourselves about composition of functions.
• If f(n) is in O(h(n)) and g(n) is O(k(n)), then what can we say about f(n) + g(n)?
• If f(n) is in O(h(n)) and g(n) is O(k(n)), then what can we say about f(n) * g(n)?
• If f(n) is in O(g(n)) and g(n) is O(h(n)) then what can we say about f(n) and h(n)?
• If f(n) is in O(kg(n)) and k is a constant, then what else can we say about f(n) and g(n)?
• If f(n) is in O(g(n) + h(n)) and g(n) is in O(h(n)), then what simpler thing can we say about f(n)?

## Analyzing Recursive Procedures

• Unfortunately, the techniques for analyzing algorithms we discussed above focuse primarily on iterative algorithms. What about recursive algorithms?
• The process essms inherently more difficult.
• However, we can use a similar strategy.
• Count the number of recursive calls.
• Count the time spent during each recursive call (except for the time spent in subsequent recursive calls).
• How do you count the number of recursive calls? Often, the shrink function gives you a sense of how many recursive calls you need to do
• That is, how many calls to shrink does it take to reduce the argument to the base case.
• It is also possible to use the concept of recurrence relations to determine the running time of an algorithm.
• For example, if a recursive algorithm takes c steps and reduces the parameter by one, we might express the running time as a function, time(n) which represents the running time on input of n elements:
• time(n) = c + time(n-1)
• time(1) = c
• This equation is relatively easy to solve "intuitively".
• time(n) = c + time(n-1)
• = c + c + time(n-2)
• = ...
• = c + c + ... + time(1)
• = c + c + ... + c [n times]
• = c*n
• Other interesting recurrence relations:
• time(n) = n + time(n-1)
• time(n) = c n + time(n-1)
• time(n) = c + time(n/2)
• time(n) = n + time(n/2)
• time(n) = c * time(n-1)
• time(n) = c * time(n/2)

## Experimental Analysis

• Particularly as you start to analyze algorithms, it may be helpful to analyze them experimentally as well as abstractly.
• What do I mean when I say analyze them experimentally? I mean that we can time them on a variety of inputs and graph the results.
• If the experimental and the abstract results match, we can be fairly confident in the abstract results.
• If they don't, we may need to try again.
• Note that some analyses can be very difficult.

## History

Tuesday, 7 January 2003 [Samuel A. Rebelsky]

• Created generic version to set up course.

Wednesday, 26 February 2003 [Samuel A. Rebelsky]

• Filled in details.

Wednesday, 27 February 2003 [Samuel A. Rebelsky]

• Filled in details again, after impromptu class session. These details are primarily taken from other class outlines I've written for other classes.

Back to Program Documentation. On to Pointers and Functions.

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Fri May 2 14:20:36 2003.
The source to the document was last modified on Thu Feb 27 23:31:11 2003.
This document may be found at `http://www.cs.grinnell.edu/~rebelsky/Courses/CS195/2003S/Outlines/outline.23.html`.

You may wish to validate this document's HTML ; ; Check with Bobby

Samuel A. Rebelsky, rebelsky@grinnell.edu