CSC153 2004S, Class 20: Algorithm Analysis (1)
Admin:
* Questions on homework 4?
* Try working with the arrays directly, and not just converting to lists and back
* Optional reading available this afternoon.
* Algorithm design and hop
* Notes on tail recursion
* What's the benefit of tail recursion? Don't you still need the stack?
* No, a good implementation can replace the arguments "in place"
* So what's the difference in efficiency between tail-recursion and iteration?
* There is none
* So why bother with tail-recursion? (Why not just have iteration?)
* Because the designers of Scheme didn't want to include iteration. (Well ... more "Sam didn't want to teach you the iteration operation in Scheme.")
* Some algorithms are more natural to solve recursively. (Are there algorithms that are more naturally tail-recursive than iterative?)
* Tail-recursion is a natural stage between general recursion and iteration.
* Often easier to reason about tail-recursive procedures.
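The distinction can be sketched in Python (which, unlike a good Scheme implementation, does not actually eliminate tail calls, but the shape of the two styles is the same; the function names here are made up for illustration):

```python
def sum_general(lst):
    """General recursion: the addition happens AFTER the
    recursive call returns, so every call's frame must wait."""
    if not lst:
        return 0
    return lst[0] + sum_general(lst[1:])

def sum_tail(lst, acc=0):
    """Tail recursion: the recursive call is the last action.
    The accumulator 'acc' carries the partial sum, so a good
    implementation can replace the arguments "in place" and
    reuse the current frame."""
    if not lst:
        return acc
    return sum_tail(lst[1:], acc + lst[0])
```

Note how the tail-recursive version is one small step away from a loop: replace the call with "update `lst` and `acc`, jump to the top".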
Overview:
* Choosing between different algorithms that solve the same problem
* Example: suml vs. sumr
* Example: Verified sum
* Measuring the running time of an algorithm
* Problems
* Asymptotic analysis
Given two algorithms that solve the same problem, how do you decide which
is "better"?
* The one that is quicker to write.
* The one that runs more quickly. <--- The key issue
* The one that is easier to read.
* The one that uses the least memory.
* Microsoft solution: The one that does everything, including making sandwiches
* The one that is shortest (least code)
* The one that works correctly (we'll assume they both work correctly)
* The one that is most user friendly (however, they don't interact with the user)
* The one easiest to modify for other problems.
* The one easiest to prove correct.
Given two algorithms, how do you decide which is "faster"?
* Time them, running on the same input
* Unfortunately, different inputs of the same size may take different numbers of steps
* Unfortunately, an algorithm that works better on small inputs does not necessarily work better on large inputs
Comparing suml to sumr:
* For an input of size 100,000, suml takes about .1 seconds and sumr takes about .16 seconds
* For an input of size 200,000, suml takes about .2 seconds and sumr takes about .8 seconds
* For an input of size 400,000, suml takes about .4 seconds and sumr takes about 2 seconds
Observations:
* The running time of suml is directly proportional to the size of the array
* The running time of sumr doesn't seem to be as proportional
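As a rough sketch of why the two might differ, here are Python versions in the two styles (the names suml and sumr, and these exact shapes, are assumptions about the class's Scheme procedures, not their actual code):

```python
def suml(lst):
    """Left-to-right sum in the tail-recursive/iterative style:
    the running total is carried along, so extra space stays constant."""
    total = 0
    for x in lst:
        total += x
    return total

def sumr(lst):
    """Right-to-left sum in the generally recursive style: every
    element's addition is deferred until the recursion reaches the
    end, so the pending work grows with the length of the input."""
    if not lst:
        return 0
    return lst[0] + sumr(lst[1:])
```

(In Python the slice `lst[1:]` also copies the list, which exaggerates sumr's cost; the point in class is the growing pile of deferred stack frames.)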
The Mathematician's solution: Come up with a more general model for the *shape*
of the curve (running time vs. input)
* Why: The general shape of the curve will tell you more about the comparative running time on larger inputs
* Why: Sometimes general shape is easier to compute
How do we model the shape of a curve?
* Scientist: Experimentally
* Better scientist or Mathematician: Analyze the program to see what shape curve you expect
Theory:
* If you take "worst" inputs
* And you look at big enough inputs that other factors don't come into play
* Then most algorithms have a natural "upper bound" curve
* Which may or may not be easy to find
How we estimate that running time, informally
* For iterative programs (no recursion):
+ Count any "basic action" (addition, multiplication, input, etc) as one step
+ For a conditional, take the more expensive of its two branches and add the cost of the comparison
+ For a loop: Multiply the worst number of repetitions by the cost of the body
+ For a function call: Do the same analysis of the function call
procedure selection-sort(array-of-ints numbers)
  for i = 0 to numbers.length - 1
    swap(numbers, i, indexOfSmallest(numbers, i))
procedure indexOfSmallest(array-of-ints numbers, int start)
  guess = start;
  for j = start + 1 to numbers.length - 1
    if numbers[j] < numbers[guess]
      guess = j;
  return guess;
How long does this take?
numbers.length repetitions * (1 for array index + 1 for swap + cost of indexOfSmallest)
indexOfSmallest (numbers.length repetitions of about 3 steps)
numbers.length * (2 + 3*numbers.length)
Because this is such a rough estimate, it's silly to pay attention to the constant addends and multipliers. The shape of the curve is N^2. The running time is O(numbers.length^2)
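The pseudocode above translates directly into a runnable sketch (Python here; the helper names follow the pseudocode):

```python
def index_of_smallest(numbers, start):
    """Return the index of the smallest element in numbers[start:].
    Scans numbers.length - start elements: cost grows linearly."""
    guess = start
    for j in range(start + 1, len(numbers)):
        if numbers[j] < numbers[guess]:
            guess = j
    return guess

def selection_sort(numbers):
    """In-place selection sort: for each position, swap in the
    smallest remaining value.  n passes, each scanning up to n
    elements, hence the n^2 shape."""
    for i in range(len(numbers)):
        small = index_of_smallest(numbers, i)
        numbers[i], numbers[small] = numbers[small], numbers[i]
    return numbers
```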
Let's talk about shape more formally. Big-O notation.
O(f(n)) is a set of functions that are "bounded above" by the shape of f(n) for large enough inputs
g(n) is in O(f(n)) if there exist n0 and c > 0 such that
|c*g(n)| <= f(n) for all n > n0
n0 is the "for big enough input"
c is "the shape of the curve matters, not the constant multiplier"
2*n^2 is in O(n^2)
Proof: Let c be 1/2 and n0 be 1
1/2 * 2 * n^2 <= n^2 for all n > 1
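The witnesses from the proof can be spot-checked numerically. This is only a sketch using the class's form of the definition (c multiplies g), and `in_big_o` is a made-up helper, a finite check rather than a proof:

```python
def in_big_o(g, f, c, n0, upto=1000):
    """Spot-check the class's definition: |c*g(n)| <= f(n)
    for all n0 < n <= upto.  Not a proof, just evidence."""
    return all(abs(c * g(n)) <= f(n) for n in range(n0 + 1, upto + 1))

# 2*n^2 is in O(n^2), with the witnesses from the proof: c = 1/2, n0 = 1
claim = in_big_o(lambda n: 2 * n * n, lambda n: n * n, 0.5, 1)
```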
How does this relate to running time?
We describe the running time of algorithms in terms of big-O notation.
The running time of selection sort (above) is in O(n^2)
Big-O notation is nice because it lets us
(1) Ignore constants
(2) Ignore smaller-order terms
Thm: If g(x) is d*f(x) (d > 0) then g(x) is in O(f(x))
Proof: Let c = 1/d, n0 = 0
c*g(x) = 1/d * g(x) = 1/d * d * f(x) = f(x) <= f(x)
Result: We never say that the running time of an algorithm is, say, O(3x).
We instead ignore the constant multiplier and say it's O(x)
Thm: if g(x) is in O(f(x)) then h(x) = f(x) + g(x) is in O(f(x))
Proof by nodding
Result: We never have to say the running time is O(n^2 + n) but rather
O(n^2)
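Following the same witness style, the sum theorem can also be spot-checked: with c = 1/2 and n0 = 1, (1/2)(n^2 + n) <= n^2 reduces to n <= n^2, true for all n >= 1:

```python
# Spot check (not a proof) of the witnesses c = 1/2, n0 = 1
# for n^2 + n being in O(n^2):
checked = all(0.5 * (n * n + n) <= n * n for n in range(2, 10001))
```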
Note: This is a *very* rough estimate.
How do we get more accurate?
(1) Big-Omega notation for lower bounds
(2) theta notation for "exact" bounds
(3) Start being more of a scientist
Note that you usually want to identify the closest possible bounding function.
If f(n) is in O(n^2) it is also in O(n^3) and O(n^100)
How do we analyze a recursive procedure? More or less the same thing:
When you count the steps for the recursive call, you're stuck with a recursive
definition of time.
(define selection-sort-numbers
  (lambda (vec)
    ;; kernel sorts vec in place, starting at position pos
    (let kernel ((pos 0))
      (if (< pos (vector-length vec))
          (begin
            ;; put the smallest remaining value at pos, then recurse (a tail call)
            (swap vec pos (index-of-smallest vec pos))
            (kernel (+ pos 1)))))))
Running time of the kernel on a vector of length n:
1 for conditional
1 for swap
n (length of vector) for index-of-smallest
cost of running kernel on n-1 values
Technique for analysis: Recurrence relations
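The cost above can be written as the recurrence T(n) = T(n-1) + n + c with T(0) = 0; unrolling it gives n(n+1)/2 + c*n, which is in O(n^2). A small Python sketch (c = 2 is an arbitrary stand-in for the constant work per call):

```python
def kernel_steps(n, c=2):
    """T(n) = T(n-1) + n + c, T(0) = 0: the kernel does about n
    steps of index-of-smallest plus constant work, then recurses
    on the remaining n-1 positions."""
    return 0 if n == 0 else kernel_steps(n - 1, c) + n + c

# Unrolling: T(n) = (1 + c) + (2 + c) + ... + (n + c) = n(n+1)/2 + c*n
closed_form_100 = 100 * 101 // 2 + 2 * 100
```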
Have a good day (woo hoo!)