CSC152 2004F, Class 26: Algorithm Analysis (3)
Admin:
* Today concludes our initial explorations of algorithm analysis. We will apply these techniques to particular algorithms after break.
* Reminder: Class tomorrow focuses on design.
* Reminder: No class Wednesday or Friday.
* Math/CS lunch table: Cowles PDR North.
* Russ Motta story and comments for you folks:
* "Oh shit, what did we do wrong now?"
* "Good luck, I'm thinking about you."
Overview:
* Review
* Experimental analysis
* Using the notation
* Eliminating constants
* Otherwise simplifying the "g"
* Analyzing running time of recursive functions
/Review/
* Goal: Analyze running time for algorithms, focusing on the shape of the curve
* Mathematical notation "big O"
f(n) is in O(g(n)) (informally, f(n) is "bounded above" by g(n)) if and only if there exist an n0 and a d such that
|f(n)| <= |d*g(n)| for all n > n0
* The n0 serves as "for big enough inputs"
* The d serves as "we don't care about multipliers, we only care about shape"
* The f is "running time"
* The n is "input size"
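The definition can be made concrete by checking it numerically for one hypothetical pair of functions; the f, g, and witnesses d and n0 below are invented for illustration, not taken from the notes:

```python
# Check the big-O definition for one concrete pair of functions.
# Hypothetical claim: f(n) = 3n + 10 is in O(n), witnessed by d = 4, n0 = 10.
def f(n):
    return 3 * n + 10

def g(n):
    return n

d, n0 = 4, 10

# |f(n)| <= |d*g(n)| must hold for every n > n0 (spot-checked up to 10000).
assert all(abs(f(n)) <= abs(d * g(n)) for n in range(n0 + 1, 10_000))
print("bound holds for all checked n >", n0)
```

A spot check like this cannot prove the bound, but a single failing n would disprove a candidate (d, n0) pair.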
/Using big-O notation/
* Analyze (usually by hand)
* Algorithm with better big-O running time is better to use for big inputs.
* Experiment!
* Verify that the curve is shaped like we expect it to be.
* If two algorithms have the same big-O running time, we may want to compare the "hidden" multipliers
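One way to run such an experiment is to time an algorithm at doubling input sizes and watch whether the times scale the way the predicted curve says. A minimal sketch (the sizes and repetition count are illustrative; `smallest` here is a linear scan for the minimum, so successive times should roughly double):

```python
# Experimental check of curve shape: time an O(n) operation at doubling
# input sizes; the ratio between successive times should approach 2.
import timeit

def smallest(lst):
    """Linear scan for the minimum -- an O(n) algorithm."""
    best = lst[0]
    for x in lst[1:]:
        if x < best:
            best = x
    return best

for n in (10_000, 20_000, 40_000):
    data = list(range(n, 0, -1))
    t = timeit.timeit(lambda: smallest(data), number=20)
    print(f"n = {n:6d}: {t:.4f} s")
```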
/Applying big-O notation to simplify the g/
* Formalizations let us be precise.
* Formalizations let us show that the simplifications we use are "valid" (within the formalization).
* Claim: if f(n) is O(c*g(n)) then f(n) is O(g(n))
* Claim, restated: O(g(n)) = O(c*g(n)) for any c > 0
* Why would this claim be useful?
* Simplify "big" functions. In particular, we don't have to write the constant multiplier (we can write O(n) rather than O(5n))
* Proof of claim:
* How do we prove that if f(n) is in O(c*g(n)) then f(n) is in O(g(n))? Using the definition.
* f(n) is in O(c*g(n)) means there exist d0 and n0 s.t. for all n > n0, |f(n)| <= |d0*c*g(n)|
* Goal: To find d1 and n1 s.t. for all n > n1, |f(n)| <= |d1*g(n)|
* Let n1 = n0 and d1 = d0*c. Then for all n > n1, |f(n)| <= |d0*c*g(n)| = |d1*g(n)|, which is exactly the goal.
* Example: f(n) = 100n is in O(n^2)
* Proof: Let n0 be 100, let d be 1
For all n > 100: f(n) = 100*n < n*n = g(n)
Therefore |f(n)| <= |d*g(n)| for all n > n0
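The witnesses in that proof can be sanity-checked numerically; a quick spot check, not a substitute for the proof:

```python
# Claim: f(n) = 100n is in O(n^2), witnessed by n0 = 100, d = 1.
# For every n > 100 we need |100n| <= |1 * n*n| (spot-checked up to 10000).
assert all(100 * n <= n * n for n in range(101, 10_000))

# The "big enough" threshold matters: the bound fails below n0.
assert 100 * 50 > 50 * 50
print("100n <= n^2 for all checked n > 100")
```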
* Variations of big-O
* Omega(n) "bounded below", like O except |f(n)| >= |d*g(n)|
* Theta(n) "bounded exactly", |d0*g(n)| <= |f(n)| <= |d1*g(n)|
* Note that there are many problems for which we can figure out an upper bound, but we don't know a good lower bound and certainly don't know a tight bound.
* Example: The travelling salescritter problem.
TS has to visit N cities
TS knows the highway distance between any pair of cities
TS wants to find the shortest route that goes through all the cities
Naive solution: Find all routes through the cities and choose the shortest
Number of possible routes: N * (N-1) * (N-2) * ... * 3 * 2 * 1 (N!)
That's a lot of routes.
The best algorithm's running time is in O(N!) (the naive algorithm achieves that bound)
Question: is there an O(N^100) algorithm?
No one knows!
* This is an example of a case in which we can identify big-O, but not Theta
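The naive route-enumeration approach described above can be sketched as follows; the four-city distance table is made up for illustration:

```python
# Brute-force traveling salescritter: try every ordering of the cities
# and keep the cheapest round trip. Distances below are invented.
from itertools import permutations

def shortest_tour(dist):
    """dist[i][j] = highway distance between cities i and j."""
    n = len(dist)
    best_route, best_cost = None, float("inf")
    # Fix city 0 as the start; permute the rest -- (n-1)! candidate routes.
    for rest in permutations(range(1, n)):
        route = (0,) + rest
        cost = sum(dist[route[i]][route[i + 1]] for i in range(n - 1))
        cost += dist[route[-1]][route[0]]  # return home
        if cost < best_cost:
            best_route, best_cost = route, cost
    return best_route, best_cost

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
print(shortest_tour(dist))  # -> ((0, 1, 3, 2), 18)
```

The loop body is cheap, but the number of iterations grows factorially, which is why this approach is hopeless beyond a handful of cities.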
* Question: is n^2 in O(100n)?
* No: When n is big enough, n^2 dominates
* Prove it! Proof by contradiction. Assume it is in O(100n). Find a contradiction.
* If n^2 is in O(100n) then there exist n0, d s.t. |n^2| <= |d*100*n| for all n > n0
* Hmm .. what if n is 101*d (picking n bigger than n0)?
n^2 = 101*d*101*d = 10201*d*d
d*100*n = d*100*101*d = 10100*d*d
So n^2 is LARGER! But the claim said that it's supposed to be smaller.
* What if n0 is > 101*d? Pick any n > n0 instead; the same problem appears, since n*n > 100*d*n whenever n > 100*d.
* The mathematicians want more careful analysis. The computer erased it. Sorry. It was an elegant proof by induction.
O(n) is a proper subset of O(n^2)
* Yay! Constants can be eliminated and the bounds make sense.
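The contradiction witness can also be checked numerically: whatever constant d the claim supplies, n = 101*d already violates the supposed bound.

```python
# For each candidate d, show that n = 101*d breaks |n^2| <= |d*100*n|:
# n^2 = 10201*d^2 while d*100*n = 10100*d^2.
for d in (1, 2, 5, 100):
    n = 101 * d
    assert n * n > d * 100 * n
print("no d can witness n^2 in O(100n)")
```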
Other interesting things:
* You can throw away "lower-order terms"
* O(n^2 + n) = O(n^2)
* O(n + log_2(n)) = O(n)
* O(log_2(n) + 5) = O(log_2(n))
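A quick numeric illustration of why lower-order terms drop out: the ratio (n^2 + n) / n^2 = 1 + 1/n heads to 1 as n grows, so the quadratic term alone fixes the curve's shape.

```python
# The lower-order term n becomes negligible next to n^2 as n grows.
for n in (10, 1000, 100_000):
    print(n, (n * n + n) / (n * n))
```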
* big O is transitive
if f(n) is in O(g(n)) and g(n) is in O(h(n)) then f(n) is in O(h(n))
Most important things:
* throw away constant multipliers and lower-order terms
Typical running times:
* O(1): Independent of input size
* Built-in operations
* Find the second element of a Scheme list
* O(log_2(n)): Logarithmic, E.g., binary search
* O(n): Linear, E.g., finding the smallest element of a list
* O(n*log_2(n)): nlogn, E.g., mergesort
* O(n^2): Quadratic, E.g., insertion sort
* O(2^n): Exponential
* O(n!): Factorial; grows even faster than exponential
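The typical shapes can be compared side by side by printing raw step counts at a few input sizes (2^n and n! are shown as digit counts, since the numbers themselves get absurd quickly; the sizes chosen are illustrative):

```python
# Step counts for each typical running-time shape at a few input sizes.
from math import log2, factorial

print(f"{'n':>6} {'log2 n':>8} {'n log2 n':>10} {'n^2':>10} "
      f"{'2^n digits':>11} {'n! digits':>10}")
for n in (10, 100, 1000):
    print(f"{n:>6} {log2(n):>8.1f} {n * log2(n):>10.0f} {n * n:>10} "
          f"{len(str(2 ** n)):>11} {len(str(factorial(n))):>10}")
```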
See you tomorrow.