CSC152 2004F, Class 26: Algorithm Analysis (3)

Admin:
* Today concludes our initial explorations of algorithm analysis.  We
  will apply these techniques to particular algorithms after break.
* Reminder: Class tomorrow focuses on design.
* Reminder: No class Wednesday or Friday.
* Math/CS lunch table: Cowles PDR North.
* Russ Motta story and comments for you folks:
  * "Oh shit, what did we do wrong now?"
  * "Good luck, I'm thinking about you."

Overview:
* Review
* Experimental analysis
* Using the notation
* Eliminating constants
* Otherwise simplifying the "g"
* Analyzing the running time of recursive functions

/Review/

* Goal: Analyze the running time of algorithms, focusing on the shape
  of the curve.
* Mathematical notation, "big O":
    f(n) is in O(g(n))
    (informally, f(n) is "bounded above" by g(n))
    if and only if
    there exist n0 and d such that |f(n)| <= |d*g(n)| for all n > n0
  * The n0 serves as "for big enough inputs".
  * The d serves as "we don't care about multipliers, we only care
    about shape".
  * The f is "running time".
  * The n is "input size".

/Using big-O notation/

* Analyze (usually by hand).
  * The algorithm with the better big-O running time is the better
    choice for big inputs.
* Experiment!  (A rough timing sketch appears at the end of these
  notes.)
  * Verify that the curve is shaped the way we expect it to be.
  * If two algorithms have the same big-O running time, we may want
    to compare the "hidden" multipliers.

/Applying big-O notation to simplify the g/

* Formalizations let us be precise.
* Formalizations let us show that the simplifications we use are
  "valid" (within the formalization).
* Claim: if f(n) is in O(c*g(n)), then f(n) is in O(g(n)).
* Claim, restated: O(g(n)) = O(c*g(n)) for any c > 0.
* Why would this claim be useful?
  * It simplifies "big" functions.  In particular, we don't have to
    write the constant multiplier (we can write O(n) rather than
    O(5n)).
* Proof of claim:
  * How do we prove that if f(n) is in O(c*g(n)) then f(n) is in
    O(g(n))?  Using the definition.
  * f(n) is in O(c*g(n)) means there exist d0 and n0 such that for
    all n > n0, |f(n)| <= |d0*c*g(n)|.
  * Goal: Find d1 and n1 such that for all n > n1,
    |f(n)| <= |d1*g(n)|.
  * Let n1 = n0 and let d1 = d0*c.  Then for all n > n1,
    |f(n)| <= |d0*c*g(n)| = |d1*g(n)|.  Done.
* Question: is f(n) = 100n in O(n^2)?
  * Yes.  Proof: Let n0 be 100 and let d be 1.  For all n > 100,
    g(n) = n*n >= 100*n = f(n).  Therefore |f(n)| <= |d*g(n)|.
* Question: is n^2 in O(100n)?
  * No: When n is big enough, n^2 dominates.  (Proof below.)
* Variations on O(g(n)):
  * Omega(g(n)): "bounded below"; like O, except |f(n)| >= |d*g(n)|.
  * Theta(g(n)): "bounded exactly"; |d0*g(n)| <= |f(n)| <= |d1*g(n)|.
  * Note that there are many problems for which we can figure out an
    upper bound, but we don't know a good lower bound and certainly
    don't know a tight bound.
  * Example: The travelling salescritter problem.
    * TS has to visit N cities.
    * TS knows the highway distance between any pair of cities.
    * TS wants to find the shortest route that goes through all the
      cities.
    * Naive solution: Find all routes through the cities and choose
      the shortest.
    * Number of possible routes: N * (N-1) * (N-2) * ... * 3 * 2 * 1,
      that is, N!.  That's a lot of routes.  (A quick computation
      below makes this concrete.)
    * The best algorithm we know of is in O(N!).
    * Question: is there an O(N^100) algorithm?  No one knows!
    * This is an example of a case in which we can identify a big-O
      bound, but not a Theta bound.
* Back to the question: is n^2 in O(100n)?
  * Prove that it is not!  Proof by contradiction: assume it is in
    O(100n) and find a contradiction.
  * If n^2 is in O(100n), then there exist n0 and d such that
    |n^2| <= |d*100*n| for all n > n0.
  * Hmm ... what if n is 101*d?  Then
      n^2 = 101*d * 101*d  and  d*100*n = d*100 * 101*d,
    so n^2 is LARGER.  But the claim says it's supposed to be at most
    as large.
  * What if n0 > 101*d?  Same problem: just pick an n bigger than
    both n0 and 101*d.
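To make that crossing point concrete, here is a quick check in Scheme
(a throwaway sketch; `first-crossing` is our own name, not course
code).  Given d, it finds the first n at which n^2 exceeds d*100*n.

    ;; Find the smallest n with n^2 > d*100*n.  Any such n (and
    ;; every larger one) contradicts |n^2| <= |d*100*n|.
    (define first-crossing
      (lambda (d)
        (let loop ((n 1))
          (if (> (* n n) (* d 100 n))
              n
              (loop (+ n 1))))))

    ;; (first-crossing 1) => 101
    ;; (first-crossing 5) => 501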
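And, as promised, the salescritter route count.  A one-definition
factorial (again, a sketch of our own) shows how quickly N! blows up;
Scheme's exact integers keep the results precise.

    ;; Number of possible routes through n cities:
    ;; n * (n-1) * ... * 2 * 1.
    (define routes
      (lambda (n)
        (if (<= n 1)
            1
            (* n (routes (- n 1))))))

    ;; (routes 5)  => 120
    ;; (routes 10) => 3628800
    ;; (routes 20) => 2432902008176640000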
* The mathematicians want a more careful analysis.  The computer
  erased it.  Sorry.  It was an elegant proof by induction.
* O(n) is a proper subset of O(n^2).
* Yay!  Constants can be eliminated, and the bounds make sense.

Other interesting things:
* You can throw away "lower-order terms":
  * O(n^2 + n) = O(n^2)
  * O(n + log_2(n)) = O(n)
  * O(log_2(n) + 5) = O(log_2(n))
* Big O is transitive:
    if f(n) is in O(g(n)) and g(n) is in O(h(n)),
    then f(n) is in O(h(n)).

Most important things:
* Throw away constant multipliers and lower-order terms.

Typical running times:
* O(1): Constant; independent of input size.
  * Built-in operations.
  * Finding the second element of a Scheme list.  (Sketched at the
    end of these notes.)
* O(log_2(n)): Logarithmic.  E.g., binary search.
* O(n): Linear.  E.g., smallest.  (Also sketched at the end.)
* O(n*log_2(n)): "n log n".  E.g., merge sort.
* O(n^2): Quadratic.  E.g., insertion sort.
* O(2^n): Exponential.
* O(n!): Factorial; grows even faster than exponential.

See you tomorrow.
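Two entries from the table above, sketched in Scheme.  These are
minimal definitions of our own (not course library code), and
`smallest` assumes a non-empty list.

    ;; O(1): the second element of a list.  Two pointer-follows,
    ;; no matter how long the list is.
    (define second-element
      (lambda (lst)
        (car (cdr lst))))

    ;; O(n): smallest.  One pass over the list, constant work per
    ;; element.
    (define smallest
      (lambda (lst)
        (if (null? (cdr lst))
            (car lst)
            (min (car lst) (smallest (cdr lst))))))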
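Finally, a rough harness for the "Experiment!" step mentioned above.
This is a sketch only: it assumes PLT Scheme's `current-milliseconds`
(other Schemes name their clock differently), and `time-thunk` and
`big-list` are names we made up.

    ;; Time a zero-argument procedure, in elapsed milliseconds.
    ;; Assumes `current-milliseconds` (PLT Scheme); substitute your
    ;; implementation's clock procedure if you use another Scheme.
    (define time-thunk
      (lambda (thunk)
        (let ((start (current-milliseconds)))
          (thunk)
          (- (current-milliseconds) start))))

    ;; Example: time `smallest` (above) on lists of doubling length
    ;; and check that the measured times roughly double, too -- the
    ;; linear "shape" we predicted.
    ;; (time-thunk (lambda () (smallest big-list)))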