CSC152 2004F, Class 25: Algorithm Analysis (2)

Notes:
* Questions on the homework?
* Email and present
* It's not due until Tuesday
* The UI group needs to write a story (reflect on how you typically interact with the user) and then reflect on what's happening in code behind the scenes
* Warning! Lots of math today.
* Eryn likes "random plan love"
* Eryn doesn't like her need for planlove being exhibited on the eboard

Overview:
* Status
* Example, revisited
* Formalizing these ideas: Asymptotic analysis
* Asymptotic analysis in practice
* Eliminating constants (if time)

/Status/

* Discussing comparative analysis of algorithms
* Decision: The algorithm that is faster "over the long haul" (on big inputs) is generally better
* We need to be able to predict how fast an algorithm will run.
* We care primarily about the "shape" of the curve of running time vs. input size
* Example: Two algorithms for computing x^y
  * The first (repeated multiplication in a loop) does y multiplications
  * The second (a strange recursion) seems to do only one more multiplication when we double the size of the exponent.
  * What function of y grows by one each time you double y?
  * Another way to look at it: The number of steps the second algorithm takes is "the number of times we have to divide y by 2 in order to get 1", which is log_2(y)
  * Log functions grow *very* slowly. We're happy when we have logarithmic functions as running times.

Reminder:
* We care more about shape than about constant multipliers
* We care primarily about large inputs

Time to write that mathematically:

O(g(n)) is a set of functions.

f(n) is in O(g(n)) if and only if
* There exist n0 and d > 0 such that
* For all n > n0, |f(n)| < |d*g(n)|

Think about the picture.
* We don't care about small inputs, because sometimes curves don't cross until the inputs are large.
  * Mathematicians write: Exists n0, For all n > n0
* We care about the shape of the curve, not the particular multiplier
  * Mathematicians write: Exists d > 0, and compare f(n) against d*g(n)

We'll use cool math to prove things about the notation, which will then make it easier to talk about the pictures.

Note that g(n) is an "upper bound" of f(n). As computer scientists evaluating running times, we look for "tight" upper bounds.

We will say "the running time of *this algorithm* is in O(*this function*)".

How do we figure out what "this function" is for most algorithms?
* Simple solution: Count (informally)
* Iterative algorithms:
  * Most operations take one step
  * Calls to other methods take however many steps those methods take
  * Loops: Count the steps in the body; count the worst number of repetitions; multiply the two
  * Conditionals: Count each part, take the worst, maybe add one for the test
* Recursive algorithms:
  * Math 218, Recurrence Relations (MAT335)
  * Write a function that describes running time in terms of itself.
  * Figure out the value of that function

Examples (informal):

  // Find the smallest value in an unsorted array
  public int smallest(int[] values) {
    // Assume the first is the smallest
    int smallest = values[0];                     // 1 step
    // Look at all remaining elements and update your assumption
    for (int i = 1; i < values.length; i++) {     // n repetitions
      if (smallest > values[i]) {                 // 1 step
        smallest = values[i];                     // 1 step
      }
    }
    // We're done
    return smallest;                              // 1 step
  } // smallest(int[])

Approximately n*2 + 2 steps, so O(n*2 + 2). The shape is O(n).
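The "strange recursion" for computing x^y mentioned earlier can be sketched as below. This is a hypothetical reconstruction using repeated squaring (the method name and the assumption of a nonnegative integer exponent are mine, not from the notes); each recursive call on an even exponent halves y, so we make about log_2(y) halving steps, each adding a constant number of multiplications.

```java
public class Power {
  // Compute x^y by repeated squaring.
  // Halving y each time means roughly log_2(y) levels of recursion,
  // each doing at most two multiplications: O(log y) overall.
  public static long power(long x, int y) {
    if (y == 0) {
      return 1;                    // x^0 = 1
    } else if (y % 2 == 0) {
      long half = power(x, y / 2); // one recursive call on y/2
      return half * half;          // one extra multiplication
    } else {
      return x * power(x, y - 1);  // odd case: peel off one factor
    }
  }
}
```

Compare: the loop version does y multiplications for power(2, 1000), while this version does only a few dozen.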
  // Selection strategy: Repeatedly select the smallest element
  // and swap it with the "current" element
  public void selectionSort(int[] values) {
    // For each position                            // N positions
    for (int i = 0; i < values.length; i++) {
      // Find the smallest value starting at that position   // N
      int smallpos = i;
      for (int j = i+1; j < values.length; j++) {   // N repetitions
        if (values[j] < values[smallpos]) {         // 1 step
          smallpos = j;                             // 1 step
        }
      } // for
      // Swap with that position                    // 3
      int tmp = values[i];
      values[i] = values[smallpos];
      values[smallpos] = tmp;
    } // for
  } // selectionSort(int[])

Approximately O(N*(N+1)), or O(N^2).

That seems to be a misleading analysis, because finding the smallest gets faster and faster:

  N + N-1 + N-2 + N-3 + ... + 3 + 2 + 1

What is that sum? N*(N+1)/2, which is also O(N^2).

Binary search:

  /**
   * Determine if target falls in the sorted array values.
   */
  public boolean binarySearch(int target, int[] values) {
    int lb = 0;                                     // 1 step
    int ub = values.length;                         // 1 step
    while (lb < ub) {                               // ? repetitions
      int mid = (lb + ub) / 2;                      // 1 step
      if (target == values[mid])                    // 1 step
        return true;
      else if (target < values[mid])                // 1 step
        ub = mid;  // search the left half (ub is exclusive)
      else                                          // 1 step
        lb = mid + 1;
    } // while
    return false;                                   // 1 step
  } // binarySearch(int, int[])

How many steps does this take? Approximately 3 + c*(# of repetitions), for a small constant c. The number of repetitions is O(log_2(n)), since each repetition halves the range from lb to ub. The algorithm is therefore in O(3 + c*log_2(n)), which is O(log_2(n)).

After break, we'll do lots of sorting algorithms and have fun analyzing them.

No more homework for today!
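The selection-sort sum can also be checked empirically. Here is a hypothetical instrumented copy of selectionSort (the method name and counter are mine) that counts element comparisons. Counting only the values[j] < values[smallpos] tests gives exactly (N-1) + (N-2) + ... + 1 = N*(N-1)/2 comparisons; the notes' rougher N*(N+1)/2 count includes the loop overhead, but both have the same O(N^2) shape.

```java
public class SelectionSortCount {
  // Selection sort, instrumented to count element comparisons.
  // For an array of length n, the inner loop runs (n-1-i) times
  // at outer index i, so the total is (n-1) + (n-2) + ... + 1
  // = n*(n-1)/2 comparisons.
  public static int countComparisons(int[] values) {
    int comparisons = 0;
    for (int i = 0; i < values.length; i++) {
      int smallpos = i;
      for (int j = i + 1; j < values.length; j++) {
        comparisons++;                      // one comparison per inner step
        if (values[j] < values[smallpos]) {
          smallpos = j;
        }
      }
      // Swap the smallest remaining value into position i
      int tmp = values[i];
      values[i] = values[smallpos];
      values[smallpos] = tmp;
    }
    return comparisons;
  }
}
```

For n = 5 this reports 4+3+2+1 = 10 comparisons regardless of the input order, matching n*(n-1)/2.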