CSC 152 2005F, Class 34: Algorithm Analysis (1) Admin: * Walker observes * About CSC201 * Homework: Answer the questions at the end of today's outline ("I'm really not sure" is acceptable, but only after a few minutes of thought.) ("I'm a biologist and there's some good evidence that biologists don't even understand algebra.") Overview: * Comparing Algorithms * And some difficulties * Informal Analysis * Experimental analysis: Let's pretend we're a science Comparing Algorithms * Single problems admit large numbers of algorithms * Which of these N solutions do I choose? * And why? * To compute x^n + Repeatedly multiply by x (n times) + Recursion, splitting the expontent in half when even + x^n e^(lnx * n) * To find a particular value in a list/vector + Sequential search for (i = 0 ; i < names.length ; i++) { if names.get(i).equals("Eryn") return "Yay"; } + Binary search + If the Vector is ordered (smallest to largest) + Like a tree: Look in the middle If what we see matches, stop "Yay" If what we see is too large, look in the left half If what we see is too small, look in the right half * To compute the Nth Fibonacci number + Apply the formula fib(n) = fib(n-1) + fib(n-2) + Apply the formula, but cache values in a vector * Delete all elements from a vector, checking whether each meets some criterion and printing it if it does + int len = vec.length(); for (int i = 0; i < len; i++) { String foo = vec.remove(0); if (check(foo)) { pen.println(....); } } + for (int i = vec.length() - 1; i >= 0; i--) { String foo = vec.remove(i); if (check(foo)) { ... } } How do we decide which solution to use to each problem? * Accuracy/Correctness - Gives the result that it should * Running time: Choose the faster algorithm * Other attributes, such as memory usage * Ease of use * Versatility: Is it adaptable (automatically) to new situations * Ease of implementation Our primary analysis: Running time How do we analyze for running time? * Count the steps * Two "steps" may not take the same amount of time Multiplication vs. addition * Need to know steps for subroutines/other methods * Counting is sometimes hard * Number of steps is not consistent for the same input size * Conditionals affect running time Normal CS strategy: Approximate and bound * Goal: Find the "shape" of a curve that provides upper bound on running time First technique: Count * Each basic operation: 1 * Sequence of operations: Sum cost * Loop: Multiply the cost of the body by the number of reps * Conditional: Assume worst case (test plus worst of two branches) * Subroutines: Use analysis of those subroutines After counting, throw away constants Example, sequential search int i = 0; // 1 while // n repetitions n = names.length (i < names.length()) { // 1 + 1 if names.get(i).equals("Eryn") // 1 + 1 + 1 return "Yay"; // 1 i++; // 1 } Approximately: 1 + n * 7 steps It's really c + d*n + int n = vec.length(); // 1 for (int i = 0; i < n ; i++) { // n repetitions // 3 steps String foo = vec.remove(0); // n steps if (check(foo)) { // 1 step pen.println(....); // 1 step } } Approximately 1 + n(5 + n) steps Or c + dn + en^2 + for (int i = vec.length() - 1; i >= 0; i--) { String foo = vec.remove(i); if (check(foo)) { ... } } (define sum (lambda (lst) (if (null? lst) 0 (+ (car lst) (sum (cdr lst)))))) (define sum (lambda (lst) (if (null? lst) 0 (if (all-numbers? lst) (+ (car lst) (sum (cdr lst))) (error "I can't sum lists with non numbers"))))) Experimental analysis * Annotate your implementation to count steps * Compare to the formula * If the formula and the data points match * Your analysis is likely to be correct * If the formula and the data points don't match New homework * Implement the two versions of remove-all * Figure a way to time them compratively (wall clock, etc.)