CSC 207.02 2019S, Class 17: Analyzing recursive algorithms
Overview
- Preliminaries
- Notes and news
- Upcoming work
- Extra credit
- Questions
- Quiz
- Big-O, revisited
- Iterative analysis, revisited
- Recurrence relations
- Approaches to recurrence relations
Preliminaries
News / Etc.
- Welcome to any prospective students we have. Thank you for bringing
warmer weather with you.
- I’m back! I hope that you had a good time without me. I apologize
for the inconsistency in communication.
- I brought you conference swag. (One of each item per person.)
- Blake says “Be proud that you are able to think technically and talk
  about it.”
Upcoming work
- Assignment 5 due Tuesday night.
- Exam 1 to be distributed in concrete form tonight. Sorry for the delay
in getting it out.
- Prologue due Thursday night
- Exam due the following Thursday.
- Reading for Wednesday:
Anonymous functions
(to be posted tonight)
- Lab writeup: [None]
- March 8-10 (7:30 7:30 2:00), Twelfth Night. Box office opens today
at noon.
- Grinnell Singers March 10 at 2pm.
- 30 Minutes of Mindfulness at SHACS every Monday 4:15-4:45
- Any organized exercise. (See previous eboards for a list.)
- 60 minutes of some solitary self-care activities that are unrelated to
academics or work. Your email reflection must explain how
the activity contributed to your wellness.
- 60 minutes of some shared self-care activity with friends. Your email
reflection must explain how the activity contributed to your wellness.
Other good things
- Environmental talk tonight at 7:30 in Noyce 2021. Sounds really cool.
(Appropriate for this weather.)
Questions
What’s the problem with the linear average algorithm?
Potential overflow!
When will we get assignments back?
Soon, I hope, except for the evil assignment (which you all got
25 on).
Sam will push on the graders. (Or maybe Sam will push on Sam.)
Quiz
Joy and fun, maybe.
Big-O analysis, revisited
What is Big O and why do we use it?
- A way to analyze programs in terms of how long they take
(or how much memory they use)
- A way to classify functions (e.g., linear, exponential)
E.g. A function in O(n) is “linear”.
- We write a function that models how long our algorithm takes
(how much memory it uses)
- We might want to compare our model to actual experiments.
- Big-Oh notation is used to provide upper bounds on functions
- Using a formally defined mechanism
- That lets us describe the overall “shape” of the bound of a function.
Formal definition
- f(n) is in O(g(n)) iff there exist c > 0 and n0 > 0 s.t. for all n > n0,
  f(n) <= c*g(n).
- <= indicates the upper bound
- n > n0, “for sufficiently large n”. Compare 10000000*n vs n^2/100.
  For small n, the quadratic is actually smaller (so you might prefer
  that algorithm), but once n is big enough, n^2/100 dominates.
- c > 0, “we don’t care about constant multipliers; we care primarily
  about the overall shape”
- Most of our analyses do not carefully distinguish between the various
  costs of “constant time” operations (e.g., addition, multiplication,
  and function calls are all “1 unit”)
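To make the definition concrete, here is a quick numeric sketch (the class, function, and method names are made up for illustration) that checks f(n) <= c*g(n) for the example from class, f(n) = 10000000*n and g(n) = n^2/100, using c = 1 and n0 = 10^9:

```java
// Sketch: checking the Big-O definition numerically for
// f(n) = 10000000*n and g(n) = n^2/100.
// With c = 1 and n0 = 10^9, f(n) <= c*g(n) for all n >= n0.
public class BigODemo {
    static long f(long n) { return 10000000L * n; }

    static long g(long n) { return n * n / 100; }

    // Does f(n) <= c*g(n) hold at this particular n?
    static boolean boundHolds(long c, long n) {
        return f(n) <= c * g(n);
    }

    public static void main(String[] args) {
        long n0 = 1_000_000_000L;
        // Below the crossover, the "worse-shaped" function is smaller.
        System.out.println(boundHolds(1, 1000));   // false: f(n) is bigger
        System.out.println(boundHolds(1, n0));     // true: the crossover point
        System.out.println(boundHolds(1, 2 * n0)); // true: g dominates from here on
    }
}
```

Spot-checking a few values like this doesn't prove the bound, but it shows where the crossover sits and why n0 appears in the definition.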
The formal definition of Big O lets us prove a variety of important
properties of the notation.
- If f(n) is in O(g(n)) and g(n) is in O(h(n)), f(n) is in O(h(n))
- 5*n^2 is in O(n^2); it is also in O(n^44)
- We try to pick the tightest bound possible.
- How do we prove this?
- Set theory.
- Given c and n0 for the first claim and d and n1 for the second,
  come up with constants C and N0 for the third. E.g., C = c*d and
  N0 = max(n0, n1)
- For n > N0, f(n) <= c*g(n) and g(n) <= d*h(n), so c*g(n) <= c*d*h(n)
- Then f(n) <= C*h(n) by transitivity of <=.
- If f(n) is in O(g(n)) then f(n)+g(n) is in O(g(n))
- You can throw away lower-order terms.
- E.g., 5n+n^2 is in O(n^2)
- c*f(n) is in O(f(n))
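The "throw away lower-order terms" property can be checked the same way. A small sketch (names are made up for illustration) confirming that 5n + n^2 <= 2n^2 once n >= 5, i.e., c = 2 and n0 = 5 witness that 5n + n^2 is in O(n^2):

```java
// Sketch: 5n + n^2 is in O(n^2), witnessed by c = 2 and n0 = 5.
public class LowerOrderTerms {
    static long f(long n) { return 5 * n + n * n; } // 5n + n^2

    static long bound(long n) { return 2 * n * n; } // c*g(n) with c = 2

    public static void main(String[] args) {
        // For every n >= 5, 5n <= n^2, so 5n + n^2 <= 2n^2.
        for (long n = 5; n <= 1_000_000; n *= 10) {
            System.out.println(n + ": " + (f(n) <= bound(n)));
        }
    }
}
```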
Iterative analysis, revisited
Normal techniques for bounding algorithms.
- Take a structure, have a rule for bounding that structure.
Bound on a sequence of steps is the sum of the bounds of the individual
steps.
a[0] = 1 // 1 step
a[0] = largest(a) // n steps
Bound on a for loop E.g.,
- Count the number of times the loop executes
- Count the cost of the body of the loop
- Multiply the two
selection_sort(int[] a) {
  for (int i = 0; i < a.length; i++) {
    swap(a, i, index_of_smallest(a, i));
  }
}
// Find the location of the smallest element in the array,
// looking starting at start.
int index_of_smallest(int[] a, int start) {
  ...
}
- This loop executes n times
- In the body, we compare (hidden) [1], increment (hidden) [1],
swap [1], and compute the index of the smallest [n]
- The running time of this algorithm is n(3+n) = 3n + n^2, which is in O(n^2)
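The sketch above can be filled in as runnable Java (the class name and the decision to pass the array to swap are my own; the structure follows the board):

```java
import java.util.Arrays;

public class SelectionSort {
    // Find the index of the smallest element in a, looking starting
    // at start. Cost: about (a.length - start) comparisons, i.e., O(n).
    static int indexOfSmallest(int[] a, int start) {
        int smallest = start;
        for (int i = start + 1; i < a.length; i++) {
            if (a[i] < a[smallest]) {
                smallest = i;
            }
        }
        return smallest;
    }

    // The loop runs n times and each body does O(n) work in
    // indexOfSmallest, so the whole sort is O(n^2).
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length; i++) {
            int s = indexOfSmallest(a, i);
            int tmp = a[i];
            a[i] = a[s];
            a[s] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = { 5, 2, 9, 1, 7 };
        selectionSort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 5, 7, 9]
    }
}
```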
What is the cost of a conditional?
if (test) {
  consequent;
} else {
  alternate;
}
if (test) {
  x = x+1;
} else {
  for (int i = 0; i < number_of_atoms_in_the_universe; i++) {
  }
}
Cost is cost of test + max of cost of consequent and alternative
You can do a lot of analysis like this, but sometimes it’s helpful
to unroll your loops.
for (int i = n; i > 1; i = i/2) {
  a[i] = smallest(a, i); // smallest looks at positions 0 ... i
} // for
- Number of repetitions: log_2(n) (also written as log(n) or logn)
- Cost per repetition: n
- Product: O(nlogn)
Unroll the loop
- First iteration: n
- Second iteration: n/2
- Third iteration: n/4
- Fourth iteration: n/8
- Kth iteration: n/2^(k-1)
What is n + n/2 + n/4 + n/8 + ... + n/2^k (or what’s a bound on it?)
- k=0: n
- k=1: n + n/2 = 3n/2
- k=2: n + n/2 + n/4 = 7n/4
- k=3: n + n/2 + n/4 + n/8 = 15n/8
- k=4: n + n/2 + n/4 + n/8 + n/16 = 31n/16
- k=5: n + n/2 + n/4 + n/8 + n/16 + n/32 = 63n/32
- General: 2n - n/2^k, approaches 2n
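The pattern above is easy to check by just computing the sum (class and method names here are made up for illustration):

```java
public class HalvingSum {
    // Sum n + n/2 + n/4 + ... + n/2^k, using doubles so the halves
    // aren't truncated by integer division.
    static double halvingSum(double n, int k) {
        double total = 0;
        for (int i = 0; i <= k; i++) {
            total += n / Math.pow(2, i);
        }
        return total;
    }

    public static void main(String[] args) {
        double n = 1024;
        for (int k = 0; k <= 10; k++) {
            System.out.println("k=" + k + ": " + halvingSum(n, k));
        }
        // The totals march toward 2n = 2048 but never reach it:
        // each total is 2n - n/2^k.
    }
}
```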
Notes
- We just concluded it’s in O(n)
- We previously concluded it’s in O(nlogn)
- The O(n) is a better (tighter) bound.
Recurrence relations
As computer scientists, we often write recursive algorithms.
merge_sort(A) {
  if (A.length <= 1) {
    // Do nothing, it's sorted
  } else {
    split array into two new subarrays A1 and A2
    A1 = merge_sort(A1)
    A2 = merge_sort(A2)
    combine them back together
  }
}
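One way to turn that sketch into runnable Java (the class name and the use of Arrays.copyOfRange for the split are my own choices; the structure follows the pseudocode):

```java
import java.util.Arrays;

public class MergeSort {
    // Split, recurse on the two halves, then merge the sorted halves.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) {
            return a; // already sorted
        }
        int mid = a.length / 2;
        int[] a1 = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] a2 = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(a1, a2);
    }

    // Combine two sorted arrays into one sorted array: O(n) work.
    static int[] merge(int[] a1, int[] a2) {
        int[] result = new int[a1.length + a2.length];
        int i = 0, j = 0, k = 0;
        while (i < a1.length && j < a2.length) {
            result[k++] = (a1[i] <= a2[j]) ? a1[i++] : a2[j++];
        }
        while (i < a1.length) { result[k++] = a1[i++]; }
        while (j < a2.length) { result[k++] = a2[j++]; }
        return result;
    }

    public static void main(String[] args) {
        int[] a = { 8, 3, 5, 1, 9, 2 };
        System.out.println(Arrays.toString(mergeSort(a))); // [1, 2, 3, 5, 8, 9]
    }
}
```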
To analyze this algorithm, we’ll invent a function, T(n), that represents
the running time
- T(n) = 1 (for test) + n (to split) + T(n/2) + T(n/2) + n (to merge)
- T(n) = 1 + 2n + 2(T(n/2))
- T(1) = 1
- T(0) = 1
How do we figure out what functions this rule indicates? How do we find
the closed form of this recursive formulation?
- T(n) = 1 + 2n + 2(T(n/2))
Approaches to recurrence relations
Let’s try a slightly simpler one. We’ll try repeated expansion
- T(n) = n + 2*T(n/2)
- T(n/2) = n/2 + 2*T(n/4)
- T(n/4) = n/4 + 2*T(n/8)
Following the steps
- T(n) = n + 2*T(n/2)
- T(n) = n + 2(n/2 + 2*T(n/4))
- T(n) = n + n + 4*T(n/4)
- T(n) = 2n + 4*T(n/4)
- T(n) = 2n + 4(n/4 + 2T(n/8))
- T(n) = 2n + n + 8T(n/8) = 3n + 8T(n/8)
- T(n) = kn + 2^k*T(n/2^k)
- When k is logn, 2^k is n. T(n) = nlogn + n*T(1) = nlogn + n, which is
  in O(nlogn)
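The expansion can be double-checked by computing the recurrence directly and comparing it against the closed form, at least for powers of two (class and method names are made up for illustration):

```java
public class Recurrence {
    // The simpler recurrence from above: T(n) = n + 2*T(n/2), T(1) = 1,
    // computed directly (for n a power of two).
    static long t(long n) {
        if (n <= 1) {
            return 1;
        }
        return n + 2 * t(n / 2);
    }

    // The closed form from repeated expansion: n*log2(n) + n.
    static long closedForm(long n) {
        long log = 0;
        for (long m = n; m > 1; m /= 2) { log++; } // floor(log2(n))
        return n * log + n;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println("T(" + n + ") = " + t(n)
                + ", n*log(n) + n = " + closedForm(n));
        }
        // The two columns agree, supporting the O(nlogn) conclusion.
    }
}
```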