Approximate overview
If we do the Scheme/Racket version of HW5, what should we do?
You need to rewrite
node-insert!so that it rebalances after inserting. You need to add akthprocedure, that finds thekthvalue you would see in left-to-right inorder. You need to check the heights for various input sizes.
Tell me more about kth.
As you know, we can number the nodes in the tree according to a depth-first, left-to-right, in-order traveral. kth gives the node that would be numbered k in such a traversal.
Your kth shoul be O(height), which for balanced trees is O(logn)
Any hints on rewriting node-insert!?
There’s a “cleanup” procedure (
node-update-attributes!) that gets called after nodes are inserted in the left or right subtree. You could have that check for imbalance at that node and rebalance.
Should we just write a separate “balanced-node-insert!” procedure?
Great idea! Thanks for the suggestion. Anything that lets you compare is fine.
If our tree is a BST, does kth give the kth-smallest element?
Yes.
Why did you give us a preorder traversal if inorder is the important thing?
Um. I’m not sure. Lack of sleep?
How do I rebalance?
See Knuth and/or your notes from Friday.
What about the kth element?
See Knuth.
What happens if I use more tokens than I have?
You incur a token debt.
What happens if I have a token debt at the end of the semester?
Your grade decreases.
How much?
You don’t want to find out.
None of the sorting algorithms we traditionally use is better than O(nlogn).
As an algorithm designer, you should have at least three followup questions:
We’ll do the latter two, as strange as that is. Better algorithms will be next class. We’ll start with the theorem today.
Theorem: All comparison-based sorting algorithms have inputs that require Omega(nlogn) running time.
Proof? You get to consider how to approach it.
Claim: Every comparison-based sorting algoirthm for a particular size can be represented by a decision tree.
The tree for an input of size n must have a different leaf for each permutation of [1,2,…,n]. There are n! permutations. By logic!
There are n choices for the first slot. n-1 choices for the next slot, n-2 choices for the next slot. Etc. So n*n-1*n-2*…*2*1 = n!.
Proof of lemma. By contradiction.
Suppose there were fewer leaves than permutations.
There exists a leaf that is used for two separate inputs. (Pigeonhole principle.)
Two separate inputs will differ in at least two positions.
But the leaf must permute both of them the same.
So if we put smaller in the first position and larger in the second position in the first input, and larger in the first position and smaller in the second position in the second input, one must be unsorted at the end.
Hence, the tree does not always sort. That contradicts our assumption that we could build a tree that sorts with fewer than n! leaves.
Therefore, a tree that sorts any array of size n must have n! leaves.
Proof by direct construction.
n! >= (n/2)^(n/2)
n! = n \* n-1 \* n-2 \* ... \* n/2 \* (n/2-1) \* .... \* 1
>= n \* n-1 \* n-2 \* ... \* n/2
>= n/2 \* n/2 \* n/2 \* ... \* n/2
>= (n/2)^(n/2)
Theorem: All comparison-based sorting algorithms have inputs that require Omega(nlogn) running time.
Proof:
Our sorting tree for input of size n has n! leaves. (By Lemma above)
Assume that tree has height k.
The shortest possible tree (a complete binary tree) has 2^(k-1) leaves.
2^(k-1) >= n! >= (n/2)^(n/2)
2^(k-1) >= (n/2)^(n/2)
Compute the log of both sides
log2(2^(k-1)) >= log2((n/2)^(n/2)), assuming 2^(k-1) is at least 2.
log2(2^a) = a
log2(x^y) = ylog2x
k-1 >= n/2(log2(n/2)) >= n/2(log2n - 1)
k >= n/2(log2n - 1) + 1
k in Omega(n*log2n)
k represents the number of comparisons we had to do in our sorting algorithm.
Thereofore, any sorting algorithm based on comparisons is Omega(n*log2n)
Q.E.D.