Held: Monday, 22 November 2004
Summary:
Today we consider two important divide-and-conquer sorting algorithms, merge sort and Quicksort.
Related Pages:
Notes:
- How many of you plan to miss class on Wednesday?
Overview:
- Review
- Divide and conquer sorting
- Merge sort
- Iterative merge sort
- Asymptotic analysis
- Quicksort
- Asymptotic analysis
- We've seen two main sorting algorithms so far
- In insertion sort, the key operation is inserting a value into a sorted subvector.
- In selection sort, the key operation is selecting the largest value.
- Both algorithms are O(n^{2})
- What do we need to do to ensure that both are stable (that they maintain order)?
- Can we do better? Certainly. We've seen that heap sort is O(n log n).
- Is heap sort stable?
- Are there other alternatives to heap sort?
- Certainly: There are at least two other O(n log n) sorts.
- Both are based on the divide-and-conquer strategy.
- In merge sort, we divide in half by position.
- In Quicksort, we divide in half by value.
- We'll look at both in some depth.
- The merge sort algorithm begins sensibly for a divide-and-conquer algorithm:
    if the portion of interest has one or fewer elements
      we're done
    otherwise
      divide the portion of interest into two halves
      sort each half
      merge the two sorted halves
- For example, given the vector (5, 4, 3, 1, 6, 2)
- We would sort the (5, 4, 3) part, giving us (3, 4, 5)
- We would sort the (1, 6, 2) part, giving us (1, 2, 6)
- What do we do now? We must merge the two halves together
- 1 is smaller than 3, so it is the first value in our result
- 2 is smaller than 3, so it is the second value
- 3 is smaller than 6, so it is the third value
- 4 is smaller than 6, so it is the fourth value
- 5 is smaller than 6, so it is the fifth value
- only 6 remains, so it is the sixth value
- To merge two lists, we repeatedly compare the first value of each list and take the smaller of the two.
- A problem: Where do you put the result of merging?
- Answer: Merging cannot easily be done in place; merge sort copies into an auxiliary vector.
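The top-down procedure above might be sketched as follows (a minimal illustration; the names `merge` and `merge_sort` are my own, and this version copies freely into new lists rather than merging in place):

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Using <= takes from the left list on ties, which keeps
        # equal elements in their original order (stability).
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One list is exhausted; append the remainder of the other.
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def merge_sort(vec):
    """Sort by splitting in half by position, sorting each half, merging."""
    if len(vec) <= 1:
        return vec
    mid = len(vec) // 2
    return merge(merge_sort(vec[:mid]), merge_sort(vec[mid:]))
```

With the example above, `merge([3, 4, 5], [1, 2, 6])` produces the six values in exactly the order the walkthrough describes.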
- Instead of doing merge sort in a top-down method, recursively dividing and conquering, we can work bottom up.
- First, merge neighboring elements into sorted pairs.
- Next, merge neighboring pairs into sorted four-tuples.
- Next, merge neighboring four-tuples into sorted eight-tuples.
- And so on and so forth.
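A sketch of the bottom-up version, merging runs of width 1, then 2, then 4, and so on (the details, including out-of-place merging, are my own, not from the notes):

```python
def merge_sort_iterative(vec):
    """Bottom-up merge sort: merge neighboring runs of width 1, 2, 4, ..."""
    items = list(vec)
    width = 1
    while width < len(items):
        result = []
        # Merge each pair of neighboring runs of the current width.
        for lo in range(0, len(items), 2 * width):
            left = items[lo:lo + width]
            right = items[lo + width:lo + 2 * width]
            merged = []
            i = j = 0
            while i < len(left) and j < len(right):
                if left[i] <= right[j]:
                    merged.append(left[i]); i += 1
                else:
                    merged.append(right[j]); j += 1
            merged.extend(left[i:])
            merged.extend(right[j:])
            result.extend(merged)
        items = result
        width *= 2
    return items
```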
- It's easier to analyze the iterative version.
- Doing all the sets of merges for one size takes O(n) (each step of each merge puts one element in "the right place" in its sublist)
- After log n "level merges", we have a sorted list of size n.
- Hence, merge sort is O(n log n).
- The naive way to divide elements in merge sort leads to the need to do complex and out-of-place merging.
- Perhaps we can divide the elements more sensibly.
- In Quicksort, we partition the subvector into two parts: the "small" values and the "large" values.
- Every "small" value is smaller than every "large" value.
- The "small" values are at the left (lower indices) and the large values are at the right (higher indices)
- We then sort the small values in place and the large values in place
- The result is therefore sorted.
- For example, given the vector (5, 4, 3, 1, 6, 2)
- We rearrange the elements to (2, 1, 3, 4, 6, 5), where (2,1,3) are the small elements and (4, 6, 5) are the large elements.
- We sort the two halves.
- The result is (1, 2, 3, 4, 5, 6)
- A problem: How do we determine large and small? It turns out that random selection works pretty well.
- Pick some element of the vector (the pivot) and treat it as the middle value.
- A problem: How do you partition?
- Step through from left to right and right to left, swapping whenever you encounter elements out of place.
- A subtlety: If you can have repeated values, how do you decide when you're done?
- Normal solution: Don't include the pivot (randomly selected element) in recursive sort
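A minimal Quicksort sketch along these lines. For clarity it partitions out of place, whereas the notes describe an in-place left/right scan; excluding the pivot (and its duplicates) from the recursive calls follows the "normal solution" above and sidesteps the repeated-values subtlety:

```python
import random

def quicksort(vec):
    """Sort by partitioning around a randomly chosen pivot."""
    if len(vec) <= 1:
        return vec
    pivot = random.choice(vec)
    small = [x for x in vec if x < pivot]   # every "small" < every "large"
    equal = [x for x in vec if x == pivot]  # pivot and its duplicates
    large = [x for x in vec if x > pivot]
    # The pivot group is excluded from both recursive calls,
    # so each call shrinks and the recursion always terminates.
    return quicksort(small) + equal + quicksort(large)
```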
- It's easiest to assume that we divide in half each time.
- We come up with a recurrence relation. It takes about n steps to partition.
- To sort a list of n elements takes n steps for partitioning plus the time to sort two sublists of n/2 elements each.
- In "math", we would write t(x) = x + 2*t(x/2), where t(x) is "time to sort x elements".
- Can we solve this?
- t(n) = n + 2*t(n/2)
- t(n) = n + 2*(n/2 + 2*t(n/4)) -- substituted n/2 + 2*t(n/4) for t(n/2)
- t(n) = n + n + 2*2*t(n/4) -- simplified
- t(n) = 2*n + 4*t(n/4) -- simplified again
- t(n) = 2*n + 4*(n/4 + 2*t(n/8)) -- substituted n/4 + 2*t(n/8) for t(n/4)
- t(n) = 2*n + n + 4*2*t(n/8) -- simplified
- t(n) = 3*n + 8*t(n/8) -- simplified again
- t(n) = k*n + 2^{k}*t(n/2^{k}) -- generalized
- t(n) = n*log_{2}n + n*t(n/n) -- substituted log_{2}n for k (the depth at which n/2^{k} = 1)
- t(n) = n*log_{2}n + n*t(1) -- simplified
- Since t(1) is constant, t(n) is in O(nlog_{2}n)
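As a sanity check on the derivation, we can evaluate the recurrence directly and compare it with the closed form (assuming t(1) = 0 as the base case, which the notes leave unspecified):

```python
def t(n):
    """The recurrence t(n) = n + 2*t(n/2), with t(1) = 0 assumed."""
    if n <= 1:
        return 0
    return n + 2 * t(n // 2)

# For n = 2^k the derivation predicts t(n) = k*n, i.e. n*log2(n).
for k in range(1, 11):
    n = 2 ** k
    assert t(n) == k * n
```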