CSC151 2010S, Class 49: Merge Sort
Overview:
* Questions on exam 3.
* More efficient sorting techniques.
* Divide and conquer, revisited.
* Merge sort.
* Analyzing merge sort.
* Lab.
Admin:
* For Tuesday and Wednesday, look at your classmates' work (URL distributed electronically).
* It will be easier for your classmates to review your work if you have it all together in a usable form.
* EC for Thursday's Convocation.
* EC for Sunday's Belly Dance performance (1:30 in Flanagan).
What did you learn about insertion sort?
* Can be done in vectors or lists
* Divide the world into "not yet processed" and
"sorted"
* Repeatedly insert into the correct place
in sorted
* On Wednesday, one group suggested finding
the correct place with binary search
* Why don't we use binary search with
insertion sort?
* What's binary search?
Given a sorted vector of values, find a
value using the following process:
* If there are no remaining elements, stop
* Look in the middle
* If you're lucky, you've found the value,
stop
* If the middle value is too small, recurse
on the right half
* If the middle value is too large,
recurse on the left half
* Unfortunately, while binary search will
let us quickly find the right place, it
doesn't give us room to insert the value
* How many steps does insertion sort take
in the worst case
* First element goes in place in zero steps
0 + 1 + 2 + 3 + ... + N-1 comparisons
So, ((N-1)*N)/2
* How fast does this grow?
* If we double the number of elements we're
sorting ... we spend four times as many
steps sorting
* Whoops! That's pretty slow. Can we do better?
* Big idea in binary search: Divide and conquer
* Split it in half and work on a half
* Sorting technique - Merge sorting
* Divide the group in half
* Sort each half
* Put the two sorted halves back together
by repeatedly grabbing the first element
of each half and taking the smaller of
the two
* N steps to merge the two sorted lists
Analysis when N is 16
16 to merge
+ 2 * sort a list of size 8
when N is 8
8 to merge
+ 2 * sort a list of size 4
when N is 4
4 to merge
+ 2 * sort a list of size 2
when N is 2
1 step
16 + 2*(8 + 2*(4 + 2*1))
16 + 16 + 16 + 8 =~ 56
When there are 32 people
2*56 + 32 = 144 or so
When there are 64 people
2*(144) + 64 =~ 350
Worse than doubling, but not much worse
* Much better than N(N-1)/2
Math detour (close your brains if you wish):
The time for this algorithm is N*log_2(N)
Can we find a faster algorithm
* Proof: If your basic operations are compare
and swap, you can't do better than C*N*log_2(N)
Observation: "Library Sort" doesn't use compare and swap, can we teach computers library sort?
* Sometimes; it depends a lot on the kind of data we have
Another technique: Quicksort
* Divide the group into small and large halves
* How?
* Sort each half
* Just shove them back together
In the worst case, Quicksort is about N^2 steps
In the average case, Quicksort is about
Nlog_2(N) steps
* It uses less memory than merge sort
* And appears a bit faster in practice