CSC151 2010S, Class 49: Merge Sort Overview: * Questions on exam 3. * More efficient sorting techniques. * Divide and conquer, revisited. * Merge sort. * Analyzing merge sort. * Lab. Admin: * For Tuesday and Wednesday, look at your classmates' work (URL distributed electronically). * It will be easier for your classmates to review your work if you have it all together in a usable form. * EC for Thursday's Convocation. * EC for Sunday's Belly Dance performance (1:30 in Flanagan). What did you learn about insertion sort? * Can be done in vectors or lists * Divide the world into "not yet processed" and "sorted" * Repeatedly insert into the correct place in sorted * On Wednesday, one group suggested finding the correct place with binary search * Why don't we use binary search with insertion sort? * What's binary search? Given a sorted vector of values, find a value using the following process: * If there are no remaining elements, stop * Look in the middle * If you're lucky, you've found the value, stop * If the middle value is too small, recurse on the right half * If the middle value is too large, recurse on the left half * Unfortunately, while binary search will let us quickly find the right place, it doesn't give us room to insert the value * How many steps does insertion sort take in the worst case * First element goes in place in zero steps 0 + 1 + 2 + 3 + ... + N-1 comparisons So, ((N-1)*N)/2 * How fast does this grow? * If we double the number of elements we're sorting ... we spend four times as many steps sorting * Whoops! That's pretty slow. Can we do better? * Big idea in binary search: Divide and conquer * Split it in half and work on a half * Sorting technique - Merge sorting * Divide the group in half * Sort each half * Put the two sorted halves back together by repeatedly grabbing the first element of each half and taking the smaller of the two * N steps to merge the two sorted lists Analysis when N is 16 16 to merge + 2 * sort a list of size 8 when N is 8 8 to merge + 2 * sort a list of size 4 when N is 4 4 to merge + 2 * sort a list of size 2 when N is 2 1 step 16 + 2*(8 + 2*(4 + 2*1)) 16 + 16 + 16 + 8 =~ 56 When there are 32 people 2*56 + 32 = 144 or so When there are 64 people 2*(144) + 64 =~ 350 Worse than doubling, but not much worse * Much better than N(N-1)/2 Math detour (close your brains if you wish): The time for this algorithm is N*log_2(N) Can we find a faster algorithm * Proof: If your basic operations are compare and swap, you can't do better than C*N*log_2(N) Observation: "Library Sort" doesn't use compare and swap, can we teach computers library sort? * Sometimes; it depends a lot on the kind of data we have Another technique: Quicksort * Divide the group into small and large halves * How? * Sort each half * Just shove them back together In the worst case, Quicksort is about N^2 steps In the average case, Quicksort is about Nlog_2(N) steps * It uses less memory than merge sort * And appears a bit faster in practice