EBoard 18: More Prep for Assignment 6
Approximate overview
- Admin
- Modifying recursive merge sort
- Writing merge sort iteratively
- Other notes on the assignment
- Dictionaries
Administrative stuff
- Report from Dean: 2/3 of students are experiencing significant stress.
- More than normal for this point of the semester.
- It’s fine if you’re part of the 1/3.
- If you’re part of the 2/3, you’re not alone.
- It would be nice to have more folks there for LFA. This week’s was
also great.
- If the SEPC contacts you for an interview, please participate.
- Quicksort can be made stable (stabilized?) by using helper arrays.
Instead of partitioning in place, you put smaller and larger values
into the helper arrays.
Upcoming token-generating activities
- Convo Thursday: Profs. John Garrison and Dustin Dixon
- CS Extras 4pm Thursday: CLS
- Prof. Eliott’s Coffee Chat in ELBICA lab (across from CS Learning Center)
some Fridays 5-6 p.m.
- Arcadia, October 8–10. (7 p.m. Friday and Saturday, 2 p.m. Sunday.)
(I think.)
- Theatre. Show by Tom Stoppard. Mathy.
- Ticketing Wednesday
- Vegan Potluck, Saturday at 7pm
- Participate in SEPC interview about faculty under review.
Upcoming other activities
Upcoming work
- HW5 due Thursday night. Implement
AVL trees in Racket or Java.
- HW6 due next Thursday.
- Implement the best stable sorting algorithm you can. (Within
time reason.) Make sure to include comments to explain what
you did.
- Implement radix sort for strings of up to eight letters.
- Implement a hybrid bucket sort for strings, bucketing
on the first two characters and then sorting with whatever
sorting algorithm you choose.
- We’ll continue to look at parts in class today.
Today’s code
- Find the code in either examples/sorting or
sort.tar.
- Our merge sort was correct, other than my mis-typing the parameter.
- I fixed the random stuff.
Some questions
Why do you have non-inclusive upper-bounds for procedures that
take a range?
Long-standing experience suggests I make fewer off-by-one errors
when I do so.
The Scheme procedures I use regularly, like substring, use that
policy.
The policy works well with passing in the length of something.
I’d rather type subsort (stuff, 0, len) than
subsort (stuff, 0, len-1).
(See prior note about off-by-one errors.)
The policy works well for many divide-and-conquer array
algorithms. recurse (stuff, lb, mid); recurse (stuff, mid, ub);
(See prior note about off-by-one errors.)
What is going on with rms_kernel(strings+halfsize, len-halfsize, scratch);?
It’s the joy of C! strings is a pointer (an array of char *s).
Hence strings+halfsize is also a pointer. In this case, it gives
us the slice of the array starting at strings[halfsize].
pointer + x advances a pointer over x things and gives you the
new thing it points to.
We are going to use insertion sort if the input size is small and
merge sort if the input is large. But the time complexity of merge
sort is better.
Suppose merge sort is 1000*nlogn and insertion sort is 5*n*n.
If n is 5, merge sort will take about 10,000 steps, insertion
sort will take 125 steps. Which would you prefer? Moral one:
The multiplier matters.
Suppose we call merge sort on { 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 }.
Merge sort doesn’t take advantage of the ordering. However, insertion
sort will only need to shove the 1 left, doing only O(n) work.
Is Tim Sort broken?
The original Tim Sort used a stack for saved computations. It made
incorrect assumptions on the size of that stack. If you overflowed
that stack, it broke.
Most implementations of Tim Sort have fixed that issue by increasing
the size of the stack, either statically or dynamically.
You need not understand Tim Sort to do this assignment.
Modifying recursive merge sort
- Make it iterative
- Using the traditional iterative merge sort algorithm.
- Managing the stack ourselves.
- Rewrite the merge algorithm
- Split into parts
- Use insertion sort for small enough arrays
Writing merge sort iteratively
Basic idea:
- Merge neighboring blocks of size 1 to create sorted blocks of size 2.
- Merge neighboring blocks of size 2 to create sorted blocks of size 4.
- Merge neighboring blocks of size 4 to create sorted blocks of size 8.
- …
Complexities:
- Sometimes there’s no right block to merge.
- Sometimes the right block is smaller than the left block.
Basic framework
for (int size = 1; size < ???; size += size)
{
for (int pos = 0; pos < ???; pos += ???)
{
if (???)
{
merge (strs, ???, ???, ???);
}
} // for
} // for size
If you use iterative merge sort, I might start size at 32 or more,
and first insertion sort on each block of size 32.
Other notes on the assignment
Goals for YouSort
- Open-ended algorithm experimentation / design!
- Not quite traditional “research”, but it’s close.
- Try not to spend more than an hour or two on this.
Goals for radix sort
- Prove to me that you can write a radix sort.
- First order by rightmost character (character 7)
- Then by the next character to the left (character 6)
- …
- Then by the first character.
- I would use 27 arrays.
ithchar will give you the ith character in a string.
- Goal: < 3 hours
Goals for bucket sort
- Prove to me that you can write a bucket sort.
- (26+1)*(26+1) buckets
- Put each word in the appropriate bucket.
- Sort the buckets.
- Put them back together.
- Goal: < 3 hours