Eboard 24: Sorting
You are being recorded and transcribed.
Approximate overview
- Administrivia
- Check in on Wednesday’s class
- Some notes on sorting
- Testing sorting algorithms
- Loop invariants
Preliminaries
Upcoming work
- Friday, 2024-03-15, 11pm, Third set of LAs.
- Mostly posted.
- Let me know if any look wrong.
- Let me know if you’d like to see any others posted.
- Grading is still on my priority queue, near the front.
Tokens
Academic/Scholarly
Cultural
- Friday at 4pm in the Global Living Room.
Middle of Everywhere.
- I don’t know whether or not that’s happening.
Peer
Wellness
Misc
Other good things to do (no tokens)
Spring Break PSA
- You are awesome. Please take care of yourselves.
- And don’t get me fired for sharing information with you.
Check in on Wednesday
What did you learn?
- Merge sort.
- Merge algorithm: Merge two sorted lists (or subarrays)
- Merge sort - If the list has more than one element, divide the unsorted
list in half, sort the two halves, merge them together.
- Can I merge in place (e.g., [1 ,4, 6, 9, 2, 3, 5, 8])?
- Nope. I need to make a helper array to copy things into.
- This is a disadvantage of merge sort: It requires O(n) extra space.
- Do I have to use merge sort when I sort each subarray?
- How do you prove it correct?
- Experimentally, we run tests
- We could do a proof. With invariants.
- Is merge sort stable?
- Stable: If there are two “equal” elements, they maintain their order.
(That may have been in the reading.)
- No: It depends a bit on how you merge.
- Can we decide on an approach that maintains stability? Yes!
When we merge, always take the element from the left array first.
Some notes on sorting
- Stability
- We all sort all the time.
- MSort: Divide the cards into two piles: Sorted and Unprocessed.
For each card in unprocessed, put it into the right place in sorted.
- ASort: Similar, with a probabilistic (ML-based) approach for finding
the right location.
- Insertion sort
- Analysis of insertion sort (incorrect)
- We have to insert
n
cards from unprocessed list into the sorted
list.
- Finding the right location takes O(logn) time because we can use
binary search.
- Inserting it into that position is constant time.
- This algorithm is O(nlogn).
- Why is this incorrect? (Hint: It may be O(nlogn) in the physical world,
but not in the computer world.)
- If the data are stored in an array, it will take O(n) to insert,
not O(1) because you have to shift stuff over to make room.
- If the data are stored in a link list, it only takes O(1) to insert
once you’ve found the spot, but be can’t use binary search on lists,
since binary search requires quick access based on index.
- Analysis: O(n^2) for arrays because O(n) to insert.
- Analysis: O(n^2) for lists because O(n) to find.
- Reminder: Once you’ve correctly analyzed an algorithm, you have two questions
to ask:
- Can I do better?
- If not: Can I prove that I can’t do better?
- Detour: For the “can I prove”, Travelling Salescritter Problem
- Given a set of
n
cities, find the shortest path connecting them all.
- Easy solution: Make a list of all the paths, take the shortest.
- Unfortunately, there are O(n!) paths, which grows way too quickly.
- In 301, you’ll get to consider whether a sorting algorithm can be faster
than O(nlogn). (Preview: Not for a compare/swap algorithm.)
- In 341, you’ll get to explore questions like TSP.
- For insertion sort’s “can I do better”
- Design a different algorithm
- Design a new data structure, one that has both quick insertion and
quick searching.
- A better data structure: The binary search tree
- A binary tree is a structure in which you have a node (with a value),
left subtree, and right subtree.
- Usually the left subtree has values less than the node and the
right subtree has values greater than the node.
- Both left and right subtrees have the same property.
- If the tree is mostly balanced (that is, we have left and right subchildren
at most levels; the two subtrees are about the same size ), the number
of levels is O(logn).
- Search is O(logn)
- Insert is O(logn)
- Maintaining the property is hard.
- We will return to this issue in a few weeks.
Testing sorting algorithms
You’ve written a sorting algorithm. Proofs are hard. You want tests.
What kind of “write lots of tests” algorithm would you write?
Systematically
- Use for loops to build arrays of different sizes
- For each size array
Random
- Use a for loop to build arrays of different sizes
- Sort each
- And make sure it’s in order AND a permutation of the original
Sam’s Sorting algorithm (which gives you an array in order)
```
public static T samSort(T[] values, Comparator order) {
for (int i = 1; i < values.length; i++) {
values[i] = values[0];
}
} // samSort
``
More ideas
- Start with a sorted array.
- Permute it (or permute a copy) [shuffle; rearrange]
- Sort the permutation
- Compare the two arrays.
Even more
- Start with a sorted array.
- Sort it.
- See if it’s the same. (Some non-stable sorting algorithms will rearrange
equal elements, so for this, we want to ensure that equal elements are
treated as equal.)
Whoops!
Loop invariants