Fund. CS II (CS152 2005S)

Exam 3: Hashing, Vectors, and Sorting

Distributed: Friday, May 6, 2005
Due: 11:00 a.m., Friday, May 13, 2005
No extensions.

This page may be found online at http://www.cs.grinnell.edu/~rebelsky/Courses/CS152/2005S/Exams/exam.03.html.

Preliminaries

There are four problems on the exam. Some problems may have subproblems. Each problem is worth 25 points. The point value associated with a problem does not necessarily correspond to the complexity of the problem or the time required to solve the problem.

This examination is open book, open notes, open mind, open computer, open Web. However, it is closed person. That means you should not talk to other people about the exam. Other than that limitation, you should feel free to use all reasonable resources available to you. As always, you are expected to turn in your own work. If you find ideas in a book or on the Web, be sure to cite them appropriately.

Although you may use the Web for this exam, you may not post your answers to this examination on the Web (at least not until after I return exams to you). And, in case it's not clear, you may not ask others (in person, via email, via IM, by posting a "please help" message, or in any other way) to put answers on the Web.

This is a take-home examination. You may use any time or times you deem appropriate to complete the exam, provided you return it to me by the due date.

This exam is likely to take you about six hours, depending on how well you've learned topics and how fast you work. You should not work more than eight hours on this exam. Stop at eight hours and write "There's more to life than CS" and you will earn at least 80 points on this exam. I would appreciate it if you would write down the amount of time each problem takes. Each person who does so will earn two points of extra credit. I expect that someone who has mastered the material and works at a moderate rate should have little trouble completing the exam in a reasonable amount of time. Since I worry about the amount of time my exams take, I will give two points of extra credit to the first two people who honestly report that they've spent at least five hours on the exam or completed the exam and do so at least two days before the exam is due. (At that point, I may then change the exam.)

You must include both of the following statements on the cover sheet of the examination. Please sign and date each statement. Note that the statements must be true; if you are unable to sign either statement, please talk to me at your earliest convenience. You need not reveal the particulars of the dishonesty, simply that it happened. Note also that inappropriate assistance is assistance from (or to) anyone other than Professor Rebelsky (that's me).

1. I have neither received nor given inappropriate assistance on this examination.
2. I am not aware of any other students who have given or received inappropriate assistance on this examination.

Because different students may be taking the exam at different times, you are not permitted to discuss the exam with anyone until after I have returned it. If you must say something about the exam, you are allowed to say "This is among the hardest exams I have ever taken. If you don't start it early, you will have no chance of finishing the exam." You may also summarize these policies. You may not tell other students which problems you've finished. You may not tell other students how long you've spent on the exam.

You must both answer all of your questions electronically and turn in a printed version of your exam. That is, you must write all of your answers on the computer, print them out, number the pages, put your name on every page, and hand me the printed copy. You must also email me a copy of your exam by copying your exam and pasting it into an email message. Put your answers in the same order as the problems. Please write your name at the top of each sheet of the printed copy. If you fail to do so, you will be penalized.

In many problems, I ask you to write code. Unless I specify otherwise in a problem, you should write working code and include examples that show that you've tested the code.

Just as you should be careful and precise when you write code and documentation, so should you be careful and precise when you write prose. Please check your spelling and grammar. Since I should be equally careful, the whole class will receive one point of extra credit for each error in spelling or grammar you identify on this exam. I will limit that form of extra credit.

I will give partial credit for partially correct answers. You ensure the best possible grade for yourself by emphasizing your answer and including a clear set of work that you used to derive the answer.

I may not be available at the time you take the exam. If you feel that a question is badly worded or impossible to answer, note the problem you have observed and attempt to reword the question in such a way that it is answerable. If it's a reasonable hour (before 10 p.m. and after 8 a.m.), feel free to try to call me in the office (269-4410) or at home (236-7445).

I will also reserve time at the start of classes next week to discuss any general questions you have on the exam.

Problems

Problem 1: Deletion in Linear-Probe Hash Tables

Topics: Hash Tables, Arrays, Data Structure Design

Expected time: One hour

The implementation of hash tables that we've used in class has been what is typically called a bucket-style hash table. That is, each element of the array is essentially a bucket of values. The primary advantage of these kinds of hash tables is their ease of implementation. One disadvantage is that they waste a lot of space (for the nodes in the buckets and for the unused buckets).

An alternative implementation is the so-called linear probe hash table. In this implementation, when you attempt to hash a value to an already-filled cell, you keep trying subsequent cells until you find an empty one.

For example, suppose we're using integers as keys and our hash table has size ten.

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |  |  |  |  |  |
+--+--+--+--+--+--+--+--+--+--+

If we hash 5, it goes into cell 5.

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |05|  |  |  |  |
+--+--+--+--+--+--+--+--+--+--+

If we hash 17, it goes into cell 7 (17 is 7 mod 10).

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |05|  |17|  |  |
+--+--+--+--+--+--+--+--+--+--+

If we now hash 25, we first try to put it into cell 5 (25 is 5 mod 10). That cell is full, so we try cell 6. That cell is empty, so we put it there.

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |05|25|17|  |  |
+--+--+--+--+--+--+--+--+--+--+

If we now hash 35, we first try to put it into cell 5 (35 is 5 mod 10). That cell is full, so we try cell 6. That cell is also full, so we try cell 7. That cell is also full, so we try cell 8. That cell is, fortunately, empty, so we put it there.

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |05|25|17|35|  |
+--+--+--+--+--+--+--+--+--+--+

If we hash 49, we first try to put it into cell 9 (49 is 9 mod 10). That cell is empty, so we put it there.

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |05|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+

If we hash 53, we first try to put it into cell 3. That cell is empty, so we put it there.

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |53|  |05|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+

Suppose we next try to hash 68. We first try cell 8. That's full with 35. We next try cell 9. That's full with 49. We must then wrap around to the beginning (cell 0). That cell is empty, so we put it there.

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|68|  |  |53|  |05|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+

Suppose we next try to hash 70. We first try cell 0. That's full with 68. We next try 1. That's empty, so we put it there.

 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|68|70|  |53|  |05|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+
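
The insertion walkthrough above can be sketched in Java roughly as follows. This is only an illustration under assumptions that are not part of the exam: keys are non-negative ints, the hash function is simply the key mod the table size, empty cells are marked with -1, and the names LinearProbeInsert, EMPTY, newTable, and insert are invented for this example.

```java
import java.util.Arrays;

/** Illustrative sketch only: linear-probe insertion with int keys. */
class LinearProbeInsert {
    /** Assumed marker for an empty cell (so keys must be non-negative). */
    static final int EMPTY = -1;

    /** Creates a table with the given number of cells, all empty. */
    static int[] newTable(int size) {
        int[] table = new int[size];
        Arrays.fill(table, EMPTY);
        return table;
    }

    /**
     * Puts key in the first empty cell at or after (key mod size),
     * wrapping around to cell 0 after the last cell.
     * Returns the cell used, or -1 if every cell is full.
     */
    static int insert(int[] table, int key) {
        int home = key % table.length;
        for (int step = 0; step < table.length; step++) {
            int cell = (home + step) % table.length;
            if (table[cell] == key) {
                return cell;              // key already present
            }
            if (table[cell] == EMPTY) {
                table[cell] = key;
                return cell;
            }
        }
        return -1;                        // no empty cell anywhere
    }

    public static void main(String[] args) {
        int[] table = newTable(10);
        for (int key : new int[] {5, 17, 25, 35, 49, 53, 68, 70}) {
            System.out.println(key + " goes into cell " + insert(table, key));
        }
        System.out.println(Arrays.toString(table));
    }
}
```

Running main with the keys from the walkthrough should place them in cells 5, 7, 6, 8, 9, 3, 0, and 1, matching the final picture above.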

How do we find something? We start looking at the cell corresponding to the modified hash value and then step through neighboring cells until (a) we find the key, in which case we're done, or (b) we hit an empty space, in which case we report that the key/value pair is not in the hash table.

For example, to find 35, we first look in cell 5. 35 is not there, so we try cell 6. 35 is not there, so we try cell 7. 35 is not there, so we try cell 8. 35 is there, so we're done.

Similarly, to find 100, we first look in cell 0. 100 is not there, so we try cell 1. 100 is not there, so we try cell 2. Cell 2 is empty, so we note that the key of 100 is not in the hash table.

To find 99, we first look in cell 9. 99 is not there, so we wrap around and try cell 0. 99 is not there, so we try cell 1. 99 is not there, so we try cell 2. Cell 2 is empty, so we fail.
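
The three lookups above can be sketched in Java as follows. As before, this is a hedged illustration rather than the class implementation: it assumes non-negative int keys, a hash of key mod table size, and -1 as the empty marker, and the names LinearProbeFind, EMPTY, and find are invented for the example.

```java
/** Illustrative sketch only: lookup in a linear-probe table of int keys. */
class LinearProbeFind {
    /** Assumed marker for an empty cell. */
    static final int EMPTY = -1;

    /**
     * Returns the cell holding key, or -1 if we reach an empty cell
     * (or have examined every cell) without finding it.
     */
    static int find(int[] table, int key) {
        int home = key % table.length;
        for (int step = 0; step < table.length; step++) {
            int cell = (home + step) % table.length;
            if (table[cell] == key) {
                return cell;      // case (a): found the key
            }
            if (table[cell] == EMPTY) {
                return -1;        // case (b): hit an empty space
            }
        }
        return -1;                // table is full and key is absent
    }

    public static void main(String[] args) {
        // The table from the example: 68 70 __ 53 __ 05 25 17 35 49
        int[] table = {68, 70, EMPTY, 53, EMPTY, 5, 25, 17, 35, 49};
        System.out.println(find(table, 35));   // probes cells 5, 6, 7, 8
        System.out.println(find(table, 100));  // probes cells 0, 1, 2
        System.out.println(find(table, 99));   // probes cells 9, 0, 1, 2
    }
}
```

The loop bound on step matters only when the table has no empty cells; without it, a search for a missing key in a full table would never stop.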

Now that you understand the basics of linear-probe hash tables, it's time for you to consider a key issue: removing elements.

Describe in pseudocode how one removes a key/value pair from a linear-probe hash table.

Note: Removal is somewhat complex. Consider what happens if we remove 25, 49, or 35 from the table above.

Warning: It is not acceptable to say "re-hash everything in the hash table" or a variant thereof.

Warning: You may not change the behavior of insert or find as you implement removal. In particular, you may not use the Tsuvian solution of putting a placeholder there that find says is not a match and insert says is empty.

Problem 2: Dynamic Arrays

Topics: Arrays, Vectors, Data Structure Implementation

Expected time: Two hours

In many of the problems we've encountered, it would help to have arrays that automatically expand when we need them to. While Vectors solve that problem, we don't know how Vectors are implemented (which led some of you to assume that add is an O(1) operation when it is likely to be an O(n) operation). Vectors also have some annoying features. Hence, it makes sense to design our own variant.

a. Create a DynamicArray interface that supports the following operations. Your methods should accept all non-negative positions. (That is, unlike Vectors, your DynamicArrays should automatically expand with set.) Make sure to document each method fully.

b. Implement DynamicArray using Java arrays as the underlying structure. For this implementation, make sure that the pos for set and get is the same as the index in the array. You will need to expand the array for larger positions in set. (For get, you can simply return some default values, such as null.)

Problem 3: Testing Sorting

Topics: Sorting, Testing

Expected time: Two hours, if Java is nice to you

Useful Files:

In the class on Tuesday, 3 May 2005, we sketched a strategy we could use to test a sorting algorithm. I've started that implementation in OOPVSTester.java and some other related files.

a. Finish implementing the test method of OOPVSTester.

b. Test each of the four sorting routines.

Problem 4: Stable Sorting

Topics: Sorting, Code Reading

Expected time: One hour

Useful Files:

a. Each of the files linked above provides an implementation of an out-of-place sorting routine. Determine which of these implementations are stable and which are not. You may do this analysis by whatever means you choose (try lots of examples, study and understand the code, whatever). Explain how you arrived at your results.

b. For those that are not stable, explain how to make the sort stable. If you believe that a sort cannot be made stable, explain why.

Some Questions and Answers

These are some of the questions students have asked about the exam and my answers to those questions.

Problem 1: Deletion in Linear-Probe Hash Tables

Will you accept working Java code for the algorithm?
Yes. However, I do expect comments in the code.
What should we do if a key to be removed appears more than once in the hash table?
A key should never appear more than once. That's part of the idea of hashing.
In the pictures like the following, what are the numbers?
 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |81|  |  |14|  |  |  |  |  |
+--+--+--+--+--+--+--+--+--+--+
The numbers above the "table" are the indices. The numbers in the table are the keys.
What should happen if we delete 5 in the following?
 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|68|70|  |53|  |05|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+
Here's one solution. (Sorry, the explanation was live.)
 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|70|  |  |53|  |25|35|17|68|49|
+--+--+--+--+--+--+--+--+--+--+
Suppose we instead deleted the 05 in
 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|68|70|  |53|  |05|25|17|35|46|
+--+--+--+--+--+--+--+--+--+--+
One solution is
 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|70|  |  |53|  |25|35|17|46|68|
+--+--+--+--+--+--+--+--+--+--+
However, it is equally acceptable to do
 0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|70|  |  |53|  |25|46|17|35|68|
+--+--+--+--+--+--+--+--+--+--+
Can we leave ghosts in the machine?
No. Ghosts scare me. They are also too Tsuvian for me.
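
For readers who want to check the deletion pictures above by machine, here is a sketch of one possible approach, often called backward-shift deletion. It is not necessarily the answer the exam has in mind. It assumes non-negative int keys, a hash of key mod table size, and -1 as the empty marker; the names LinearProbeRemove, EMPTY, and remove are invented for the example. The idea is to empty the removed key's cell and then walk the rest of the run, pulling back any key whose search path would otherwise be cut off by the hole.

```java
import java.util.Arrays;

/** Illustrative sketch only: backward-shift removal from a linear-probe table. */
class LinearProbeRemove {
    /** Assumed marker for an empty cell. */
    static final int EMPTY = -1;

    /** Removes key if present, then closes the gap so that lookups still work. */
    static void remove(int[] table, int key) {
        int n = table.length;
        int hole = -1;
        // Locate the key with an ordinary linear-probe search.
        for (int step = 0; step < n && hole < 0; step++) {
            int cell = (key % n + step) % n;
            if (table[cell] == EMPTY) {
                return;                   // key is not in the table
            }
            if (table[cell] == key) {
                hole = cell;
            }
        }
        if (hole < 0) {
            return;                       // full table, key absent
        }
        table[hole] = EMPTY;
        // Walk the rest of the run, stopping at the first empty cell.
        for (int j = (hole + 1) % n; table[j] != EMPTY; j = (j + 1) % n) {
            int home = table[j] % n;
            int fromHome = (j - home + n) % n;  // cyclic distance home -> j
            int fromHole = (j - hole + n) % n;  // cyclic distance hole -> j
            // If the hole lies on this key's path from its home cell,
            // move the key into the hole; its old cell becomes the new hole.
            if (fromHome >= fromHole) {
                table[hole] = table[j];
                table[j] = EMPTY;
                hole = j;
            }
        }
    }

    public static void main(String[] args) {
        int[] first = {68, 70, EMPTY, 53, EMPTY, 5, 25, 17, 35, 49};
        remove(first, 5);
        System.out.println(Arrays.toString(first));
        int[] second = {68, 70, EMPTY, 53, EMPTY, 5, 25, 17, 35, 46};
        remove(second, 5);
        System.out.println(Arrays.toString(second));
    }
}
```

On the two tables above, this sketch should produce the first layout shown for each. The alternative layout in the second example is also valid, since every key in it can still be reached from its home cell without crossing an empty cell.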

Problem 2: Dynamic Arrays

What do you mean by set? Is it the same as add in Vectors?
I mean that you should replace the value at a particular index. It is therefore not the same as add, which must shift values.
Then what do you mean by expand the array?
Suppose we currently have an underlying primitive array of 10 elements and someone calls set(16,"hello"). The primitive array must now grow to a size of at least 17.
I'm having problems with parameterized types in this problem. In particular, I don't seem to be able to create an array using the type variable.
Java seems to discourage folks from creating arrays using type variables. The best workaround I've found is to create an array of objects, cast it to an array of the specified type, and ignore the warnings. For example,
T[] stuff = (T[]) new Object[10];
I will provide extra credit to someone who comes up with a better solution.
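
To make the two answers above concrete, here is a minimal sketch of the cast workaround combined with growth on set. It is not a full solution to the problem and does not define the DynamicArray interface; the class name GrowableSketch, the capacity method, and the doubling policy are assumptions made for the illustration.

```java
import java.util.Arrays;

/** Illustrative sketch only: an array-backed container that grows on set. */
class GrowableSketch<T> {
    private T[] contents;

    /** Creates the underlying array using the Object[] cast workaround. */
    @SuppressWarnings("unchecked")  // fine as long as contents never leaves this class
    GrowableSketch(int initialCapacity) {
        this.contents = (T[]) new Object[initialCapacity];
    }

    /** Replaces the value at pos, expanding the underlying array if needed. */
    void set(int pos, T value) {
        if (pos >= contents.length) {
            // Grow to at least pos + 1 cells; doubling is one common policy.
            int newLength = Math.max(pos + 1, contents.length * 2);
            contents = Arrays.copyOf(contents, newLength);
        }
        contents[pos] = value;
    }

    /** Returns the value at pos, or null for positions beyond the array. */
    T get(int pos) {
        return (pos < contents.length) ? contents[pos] : null;
    }

    /** Length of the underlying array (exposed only for the demonstration). */
    int capacity() {
        return contents.length;
    }

    public static void main(String[] args) {
        GrowableSketch<String> stuff = new GrowableSketch<String>(10);
        stuff.set(16, "hello");
        System.out.println(stuff.get(16));
        System.out.println(stuff.capacity() >= 17);
    }
}
```

Placing @SuppressWarnings on the constructor confines the unchecked warning to the one place where the cast happens, instead of ignoring warnings for the whole file.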

Problem 3: Testing Sorting

Does our sorting tester have to work for any class that implements OutOfPlaceVectorSorter?
Yes.
May I test success by checking that the result is a permutation and that the result is sorted?
Yes. However, I'd very much prefer the more-efficient mechanism we discussed in class (fill the Vector with a group of known values and make sure the result contains those values in order).
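
The known-values strategy can be sketched as follows. Because the signature of OutOfPlaceVectorSorter is not shown on this page, the sketch uses java.util.function.UnaryOperator as a stand-in for a sorter; that substitution, the class name KnownValuesSortCheck, and the method sortsCorrectly are all assumptions, not part of OOPVSTester.java.

```java
import java.util.Collections;
import java.util.Random;
import java.util.Vector;
import java.util.function.UnaryOperator;

/** Illustrative sketch only: testing an out-of-place sorter with known values. */
class KnownValuesSortCheck {
    /**
     * Fills a Vector with 0..size-1, shuffles it, runs the sorter, and
     * checks that position i of the result holds i. The UnaryOperator
     * is a stand-in for the course's sorter interface.
     */
    static boolean sortsCorrectly(UnaryOperator<Vector<Integer>> sorter,
                                  int size, long seed) {
        Vector<Integer> input = new Vector<Integer>();
        for (int i = 0; i < size; i++) {
            input.add(i);
        }
        Collections.shuffle(input, new Random(seed));
        Vector<Integer> result = sorter.apply(input);
        if (result.size() != size) {
            return false;             // values were lost or added
        }
        for (int i = 0; i < size; i++) {
            if (result.get(i) != i) { // compares the unboxed int values
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A sorter that should pass: copy the input, then sort the copy.
        UnaryOperator<Vector<Integer>> good = v -> {
            Vector<Integer> copy = new Vector<Integer>(v);
            Collections.sort(copy);
            return copy;
        };
        // A sorter that should fail: it sorts in the wrong direction.
        UnaryOperator<Vector<Integer>> bad = v -> {
            Vector<Integer> copy = new Vector<Integer>(v);
            Collections.sort(copy, Collections.reverseOrder());
            return copy;
        };
        System.out.println(sortsCorrectly(good, 50, 42L));
        System.out.println(sortsCorrectly(bad, 50, 42L));
    }
}
```

Since the expected result is known in advance, the check is a single pass over the output; there is no need to compare the input and output as multisets to confirm that one is a permutation of the other.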

Problem 4: Stable Sorting

Can I fix an implementation in such a way that I change its asymptotic running time (e.g., from O(log_2 n) to O(n^2))?
Certainly not.
Would you accept "I found this Web page that says the sorting routine is inherently unstable" as proof?
Sure. However, you can't trust everything you read on the Internet. If you're wrong, having cited someone else who is wrong buys you nothing.
Can you define "stable" again?
A sort is stable if it preserves the relative order of equal elements. For example, if we're sorting only by last name and we've previously sorted in decreasing order by first name, then after sorting, a stable sort will have John Smith before Jane Smith.
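
The Smith example can be reproduced with a short sketch. It relies on the documented guarantee that Java's object sorts (List.sort and Collections.sort) are stable; the class name StableSortDemo, the method sortTwice, and the sample names other than the two Smiths are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only: what stability means for the Smith example. */
class StableSortDemo {
    /**
     * Sorts {first, last} name pairs in decreasing order by first name,
     * then sorts again by last name only.
     */
    static List<String[]> sortTwice(List<String[]> names) {
        List<String[]> result = new ArrayList<String[]>(names);
        // Pass 1: decreasing order by first name.
        result.sort(Comparator.comparing((String[] n) -> n[0]).reversed());
        // Pass 2: by last name only. Because List.sort is stable, people
        // with equal last names keep their relative order from pass 1.
        result.sort(Comparator.comparing((String[] n) -> n[1]));
        return result;
    }

    public static void main(String[] args) {
        List<String[]> names = Arrays.asList(
            new String[] {"Jane", "Smith"},
            new String[] {"Ann", "Jones"},
            new String[] {"John", "Smith"});
        for (String[] n : sortTwice(names)) {
            System.out.println(n[0] + " " + n[1]);
        }
    }
}
```

With a stable second pass, the output lists Ann Jones, then John Smith, then Jane Smith. An unstable sort would be free to put Jane Smith before John Smith, since the second pass treats the two as equal.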

Errors

Here you will find errors of spelling, grammar, and design that students have noted. Remember, each error found corresponds to one point of extra credit for everyone. I limit such extra credit to five points. After the first five points, each five errors correspond to one additional point of extra credit.

Extra Credit

Here I record particularly special forms of extra credit.

History

Late April and Early May 2005 [Samuel A. Rebelsky]

Thursday, 5 May 2005 [Samuel A. Rebelsky]

Friday, 6 May 2005 [Samuel A. Rebelsky]

Monday, 9 May 2005 [Samuel A. Rebelsky]

Tuesday, 10 May 2005 [Samuel A. Rebelsky]

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Wed May 11 10:55:15 2005.
The source to the document was last modified on Tue May 10 09:51:29 2005.


Samuel A. Rebelsky, rebelsky@grinnell.edu