Fund. CS II (CS152 2005F)

Exam 3: Analysis and ADTs

Distributed: Wednesday, 23 November 2005
Due: 11:00 a.m., Friday, 2 December 2005
No extensions.

This page may be found online at http://www.cs.grinnell.edu/~rebelsky/Courses/CS152/2005F/Exams/exam.03.html.

Code

You may find any or all of the following files useful.

Preliminaries

There are four problems on the exam. Some problems have subproblems. Each problem is worth twenty-five (25) points. The point value associated with a problem does not necessarily correspond to the complexity of the problem or the time required to solve the problem.

This examination is open book, open notes, open mind, open computer, open Web. However, it is closed person. That means you should not talk to other people about the exam. Other than as restricted by that limitation, you should feel free to use all reasonable resources available to you. As always, you are expected to turn in your own work. If you find ideas in a book or on the Web, be sure to cite them appropriately.

Although you may use the Web for this exam, you may not post your answers to this examination on the Web (at least not until after I return exams to you). And, in case it's not clear, you may not ask others (in person, via email, via IM, by posting a "please help" message, or in any other way) to put answers on the Web.

This is a take-home examination. You may use any time or times you deem appropriate to complete the exam, provided you return it to me by the due date.

I expect that someone who has mastered the material and works at a moderate rate should have little trouble completing the exam in a reasonable amount of time. In particular, this exam is likely to take you about four to six hours, depending on how well you've learned topics and how fast you work. You should not work more than eight hours on this exam. Stop at eight hours and write "There's more to life than CS" and you will earn at least 75 points on this exam.

I would also appreciate it if you would write down the amount of time each problem takes. Each person who does so will earn two points of extra credit. Since I worry about the amount of time my exams take, I will give two points of extra credit to the first two people who honestly report that they've spent at least five hours on the exam or completed the exam. (At that point, I may then change the exam.)

You must include both of the following statements on the cover sheet of the examination. Please sign and date each statement. Note that the statements must be true; if you are unable to sign either statement, please talk to me at your earliest convenience. You need not reveal the particulars of the dishonesty, simply that it happened. Note also that inappropriate assistance is assistance from (or to) anyone other than Professor Rebelsky (that's me).

1. I have neither received nor given inappropriate assistance on this examination.
2. I am not aware of any other students who have given or received inappropriate assistance on this examination.

Because different students may be taking the exam at different times, you are not permitted to discuss the exam with anyone until after I have returned it. If you must say something about the exam, you are allowed to say "This is among the hardest exams I have ever taken. If you don't start it early, you will have no chance of finishing the exam." You may also summarize these policies. You may not tell other students which problems you've finished. You may not tell other students how long you've spent on the exam.

You must both answer all of your questions electronically and turn in a printed version of your exam. That is, you must write all of your answers on the computer, print them out, number the pages, put your name on every page, and hand me the printed copy. You must also email me a copy of your exam by copying the various parts of your exam and pasting them into an email message. Put your answers in the same order as the problems. Please write your name at the top of each sheet of the printed copy. Failing to do so will lead to a penalty of two points.

In many problems, I ask you to write code. Unless I specify otherwise in a problem, you should write working code and include examples that show that you've tested the code.

Just as you should be careful and precise when you write code and documentation, so should you be careful and precise when you write prose. Please check your spelling and grammar. Since I should be equally careful, the whole class will receive one point of extra credit for each error in spelling or grammar you identify on this exam. I will limit that form of extra credit to five points.

I will give partial credit for partially correct answers. You ensure the best possible grade for yourself by emphasizing your answer and including a clear set of work that you used to derive the answer.

I may not be available at the time you take the exam. If you feel that a question is badly worded or impossible to answer, note the problem you have observed and attempt to reword the question in such a way that it is answerable. If it's a reasonable hour (before 10 p.m. and after 8 a.m.), feel free to try to call me in the office (269-4410) or at home (236-7445).

I will also reserve time at the start of classes next week to discuss any general questions you have on the exam.

Problems

Problem 1: Recurrence Relations

Topics: Asymptotic analysis, recurrence relations

Expected time: One hour

For each of the following pairs of equations, determine a tight Big-O bound on t(n). Note that I have used inequalities rather than equalities. However, the inequalities should have no significant impact on your bounding of the functions.

1a.

1b.

1c.

1d.

Problem 2: Sets

Topics: ADT Design, Implementing Data Structures

Expected time: Two hours

Most of the collections we've studied so far are used primarily for storing elements that we can then retrieve (by index in a list or vector, by the get operation in a linear structure, by repeated deletion in a linear structure, by a traversal algorithm in a tree). However, other kinds of collections naturally appear in computation. For example, mathematical sets are a common ADT. A set is a collection of elements whose primary operations are

a. Express this collection of operations as a type-parameterized interface, SetOf<T>. [5 points; 30 minutes]

b. Implement the interface using binary search trees. [10 points; one hour]

c. Experimentally determine the relationship of the depth of the tree to the size of the tree. That is, for a variety of numbers, put that many randomly-generated Integers into the tree and determine the depth of the tree and, using the size/depth pairs you've generated, see if there's a pattern. [10 points; 30 minutes]

Note: Although set designers often include a few advanced set operations, such as union, intersection, and difference, you will not be expected to include or implement those operations. Although it also makes sense to include a remove operation, that operation is comparatively complex, so I have skipped it. Focus on the three basic operations.

Problem 3: Heaps

Topics: Linear Structures, Heaps, Sorting, Asymptotic Analysis

Expected time: Two hours

As you may recall, heaps provide a relatively efficient implementation of priority queues. A heap is a form of binary tree. Heaps must be nearly complete. (That is, all rows but the last row are complete and the last row has gaps only at the right.) Heaps must also have the heap property. That is, the value stored in a node must be smaller than or equal to the values stored in its descendants. Near-completeness guarantees that the depth is in O(log2(n)). The heap property guarantees that we can easily find the smallest value (after all, it's at the root of the tree).

To get the smallest value from the heap, we remove the value at the root (constant time), move the last value in the last row to the root (constant time), and then swap down, moving that value to its appropriate place in the heap (time proportional to depth).

To add a value to the heap, we put it at the end of the last row (constant time) and then swap up, moving that value to its appropriate place in the heap (time proportional to depth).

To make it easy to find the end of the last row, we store the heap in a Vector. Some simple math suggests that the index of the left child of the node at index i is 2*i+1 and the index of the right child is 2*i+2. Similarly, the index of the parent of the node at position i is the floor of (i-1)/2.
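
As a quick sanity check of that index arithmetic, here is a minimal sketch. The class name HeapIndexDemo is hypothetical and is not part of the exam code. It computes the children and parent of each index so that you can confirm that every child maps back to its parent. One assumption worth noting: integer division in Java truncates toward zero, so (i-1)/2 equals the floor only when i is at least 1. The sketch uses Math.floorDiv so that the root (index 0) reports a parent of -1.

```java
// A sketch only: HeapIndexDemo is a hypothetical name, not part of the exam code.
class HeapIndexDemo {
  /** Index of the left child of the node at index i. */
  static int left(int i) {
    return 2 * i + 1;
  }

  /** Index of the right child of the node at index i. */
  static int right(int i) {
    return 2 * i + 2;
  }

  /** Index of the parent: the floor of (i-1)/2, which is -1 for the root. */
  static int parent(int i) {
    return Math.floorDiv(i - 1, 2);
  }

  public static void main(String[] args) {
    // Print the neighbors of the first seven positions (three full rows).
    for (int i = 0; i < 7; i++) {
      System.out.println(i + ": left=" + left(i) + " right=" + right(i)
          + " parent=" + parent(i));
    }
  }
}
```

For every index i, parent(left(i)) and parent(right(i)) should both come back as i.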

I've begun to implement the Heap data structure.

package rebelsky.exam3;

import java.util.Comparator;
import java.util.Vector;

/**
 * A simple (and currently incomplete) implementation of heaps.
 *
 * @author Samuel A. Rebelsky
 * @author YourNameHere
 */
public class Heap<T>
  implements PriorityQueue<T>
{
  // +--------+--------------------------------------------------
  // | Fields |
  // +--------+

  /**
   * The underlying vector that stores all the values.
   */
  Vector<T> contents;

  /**
   * The comparator used to determine ordering.
   */
  Comparator<T> order;


  // +--------------+--------------------------------------------
  // | Constructors |
  // +--------------+

  /**
   * Build a new heap that uses a particular comparator to
   * determine ordering.
   */
  public Heap(Comparator<T> _order)
  {
    this.contents = new Vector<T>();
    this.order = _order;
  } // Heap(Comparator<T>)


  // +----------------+------------------------------------------
  // | Public Methods |
  // +----------------+

  public void put(T element)
    throws Exception
  {
    // Put the new element at the end of the last row of the heap.
    this.contents.add(element);
    // Swap it up through the heap until it's in the correct place.
    this.swapUp(this.size()-1);
  } // put(T)

  /**
   * Remove the smallest value from the heap.
   */
  public T get()
    throws Exception
  {
    // Remember what is at the root of the heap.
    T smallest = this.contents.get(0);
    // Put the last value at the root
    T tmp = this.contents.remove(this.size()-1);
    if (this.size() > 0) {
      this.contents.set(0,tmp);
      // Swap it down until it's in the correct place.
      this.swapDown(0);
    }
    // Return what was at the root
    return smallest;
  } // get()

  public boolean isEmpty()
  {
    return (this.size() == 0);
  } // isEmpty()

  public boolean isFull()
  {
    return false;
  } // isFull()

  public int size()
  {
    return this.contents.size();
  } // size()

  // +----------------+------------------------------------------
  // | Helper Methods |
  // +----------------+

  /**
   * Get the index of the left child.
   */
  int left(int pos)
  {
    return 2*pos + 1;
  } // left(int)

  /**
   * Get the index of the right child.
   */
  int right(int pos)
  {
    return 2*pos + 2;
  } // right(int)

  /**
   * Get the index of the parent.
   */
  int parent(int pos)
  {
    if ((pos % 2) == 0) {
      return (pos-2)/2;
    }
    else {
      return (pos-1)/2;
    }
  } // parent(int)

  /**
   * Swap the value at position pos down through the heap until it
   * reaches the correct position.
   */
  void swapDown(int pos)
  {
    // STUB
  } // swapDown(int)

  /**
   * Swap the value at position pos up through the heap until it
   * reaches the correct position.
   */
  void swapUp(int pos)
  {
    // STUB
  } // swapUp(int)

} // class Heap<T>

a. As you can see, I have neglected to implement the swapUp and swapDown operations. Implement those operations. [10 points; one hour]

b. In class, I claimed that we could use heaps to implement a relatively fast sorting strategy. To sort a group of values, shove them all into the heap and then take them out in order. Implement that strategy. That is, write a method that takes a Vector as input, puts all the values in the Vector into a heap, and then puts the values from the heap back into the Vector. [10 points; 30 minutes]

c. I also claimed that this strategy for sorting should take O(n log2(n)) time. Gather experimental data to verify or refute that claim. [5 points; 30 minutes]

Problem 4: Deletion in Linear-Probe Hash Tables

Topics: Hash Tables, Vectors, Data Structure Design

Expected time: One hour

As we discussed in our exploration of hash tables, there are two strategies we can use to handle conflicts in hash tables: We can put multiple values in each cell (the bucket strategy) or we can provide some spare space in the vector and shift values into neighboring cells (the linear-probe strategy).

Let us consider how the linear-probe strategy works for the put operation. Suppose we're using integers as keys and our hash table has size ten. Suppose also that our hash function is "use the last digit".

We begin with an empty vector.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |  |  |  |  |  |
+--+--+--+--+--+--+--+--+--+--+

If we hash 55, it goes into cell 5.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |55|  |  |  |  |
+--+--+--+--+--+--+--+--+--+--+

If we hash 17, it goes into cell 7.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |55|  |17|  |  |
+--+--+--+--+--+--+--+--+--+--+

If we now hash 25, we first try to put it into cell 5. That cell is full, so we try cell 6. That cell is empty, so we put it there.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |55|25|17|  |  |
+--+--+--+--+--+--+--+--+--+--+

If we now hash 35, we first try to put it into cell 5 (35 is 5 mod 10). That cell is full, so we try cell 6. That cell is also full, so we try cell 7. That cell is also full, so we try cell 8. That cell is, fortunately, empty, so we put it there.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |55|25|17|35|  |
+--+--+--+--+--+--+--+--+--+--+

If we hash 49, we first try to put it into cell 9 (49 is 9 mod 10). That cell is empty, so we put it there.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |  |  |55|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+

If we hash 53, we first try to put it into cell 3. That cell is empty, so we put it there.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|  |  |  |53|  |55|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+

Suppose we next try to hash 68. We first try cell 8. That's full with 35. We next try cell 9. That's full with 49. We must then wrap around to the beginning (cell 0). That cell is empty, so we put it there.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|68|  |  |53|  |55|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+

Suppose we next try to hash 70. We first try cell 0. That's full with 68. We next try 1. That's empty, so we put it there.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|68|70|  |53|  |55|25|17|35|49|
+--+--+--+--+--+--+--+--+--+--+

You can watch this construction live (but slightly less readably) by making a copy of LinearProbeHashTable and TestLPHT1, updating them for your package, compiling them, and then executing the command

ji username.exam3.TestLPHT1 55 17 25 35 49 53 68 70
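
The insertion steps traced above can be sketched roughly as follows. This is an illustration under stated assumptions, not the LinearProbeHashTable class mentioned above: the name LinearProbePutDemo is hypothetical, the table stores only distinct integer keys (no values), and null marks an empty cell.

```java
import java.util.Arrays;

// A sketch only: LinearProbePutDemo is a hypothetical name and is not the
// LinearProbeHashTable class distributed with the exam.
class LinearProbePutDemo {
  /** The cells of the table; null marks an empty cell. */
  final Integer[] cells;

  LinearProbePutDemo(int capacity) {
    this.cells = new Integer[capacity];
  }

  /** The hash function from the example: the last digit when capacity is 10. */
  int hash(int key) {
    return Math.floorMod(key, cells.length);
  }

  /**
   * Put key into the first empty cell at or after its hash position,
   * wrapping around to cell 0 after the last cell.
   * Returns the index used, or -1 if every cell is full.
   */
  int put(int key) {
    int start = hash(key);
    for (int step = 0; step < cells.length; step++) {
      int pos = (start + step) % cells.length;
      if (cells[pos] == null) {
        cells[pos] = key;
        return pos;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    LinearProbePutDemo table = new LinearProbePutDemo(10);
    for (int key : new int[] { 55, 17, 25, 35, 49, 53, 68, 70 }) {
      table.put(key);
    }
    System.out.println(Arrays.toString(table.cells));
  }
}
```

Running main should print [68, 70, null, 53, null, 55, 25, 17, 35, 49], which matches the last table above.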

How do we find something in a linear-probe hash table? We start looking at the cell corresponding to the modified hash value and then step through neighboring cells until (a) we find the key, in which case we're done, or (b) we hit an empty space, in which case we report that the key/value pair is not in the hash table.

For example, to find 35, we first look in cell 5. 35 is not there, so we try cell 6. 35 is not there, so we try cell 7. 35 is not there, so we try cell 8. 35 is there, so we're done.

Similarly, to find 100, we first look in cell 0. 100 is not there, so we try cell 1. 100 is not there, so we try cell 2. Cell 2 is empty, so we note that the key of 100 is not in the hash table.

To find 99, we first look in cell 9. 99 is not there, so we wrap around and try cell 0. 99 is not there, so we try cell 1. 99 is not there, so we try cell 2. Cell 2 is empty, so we fail.
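
The search procedure described above can be sketched as follows. Again, this is only an illustration: LinearProbeFindDemo is a hypothetical name, the table holds bare integer keys with null for an empty cell, and the array is simply initialized to the final table from the insertion example.

```java
// A sketch only: LinearProbeFindDemo is a hypothetical name and is not the
// LinearProbeHashTable class distributed with the exam.
class LinearProbeFindDemo {
  /** The table from the example after all eight insertions; null is empty. */
  static final Integer[] CELLS = { 68, 70, null, 53, null, 55, 25, 17, 35, 49 };

  /**
   * Return the index of the cell that holds key, or -1 if the probe
   * reaches an empty cell (or has examined every cell) without finding it.
   */
  static int find(Integer[] cells, int key) {
    int start = Math.floorMod(key, cells.length);
    for (int step = 0; step < cells.length; step++) {
      int pos = (start + step) % cells.length;
      if (cells[pos] == null) {
        return -1;  // (b) we hit an empty space, so the key is not in the table
      }
      if (cells[pos] == key) {
        return pos; // (a) we found the key
      }
    }
    return -1;      // every cell was full and none held the key
  }

  public static void main(String[] args) {
    System.out.println("35 -> " + find(CELLS, 35));
    System.out.println("100 -> " + find(CELLS, 100));
    System.out.println("99 -> " + find(CELLS, 99));
  }
}
```

With the table above, the search for 35 should stop at cell 8, and the searches for 100 and 99 should both give up at the empty cell 2 and return -1.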

Now that you understand the basics of linear-probe hash tables, it's time for you to consider a key issue: removing elements. Removal is not a trivial task (which is one of the reasons many people don't like to implement linear-probe hash tables). Consider what happens if we decide to remove 25 from the table above.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|68|70|  |53|  |55|**|17|35|49|
+--+--+--+--+--+--+--+--+--+--+

We can't leave cell 6 empty, because that would make a search for 35 fail, even though it's in the table. Hence, we have to shift the 35 into the space left by the 25. (We should not shift the 17, since it's in the right place.)

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|68|70|  |53|  |55|35|17|**|49|
+--+--+--+--+--+--+--+--+--+--+

But we can't leave cell 8 empty, because we won't find the 68. Hence, we have to shift it into that empty space. (We can't shift the 49, because it's in the right place.)

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|**|70|  |53|  |55|35|17|68|49|
+--+--+--+--+--+--+--+--+--+--+

Of course, we can't leave cell 0 empty, because then we would fail to find the 70. We shift that 70.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|70|**|  |53|  |55|35|17|68|49|
+--+--+--+--+--+--+--+--+--+--+

That removal was safe, so we're done.

  0  1  2  3  4  5  6  7  8  9  
+--+--+--+--+--+--+--+--+--+--+
|70|  |  |53|  |55|35|17|68|49|
+--+--+--+--+--+--+--+--+--+--+

Describe in detailed pseudocode how one removes a key/value pair from a linear-probe hash table. You may also choose to implement the remove operation in Java, but certainly need not do so.

Warning: It is not acceptable to say "re-hash everything in the hash table" or a variant thereof.

Warning: You may not change the behavior of insert or find as you implement remove. In particular, you may not put a placeholder there that find says is not a match and insert says is empty.

Some Questions and Answers

These are some of the questions students have asked about the exam and my answers to those questions.

For part 2b, I'm not sure exactly what you're looking for. Could you be more specific?
I want to see you implement sets. The underlying structure should be a tree structured so that smaller values are to the left of a node and larger values are to the right.

Can we copy and modify your code?
Yes, provided you follow guidelines for correct citation.

Where will I find that code?
In the dictionary code in The TAO of Java.

Also, for 3c, when you say "this strategy" do you mean the speed of a) the overall sorting of the list, b) the get and/or put methods individually, c) the swapUp and/or swapDown methods themselves, or d) something else I didn't think of? I know I need to count some steps, but I'm not sure where it would be useful to put a counter to get the most accurate evidence.
I want to know if the sorting method is really O(n log2(n)) in practice.

Is there an elegant way to get the depth of a tree?
If you're getting depth simply for analysis, the easiest way to do it is with a recursive traversal of the tree. If you regularly need the depth, then you should have a depth field that you update whenever you insert or remove an element.
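
A recursive traversal of that kind might look like the following minimal sketch. The names TreeDepthDemo, Node, and depth are hypothetical and are not taken from the dictionary code. The sketch also assumes one particular convention: an empty tree has depth 0 and a single node has depth 1.

```java
// A sketch only: TreeDepthDemo, Node, and depth are hypothetical names,
// not part of the exam code.
class TreeDepthDemo {
  /** A bare-bones binary tree node. */
  static class Node<T> {
    T value;
    Node<T> left;
    Node<T> right;

    Node(T value, Node<T> left, Node<T> right) {
      this.value = value;
      this.left = left;
      this.right = right;
    }
  }

  /** Depth by recursive traversal: 0 for an empty tree, 1 for a single node. */
  static <T> int depth(Node<T> node) {
    if (node == null) {
      return 0;
    }
    return 1 + Math.max(depth(node.left), depth(node.right));
  }

  public static void main(String[] args) {
    // A small search tree: 5 at the root, 3 to the left, 8 to the right,
    // and 9 to the right of 8.
    Node<Integer> nine = new Node<Integer>(9, null, null);
    Node<Integer> eight = new Node<Integer>(8, null, nine);
    Node<Integer> three = new Node<Integer>(3, null, null);
    Node<Integer> root = new Node<Integer>(5, three, eight);
    System.out.println(depth(root));
  }
}
```

The traversal visits every node once, so it takes time proportional to the size of the tree. That cost is acceptable for occasional analysis, which is why a stored depth field makes more sense when the depth is needed regularly.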

Sample Recurrence Relations

There was a request for some examples of the solution of recurrence relations. Here are a few.

i.

Repeatedly expand

Look for a pattern

Let k = n

We know that the middle sum is in O(n^2) and that dominates the initial n and the constant d, so

ii.

Repeatedly expand

Look for a pattern

Let k = log2(n)

iii.

Repeatedly expand

Look for a pattern

Let k = log2(n)

Therefore

iv.

Repeatedly expand

Look for a pattern

Let k = n

Therefore

v.

Repeatedly expand

Look for a pattern

Let k = log2(n)

Therefore

Errors

Here you will find errors of spelling, grammar, and design that students have noted. Remember, each error found corresponds to one point of extra credit for everyone. I limit such extra credit to five points. After the first five points, each five errors correspond to one additional point of extra credit.

Six points of extra credit.

History

Tuesday, 22 November 2005 [Samuel A. Rebelsky]

Wednesday, 23 November 2005 [Samuel A. Rebelsky]

Monday, 28 November 2005 [Samuel A. Rebelsky]

Wednesday, 30 November 2005 [Samuel A. Rebelsky]

Thursday, 1 December 2005 [Samuel A. Rebelsky]

Friday, 2 December 2005 [Samuel A. Rebelsky]

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Tue Dec 6 09:46:52 2005.
The source to the document was last modified on Fri Dec 2 08:18:49 2005.


Samuel A. Rebelsky, rebelsky@grinnell.edu