Exam 2: C Tools and Techniques

Assigned: Wednesday, 27 October 2010

Due: 11 p.m., Wednesday, 3 November 2010

Preliminaries

Exam Format

This is a take-home examination. You may use any time or times you deem appropriate to complete the exam, provided you return it to me by the due date.

There are five problems on this examination. They are not necessarily of equal difficulty. Each problem may include subproblems. Five correct or mostly-correct solutions will earn you an A. Four correct or mostly-correct solutions will earn you a B. Three correct or mostly-correct solutions will earn you a C. Two correct or mostly-correct solutions will earn you a D. One or Zero correct or mostly-correct solutions will earn you an F. Failure to attempt the exam will earn you a 0. Partially-correct solutions may or may not earn you a partial grade, at the discretion of the grader.

Read the entire exam before you begin.

I expect that someone who has mastered the material and works at a moderate rate should have little trouble completing the exam in a reasonable amount of time. In particular, this exam is likely to take you about five hours, depending on how well you've learned the topics and how fast you work. You should not work more than eight hours on this exam. Eight hours of work, documented attempts on all five problems, and a signed statement that There's more to life than computer science will earn you at least a C on this exam.

I would also appreciate it if you would write down the amount of time each problem takes.

Academic Honesty

This examination is open book, open notes, open mind, open computer, open Web. However, it is closed person. That means you should not talk to other people about the exam. Other than as restricted by that limitation, you should feel free to use all reasonable resources available to you.

As always, you are expected to turn in your own work. If you find ideas in a book or on the Web, be sure to cite them appropriately. If you use code that you wrote for a previous lab or homework, cite that lab or homework and the other members of your group. If you use code that you found on the course Web site, be sure to cite that code. You need not cite the code provided in the body of the examination.

Although you may use the Web for this exam, you may not post your answers to this examination on the Web. And, in case it's not clear, you may not ask others (in person, via email, via IM, by posting a please help message, or in any other way) to put answers on the Web.

Because different students may be taking the exam at different times, you are not permitted to discuss the exam with anyone until after I have returned it. If you must say something about the exam, you are allowed to say This is among the hardest exams I have ever taken. If you don't start it early, you will have no chance of finishing the exam. You may also summarize these policies. You may not tell other students which problems you've finished. You may not tell other students how long you've spent on the exam.

You must include both of the following statements on the cover sheet of the examination.

I have neither received nor given inappropriate assistance on this examination.
I am not aware of any other students who have given or received inappropriate assistance on this examination.

Please sign and date each statement. Note that the statements must be true; if you are unable to sign either statement, please talk to me at your earliest convenience. You need not reveal the particulars of the dishonesty, simply that it happened. Note also that inappropriate assistance is assistance from (or to) anyone other than Professor Rebelsky (that's me).

Presenting Your Work

You must present your exam to me in two forms: both physically and electronically. That is, you must write all of your answers using the computer, print them out, number the pages, put your name on the top of every page, and hand me the printed copy. You must also email me a copy of your exam. You should create the emailed version by copying the various parts of your exam and pasting them into an email message. In both cases, you should put your answers in the same order as the problems. Failure to name and number the printed pages will lead to a penalty. Failure to turn in both versions may lead to a much worse penalty. Failure to turn in a legible version of the exam is also likely to lead to a penalty.

In many problems, I ask you to write code. Unless I specify otherwise in a problem, you should write working code and include examples that show that you've tested the code.

Unless I explicitly tell you not to document a procedure, you should assume that you must document every procedure you write using six-P style documentation. You need not document main or procedures whose primary purpose is to run unit tests.

Just as you should be careful and precise when you write code and documentation, so should you be careful and precise when you write prose. Please check your spelling and grammar. Since I should be equally careful, the whole class will receive one point of extra credit for each error in spelling or grammar you identify on this exam. I will limit that form of extra credit to five points.

I may give partial credit for partially correct answers. I am best able to give such partial credit if you include a clear set of work that shows how you derived your answer. You ensure the best possible grade for yourself by clearly indicating what part of your answer is work and what part is your final answer.

Getting Help

I may not be available at the time you take the exam. If you feel that a question is badly worded or impossible to answer, note the problem you have observed and attempt to reword the question in such a way that it is answerable. If it's a reasonable hour (before 10 p.m. and after 8 a.m.), feel free to try to call me in the office (269-4410) or at home (236-7445).

I will also reserve time at the start of each class before the exam is due to discuss any general questions you have on the exam.

Problems

Problem 01: Reverse Polish Notation, Revisited

Topics: Reverse Polish Notation, Macros, Robustness

Consider rpn.c, a sample solution to a problem from the previous exam. As you may note, there are a lot of cases that look quite similar. In particular, there are five sections of code that look almost exactly like the following.

        case '+':
          if (top <= 1)
            {
              printf ("ERROR: Stack underflow.  Skipping operation.\n");
              break;
            }
          --top;
          stack[top-1] = stack[top-1] + stack[top];
          break;

Those sections of code differ only in the operation used in the penultimate line. (That is, whether we add, subtract, multiply, divide, or mod the top two values on the stack.) Clearly, it is inappropriate to have that much duplicated code.

What should we do? In Scheme, we'd write a higher-order procedure that took the operation as a parameter. But that's not really an option in C. As we considered in our discussion of macros, we can use a macro for a case like this.

Write a macro, BINARY_STACK_OPERATION(op) that we can use to handle the repetitive code. Once that macro is available, we should be able to write the cases as

        case '+':
          BINARY_STACK_OPERATION(+)
          break;
        case '-':
          BINARY_STACK_OPERATION(-)
          break;
        case '*':
          BINARY_STACK_OPERATION(*)
          break;
        case '/':
          BINARY_STACK_OPERATION(/)
          break;
        case '%':
          BINARY_STACK_OPERATION(%)
          break;

Problem 02: In-Place Unit Tests

Topics: Unit Testing, C Preprocessor, Makefiles

In our unit testing work to date, we've put the code for the library function in one file and the code for the unit test in another file. However, we might combine them into a single file. For example, here is a file that contains both swap and a too-simplistic unit test for swap.

/**
 * swap.c - the standard C swap routine.
 */

#include <stdio.h>
#include <stdlib.h>

void
swap (int *ap, int *bp)
{
  int temp = *ap;
  *ap = *bp;
  *bp = temp;
} // swap

int errors;

void 
utest_swap ()
{
  int x = 1;
  int y = 2;

  swap (&x, &y);
  if ((x != 2) || (y != 1))
    {
      ++errors;
      printf ("Failed to correctly swap x and y.\n");
    }
 
  x = 0;
  swap (&x, &x);
  if (x != 0)
    {
      ++errors;
      printf ("Failed to correctly swap x with itself.\n");
    }
} // utest_swap

int
main ()
{
  utest_swap ();
  if (errors)
    printf ("%d errors encountered.\n", errors);

  return errors;
} // main ()

It is helpful to have the unit testing code in the same file as the program we're testing because (a) it reminds us what the tests are and (b) we have less of a proliferation of files.

Unfortunately, the C compiler doesn't like us to put a main in every file when we link them together. (It complains about multiple main routines, or something similar.) What's the solution? Conditional compilation. We put the main routine in an #ifdef block (using something sensible like UNIT_TEST) and, when we want to compile the executable test, we add a flag of -DUNIT_TEST.

a. Make this extension to the sample code. (That is, add the #ifdef.)

b. Write a Makefile that builds the executable swap.utest from swap.c using the strategy described above.

c. Here's the main function for a simple program, order3.c, that prints three numbers entered at the command line in order.

int
main (int argc, char *argv[])
{
  int a, b, c;
  if (argc != 4)
    {
      printf ("Usage: order n1 n2 n3\n");
      return EXIT_FAILURE;
    }
  a = atoi (argv[1]);
  b = atoi (argv[2]);
  c = atoi (argv[3]);

  if (a > b)
    swap (&a, &b);
  if (b > c)
    swap (&b, &c);
  if (a > b)
    swap (&a, &b);

  printf ("%d %d %d\n", a, b, c);

  return EXIT_SUCCESS;
} // main

d. Extend your Makefile to build that program by linking swap.o and order3.o.

e. The technique we used in step b (build whatever.utest from whatever.c by compiling with the UNIT_TEST flag set) can be used in other situations. Write a general Make rule to express that policy.

Problem 03: Tools for Testing Sorting Algorithms

Topics: Testing, Recursion, Permutations, Sorting, Debugging

As many of you have noted, we can be confident that array b is a sorted version of array a if

b is in order from smallest to largest and
b is a permutation of a.

We might write functions to check each of these conditions.

/**
 * Procedure:
 *   is_in_order
 * Parameters:
 *   values[], an array of integers
 *   size, an integer
 * Purpose:
 *   Determine if the first size values in values are in order.
 * Produces:
 *   in_order, an integer that represents a Boolean
 * Preconditions:
 *   0 <= size < length of values
 * Postconditions:
 *   If values[i] <= values[i+1] for all reasonable i, then
 *     in_order is true (1).
 *   Otherwise, in_order is false (0).
 */
int
is_in_order (int values[], int size)
{
  int i;
  for (i = 1; i < size; i++)
    if (values[i] >= values[i+1])
      return 0;
  return 1;
} // is_in_order

/**
 * Procedure:
 *   is_permutation
 * Parameters:
 *   a1, an array of integers
 *   a2, an array of integers
 *   size, an integer
 * Purpose:
 *   Determine if a2 is a permutation of a1.
 * Produces:
 *   is_perm, an integer that represents a Boolean
 * Preconditions:
 *   0 <= size 
 *   size <= length of a1
 *   size <= length of a2
 */
int
is_permutation (int a1[], int a2[], int size)
{
  int i;
  for (i = 1; i <= size; i++)
    {
      if (index_of (a1, size, a2[i]) == -1)
        return 0;
    } // for
  return 1;
} // is_permutation

For is_permutation we also need index_of.

/**
 * Procedure:
 *   index_of
 * Parameters:
 *   a, an array of integers
 *   limit, an integer
 *   value, an integer
 * Purpose:
 *   Finds an index of i in a.
 * Produces:
 *   index, an integer
 * Preconditions:
 *   0 <= limit <= length(a)
 * Postconditions:
 *   a has not been modified
 *   If there exists a j in [0..limit) s.t. a[j] == value, then
 *     index >= 0 and a[index] == value
 *   If no such j exists, then
 *     index == -1
 */
int
index_of (int a[], int limit, int value)
{
  int i;
  for (i = 0; i < limit; i++)
    if (a[i] == value)
      return i;
  return -1;
} // index_of

However, if there are errors in index_of, is_in_order, or is_permutation, then our tests of the sorting routine will not be reliable. How do we determine if there are errors? We write unit tests. What do we do if the unit tests fail? We use the debugger to try to figure out why they've failed.

a. Write unit tests for the three functions.

b. If your unit tests revealed errors, correct those errors.

Problem 04: Analyzing Shell Sort

Topics: Annotating Code, Shell Sort, Sorting

As you may recall, in our first exploration of DDD, we debugged an implementation of Shell Sort. A slightly modified version of that implementation follows:

void 
shell_sort (int a[], int size)
{
  int i, j;
  int h = 1;

  do 
    {
      h = h * 3 + 1;
    } 
  while (h <= size);

  do 
    {
      h /= 3;
      for (i = h; i < size; i++)
        {
          int v = a[i];
          for (j = i; j >= h && a[j - h] > v; j -= h)
            a[j] = a[j - h];
          if (i != j)
            a[j] = v;
        } // for
    } 
  while (h != 1);
} // shell_sort

As you may recall, when we study algorithms, one of the questions we ask ourselves is How much time does this algorithm take on a particular size input? As you will soon discover, one way we assess sorting algorithms is to count (a) the number of comparisons that involve elements of the array and (b) the number of assignments to elements of the array.

a. Add two counter variables, one for comparison and one for assignment, that you update each time one such operation is done.

b. Write a program that experimentally analyzes this version of Shell Sort. That is, for a variety of sizes (say 16, 32, 64, 128, and 256) you should generate a variety of arrays, sort them using Shell Sort, and determine the minimum, maximum, and average number of comparisons and assignments Shell Sort does for that size array.

Problem 05: Reordering Arrays

Topics: Sorting, Swapping, Array Operations

If you've read about Shell Sort, you've learned that Shell sort achieves better performance by doing a high level pass over the data, trying to move larger values to the right and smaller values to the left.

Let's look at a variant of that problem. Given an array of integers and a value that may represent a median or mean or some such, we want to reorganize the array so that all the values larger than that value fall at the right end of the array and all the values less than or equal to the value fall at the left of the array.

For example, if we had the array { 7, 2, 1, 9, 5, 4 }, one possible way to reorder the array using a splitter of 5 would be { 4, 2, 1, 5, 9, 7 }, with the first four values less than or equal to 5 and the remaining two values greater than 5. (This is one of many ways to reorder the array. Another is { 2, 1, 5, 4, 7, 0 }. But we'll always have four values less than or equal to 5.)

In Six-P style, we might write:

/**
 * Procedure:
 *   reorder
 * Parameters:
 *   a, an array
 *   size, an integer
 *   splitter, an integer
 * Purpose:
 *   Rearrange a so that all the values less than or equal to splitter
 *   fall at the left side of the array and all of the values greater than
 *   splitter fall at the right side of the array.
 * Produces:
 *   border, an integer
 * Preconditions:
 *   0 < size <= length of a
 * Postconditions:
 *   Values have neither been added to no removed from a.  (That is,
 *     the resulting array is a permutation of the original array.)
 *   There are border values in a that are <= splitter.
 *   0 <= border <= size.
 *   For 0 <= i < border, a[i] <= splitter.
 *   For border <= i <= size, a[i] > splitter.
 */

Implement reorder.

Note: The problem asks you to reorder the array. That means that you should not use additional arrays to help with your work. Do the rearrangement in place.

Hint: One strategy you might use is to have two indices, one that starts at the left end of the array and one that starts at the right end. You advance the left index over values that are less-than or equal to splitter. You retreat the right index over values that are greater than splitter. When you can't advance/retreat any more, you swap the values and start again.

Questions

Here you will find answers to questions of general interest. Please check here before emailing your questions!

Problem 1

On problem 1, it seems like in rpn.c you declare the variables "int i" and "char *negate", but I don't think that either of them are used. Am I missing something?: No. Those are remnants of an earlier version of the program.
How do we know if your solution is correct?: If you run the preprocessor with cc -E rpn.c, you should see similar code to the original. In addition, the program should work the same.

Problem 4

Do you want us to have the program do multiple tests and spit out the largest, or do you want us to run the program multiple times on different arrays and then report what the largest value we found was?: It seems pretty stupid to have you do work that you can make a program do for you. So, I want the program to run multiple trials.
Can we put the main method in the same file as Shell sort?: Yes
Do we have to make it all nice and elegant, with conditional compilation that determines whether or not the main method is actually compiled?: No

Problem 5

It seems to me that I can solve this problem by creating separate arrays. Is that okay?: No, that's an inefficient use of space. Your goal is to rearrange the current array without relying on additional arrays.

Miscellaneous

Regarding the first error mentioned below. Do you actually run the code you give us?: Um, no. Not usually.

Errata

Here you will find errors of spelling, grammar, and design that students have noted. Remember, each error found corresponds to a point of extra credit for everyone. We usually limit such extra credit to five points. However, if I make an astoundingly large number of errors, then I will provide more extra credit.

In problem 2c, there's a semicolon missing after the first printf statement. [SD, 1 point]
Questionable use of semicolon in text of problem 3. [SD, 1 point]
Inconsistent capitalization in section titles. [BB, 1 point]
Incorrect punctuation in We write unit tests. [SD, 1 point]
Extraneous variables in rpn.c. [MD, 1 point]

History

Tuesday, 26 October 2010 [Samuel A. Rebelsky]

Created.

Friday, 29 October 2010 [Samuel A. Rebelsky]

Corrected two errors.

Monday, 1 November 2010 [Samuel A. Rebelsky]

Corrected another error or two.

Tuesday, 2 November 2010 [Samuel A. Rebelsky]

Removed extraneous variables from rpn.c.
Added a few more questions and answers.