Fund. CS II (CS152 2006S)

Homework 16: Approximate String Matching

Assigned: Tuesday, April 25, 2006
Due: 8:00 a.m., Wednesday, April 26, 2006

Summary: In this assignment, you will use dynamic programming to improve the running time of an approximate string matching algorithm.

Purposes: To give you more experience using dynamic programming. To introduce you to the most common approximate string matching algorithm.

Contents:

Background: Approximate String Matching

As you may recall from our previous work in this class, the problem of determining how close a string is to another string occurs in a number of situations. For example, when checking the spelling of a word that appears to misspelled, you often want to find to find the closest replacement.

In a previous examination, we developed an algorithm that finds the cost of changing one string, source, to another string, target. Here is a restatement of that algorithm, generalized to include substrings.

The cost of changing characters s to source.length-1 of source to characters t to target.length-1 of target is
* The cost of inserting appropriate characters into source, if no characters remain in source.
* The cost of deleting appropriate characters from source, if no characters remain in target.
* The cost of converting the rest of source to the rest of target, if the current character in each is equal.
* The minimum of 
  + The cost of deleting the current character from source plus the cost of converting the remainder of source to the selected portion of target
  + The cost of inserting a character into source plus the cost of converting the selected portion of source to the remainder of target.

I've implmented that algorithm in StringUtils.java.

Preparation

Code Files:

Create a new package for this laboratory entitled username.dynamic.

Copy those code files into the package and update the package for each.

The Assignment

You will note that the ec method is configured to accept a cache as a parameter, but that it does not use that cache.

a. Update it to use the cache.

b. Determine the effect using a cache has on the running time.

Submitting Your Work

Email me your answer. Make sure that the answer is in the body of the email and is not an attachment.

 

History

March 2006 [Samuel A. Rebelsky]

Friday, 21 April 2006 [Samuel A. Rebelsky]

Tuesday, 25 April 2006 [Samuel A. Rebelsky]

 

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Tue May 9 08:31:14 2006.
The source to the document was last modified on Wed Apr 26 09:13:26 2006.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS152/2006S/Homework/hw.16.html.

You may wish to validate this document's HTML ; Valid CSS! ; Check with Bobby

Samuel A. Rebelsky, rebelsky@grinnell.edu