Fundamentals of Computer Science 1 (CS151 2003S)
[Skip to Body]
Primary:
[Front Door]
[Current]
[Glance]

[EC]
[Honesty]
[Instructions]
[Links]
[Search]
[Syllabus]
Groupings:
[EBoards]
[Examples]
[Exams]
[Handouts]
[Homework]
[Labs]
[Lab Writeups]
[Outlines]
[Project]
[Readings]
[Reference]
ECA:
[About]
[Grades]
[Quizzes]
[Submit Work]
[Change Password]
[Reset Password]
Misc:
[Scheme Reference]
[Scheme Report]
[CS151 2003S Gum]
[CS151 2002F]
[CS151 History]
[SamR]
Summary: We consider a typical problem of computing and a variety of algorithms used to solve that problem.
Contents:
To search a data structure is to examine its elements singly
until one has either found an element that has a desired property or
concluded that the data structure contains no such element. For instance,
one might search a vector of integers for an even element, or a
vector of pairs for a pair having the string "elephant"
as
its cdr.
In a linear data structure  such as a flat list, a vector, or a file  there is an obvious algorithm for conducting a search: Start at the beginning of the data structure and traverse it, testing each element. Eventually one will either find an element that has the desired property or reach the end of the structure without finding such an element, thus conclusively proving that there is no such element. Here are a few versions of the algorithm.
;;; Procedure: ;;; sequentialsearchlist ;;; Parameters: ;;; pred?, predicate ;;; lst, a list ;;; Purpose: ;;; Searches the list for a value that matches the predicate. ;;; Produces: ;;; match, a value ;;; Preconditions: ;;; pred? can be applied to all values in lst. ;;; Postconditions: ;;; If lst contains an element for which pred? holds, match ;;; is one such value. ;;; If lst contains no elements for which pred? holds, match ;;; is false (#f). (define sequentialsearchlist (lambda (pred? lst) (cond ; If the list is empty, no values match the predicate. ((null? lst) #f) ; If the predicate holds on the first value, use that one. ((pred? (car lst)) (car lst)) ; Otherwise, look at the rest of the list (else (sequentialsearchlist pred? (cdr lst)))))) ;;; Procedure: ;;; sequentialsearchvector ;;; Parameters: ;;; pred?, predicate ;;; vec, a vector ;;; Purpose: ;;; Searches the vector for a value that matches the predicate. ;;; Produces: ;;; match, a value ;;; Preconditions: ;;; pred? can be applied to all elements of vec. ;;; Postconditions: ;;; If vec contains an element for which pred? holds, match ;;; is the index of one such value. That is, ;;; (pred? (vectorref vec match)) holds. ;;; If vec contains no elements for which pred? holds, match ;;; is false (#f). (define sequentialsearchvector (lambda (pred? vec) ; Grab the length of the vector so that we don't have to ; keep recomputing it. (let ((len (vectorlength vec))) ; Helper: Keeps track of the position we're looking at. (let kernel ((position 0)) ; Start at position 0 (cond ; If we've run out of elements, give up. ((= position len) #f) ; If the current element matches, use it. ((pred? (vectorref vec position)) position) ; Otherwise, look in the rest of the vector. (else (kernel (+ position 1)))))))) > (define sample (vector 1 3 5 7 8 11 13)) > (sequentialsearchvector even? sample) 4 > (sequentialsearchvector (rightsection = 12) sample) #f
These search procedures return #f
if the search is
unsuccessful. The first returns the matched value if the search is
successful. The second returns returns the position in the specified
vector at which the desired element can be found. There are many variants
of this idea: One might, for instance, prefer to signal an error or display
a diagnostic message if a search is unsuccessful. In the case of a
successful search, one might simply return #t
(if all that is
needed is an indication of whether an element having the desired property
is present in or absent from the list).
One of the most common realworld
searching problems is that of
searching a collection compound values for one which matches a particular
portion of the value, known as the key. For example, we might
search a phone book for a phone number using a person's name as the key
or we might search a phone book for a person using the number as key.
As you've probably noted, association lists implement this kind of
searching if we use the first value of a list as the key for that list.
Of course, it is also possible to make a getkey
procedure
a parameter to the search procedure.
;;; Procedure: ;;; searchlistforkeyedvalue ;;; Parameters: ;;; key, a key to search for. ;;; values, a list of compound values. ;;; getkey, a procedure that extracts a key from a compound value. ;;; Purpose: ;;; Finds a member of the list that has a matching key. ;;; Produces: ;;; A matching value, if found. ;;; #f, otherwise. ;;; Preconditions: ;;; The getkey procedure can be applied to each element of values. ;;; Postconditions: ;;; If the procedure returns #f, there is no value for which ;;; (equal? key (getkey val)) ;;; holds. Otherwise, returns some value for which that holds. (define searchlistforkeyedvalue (lambda (key values getkey) (sequentialsearchlist (lambda (val) (equal? key (getkey val))) values)))
The linear search algorithms just described can be quite slow if the data
structure to be searched is large. If one has a number of searches to
carry out in the same data structure, it is often more efficient to
preprocess
the values, sorting them and transferring them to a vector,
before starting those searches. The reason is that one can then use the
much faster binary search algorithm.
Binary search is a more specialized algorithm than linear search. It requires a randomaccess structure, such as a vector, as opposed to one that offers only sequential access, such as a list. Binary search is limited to the kind of test in which one is looking for a particular value that has a unique relative position in some ordering. For instance, one could use a binary search to look for an element equal to 12 in a vector of integers, since 12 is uniquely located between integers less than 12 and integers greater than 12; but one wouldn't use binary search to look for an even integer, since the even integers don't have a unique position in any natural ordering of the integers.
The idea in a binary search is to divide the sorted vector into two approximately equal parts, examining the element at the point of division to determine which of the parts must contain the value sought. Actually, there are usually three possibilities:
(1) The element at the point of division cannot precede the value sought in the ordering that was used to sort the vector. In this case, the value sought must be in a position with a lower index that the element at the point of division (if it is present at all)  in other words, it must be in the left half of the vector. The search procedure invokes itself recursively to search just the left half of the vector.
(2) The value sought cannot precede the element at the point of division. In this case, the value sought must be in a higherindexed position  in the right half of the vector  if it is present at all. The search procedure invokes itself recursively to search just the right half of the vector.
(3) The value sought is the element at the point of division. The search has succeeded.
There is one other way in which the recursion can terminate: If, in some recursive call, the subvector to be searched (which will be half of a half of a half of ... of the original vector) contains no elements at all, then the search obviously cannot succeed and the procedure should take the appropriate failure action.
Here, then, is the basic binarysearch algorithm. The identifiers
lowerbound
and upperbound
denote the starting
and ending positions of the part of the vector within which the value
sought must lie, if it is present at all. (We use the convention that
the starting and ending positions are inclusive in that they
are positions within the vector that we must include in the search.)
;;; Procedure: ;;; binarysearch ;;; Parameters: ;;; key, a key we're looking for ;;; vec, a vector to search ;;; getkey, a procedure of one parameter that, given a data item, ;;; returns the key of a data item. ;;; lessthan?, a binary predicate that tells us whether or not ;;; one key is lessthan another. ;;; Produces: ;;; match, a number. ;;; Preconditions: ;;; The vector is "sorted". That is, ;;; (lessthan? (getkey (vectorref vec i)) ;;; (getkey (vectorref vec (+ i 1)))) ;;; holds for all reasonable i. ;;; The lessthan? procedure can be applied to all pairs of keys ;;; in the vector (and to the supplied key) ;;; The lessthan? procedure is transitive. That is, if ;;; (lessthan? a b) and (lessthan? b c) then it must ;;; be that (lessthan? a c). ;;; The lessthan? procedure is inclusive. If a is not lessthan b ;;; and b is not lessthan a, then a equals b. ;;; Postconditions: ;;; If vector contains no element whose key matches key, match is 1. ;;; If vec contains an element whose key equals key, match is the ;;; index of one such value. That is, key is ;;; (getkey (vectorref vec match)) (define binarysearch (lambda (key vec getkey lessequal?) ; Search a portion of the vector from lowerbound to upperbound (let searchportion ((lowerbound 0) (upperbound ( (vectorlength vec) 1))) (if (<= lowerbound upperbound) (let* ((midpoint (quotient (+ lowerbound upperbound) 2)) (middleelement (vectorref vec midpoint)) (middlekey (getkey middleelement))) (cond ((lessequal? key middlekey) (searchportion lowerbound ( midpoint 1))) ((lessequal? middlekey key) (searchportion (+ midpoint 1) upperbound)) (else midpoint))) 1))))
http://www.cs.grinnell.edu/~rebelsky/Courses/CS151/History/Readings/searching.html
.
[Skip to Body]
Primary:
[Front Door]
[Current]
[Glance]

[EC]
[Honesty]
[Instructions]
[Links]
[Search]
[Syllabus]
Groupings:
[EBoards]
[Examples]
[Exams]
[Handouts]
[Homework]
[Labs]
[Lab Writeups]
[Outlines]
[Project]
[Readings]
[Reference]
ECA:
[About]
[Grades]
[Quizzes]
[Submit Work]
[Change Password]
[Reset Password]
Misc:
[Scheme Reference]
[Scheme Report]
[CS151 2003S Gum]
[CS151 2002F]
[CS151 History]
[SamR]
Disclaimer:
I usually create these pages on the fly
, which means that I rarely
proofread them and they may contain bad grammar and incorrect details.
It also means that I tend to update them regularly (see the history for
more details). Feel free to contact me with any suggestions for changes.
This document was generated by
Siteweaver on Tue May 6 09:31:08 2003.
The source to the document was last modified on Tue Apr 22 14:34:49 2003.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS151/2003S/Readings/searching.html
.
You may wish to validate this document's HTML ; ; Check with Bobby
Samuel A. Rebelsky, rebelsky@grinnell.edu