Fundamentals of Computer Science 1 (CS151 2003S)

Strings

Summary: A string is a sequence of characters. Unlike symbols, which are atomic, strings can be separated into constituent parts.

String Basics

A string is a sequence of zero or more characters. Most strings can be named by enclosing the characters they contain between plain double quotation marks, to produce a string literal: for instance, "hyperbola" is the nine-character string consisting of the characters #\h, #\y, #\p, #\e, #\r, #\b, #\o, #\l, and #\a, in that order, and "" is the zero-character string (the null string).

String literals may contain spaces and newline characters; when such characters are between double quotation marks, they are treated like any other characters in the string. There is a slight problem when one wants to put a double quotation mark into a string literal: To indicate that the double quotation mark is part of the string (rather than marking the end of the string), one must place a backslash character immediately in front of it. For instance, "Say \"hi\"" is the eight-character string consisting of the characters #\S, #\a, #\y, #\space, #\", #\h, #\i, and #\", in that order. The backslash before a double quotation mark in a string literal is an escape character, present only to indicate that the character immediately following it is part of the string.

This use of the backslash character causes yet another slight problem: What if one wants to put a backslash into a string? The solution is to place another backslash character immediately in front of it. For instance, "a\\b" is the three-character string consisting of the characters #\a, #\\, and #\b, in that order. The first backslash in the string literal is an escape, and the second is the character that it protects, the one that is part of the string.
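
The following sketch may help make the escape rules concrete. It relies on display and write, two standard Scheme output procedures that this reading does not otherwise cover; the names quoted and slashed are chosen here only for illustration.

```scheme
;; Two literals that use escape characters.
(define quoted "Say \"hi\"")   ; eight characters
(define slashed "a\\b")        ; three characters

;; display prints the characters that are actually in the string.
(display quoted)    ; prints: Say "hi"
(newline)
(display slashed)   ; prints: a\b
(newline)

;; write prints the string the way it would appear as a literal,
;; so the escape characters come back.
(write slashed)     ; prints: "a\\b"
(newline)
```

The backslashes appear only in the written form of the string; they are not counted among its characters.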

String Procedures

Scheme provides several basic procedures for working with strings:

The (string? val) predicate determines whether its argument is or is not a string.
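
A few sample calls, offered as a quick illustration rather than an exhaustive list, show which kinds of values string? accepts.

```scheme
(string? "hyperbola")   ; #t
(string? "")            ; #t, the null string is still a string
(string? 'hyperbola)    ; #f, a symbol is not a string
(string? #\h)           ; #f, a single character is not a string
(string? 42)            ; #f
```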

The (make-string count char) procedure constructs and returns a string that consists of repetitions of a single character. Its first argument indicates how long the string should be, and the second argument specifies which character it should be made of. For instance,

> (make-string 5 #\a)
"aaaaa"

constructs and returns the string "aaaaa".

The (string ch1 ... chn) procedure takes any number of characters as arguments and constructs and returns a string consisting of exactly those characters. For instance, (string #\H #\i #\!) constructs and returns the string "Hi!".
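
As a small sketch of how the two constructors relate, the calls below build strings both ways. The procedure name underline is hypothetical, introduced here only to show make-string with a computed length.

```scheme
(make-string 3 #\*)     ; "***"
(make-string 0 #\a)     ; "", a count of zero gives the null string
(string #\a)            ; "a", a one-character string
(string)                ; "", no characters gives the null string

;; Hypothetical helper: a row of dashes as long as the given title.
(define underline
  (lambda (title)
    (make-string (string-length title) #\-)))

(underline "Strings")   ; "-------"
```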

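The string->list procedure takes a string as argument and returns a list of the characters in that string, in order. The list->string procedure does the reverse: it takes a list of characters and returns a string made up of those characters.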
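
Converting to a list and back makes the list procedures available for work on strings. The sketch below uses that idea to reverse a string; the name reverse-string is hypothetical and is not one of the procedures covered in this reading.

```scheme
(string->list "Hi!")                 ; (#\H #\i #\!)
(list->string (list #\H #\i #\!))    ; "Hi!"

;; Hypothetical helper: reverse a string by way of a list.
(define reverse-string
  (lambda (str)
    (list->string (reverse (string->list str)))))

(reverse-string "stressed")          ; "desserts"
```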

The string-length procedure takes any string as argument and returns the number of characters in that string. For instance, the value of (string-length "parabola") is 8 and the value of (string-length "a\\b") is 3.

The string-ref procedure is used to select the character at a specified position within a string. Like list-ref, string-ref presupposes zero-based indexing; the position is specified by the number of characters that precede it in the string. (So the first character in the string is at position 0, the second at position 1, and so on.) For instance, the value of (string-ref "ellipse" 4) is #\p -- the character that follows four other characters and so is at position 4 in zero-based indexing.
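
Because indexing is zero-based, the last valid position in a string is one less than its length. The sketch below illustrates this with a hypothetical helper, last-char, which assumes its argument is not the null string.

```scheme
(string-length "parabola")    ; 8
(string-ref "parabola" 0)     ; #\p, the first character
(string-ref "parabola" 7)     ; #\a, the last character

;; Hypothetical helper: the final character of a non-empty string.
;; Position (length - 1) is the last one that string-ref accepts.
(define last-char
  (lambda (str)
    (string-ref str (- (string-length str) 1))))

(last-char "hyperbola")       ; #\a
```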

Strings can be compared for lexicographic order, the extension of alphabetical order that is derived from the collating sequence of the local character set. As with characters, Scheme provides both case-sensitive and case-insensitive versions of these predicates: string<?, string<=?, string=?, string>=?, and string>? are the case-sensitive versions, and string-ci<?, string-ci<=?, string-ci=?, string-ci>=?, and string-ci>? are the case-insensitive ones.
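
A few comparisons may help show the difference between the two families of predicates. The results for mixed-case comparisons depend on the character set; the comment on the "Zebra" example assumes ASCII, in which upper-case letters precede lower-case ones.

```scheme
(string<? "apple" "banana")       ; #t
(string<? "apple" "apples")       ; #t, a proper prefix comes first
(string=? "Scheme" "scheme")      ; #f, the case differs
(string-ci=? "Scheme" "scheme")   ; #t, the case is ignored

(string<? "Zebra" "apple")        ; #t under ASCII (an assumption)
(string-ci<? "Zebra" "apple")     ; #f, since z follows a
```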

The (substring str start end) procedure takes three arguments. The first is a string, and the second and third are non-negative integers that do not exceed the length of that string, with the second no greater than the third. It returns the part of its first argument that starts after the number of characters specified by the second argument and ends after the number of characters specified by the third argument. For instance, (substring "hypocycloid" 3 8) returns the substring "ocycl" -- the substring that starts after the initial "hyp" and ends after the eighth character, the l.
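
One way to read the two positions is that the result always has (end - start) characters. The sketch below shows a few boundary cases; drop-first is a hypothetical helper that assumes a non-empty string.

```scheme
(substring "hypocycloid" 0 4)     ; "hypo"
(substring "hypocycloid" 4 11)    ; "cycloid", 11 is the string's length
(substring "hypocycloid" 5 5)     ; "", equal positions give the null string

;; Hypothetical helper: everything except the first character.
(define drop-first
  (lambda (str)
    (substring str 1 (string-length str))))

(drop-first "hypocycloid")        ; "ypocycloid"
```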

The string-append procedure takes any number of strings as arguments and returns a string formed by concatenating those arguments. For instance, the value of (string-append "al" "fal" "fa") is "alfalfa".
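
Since string-append accepts any number of arguments, it is convenient for assembling a string from fixed and variable pieces. The greet procedure below is a hypothetical example of that pattern.

```scheme
(string-append)                   ; "", no arguments gives the null string
(string-append "hyper" "bola")    ; "hyperbola"

;; Hypothetical helper: build a greeting around a name.
(define greet
  (lambda (name)
    (string-append "Hello, " name "!")))

(greet "Sam")                     ; "Hello, Sam!"
```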

History

8 October 1997 [John Stone]

2 October 2000 [Sam Rebelsky]

4 February 2001 [Samuel A. Rebelsky]

7 February 2001 [Samuel A. Rebelsky]

Monday, 26 February 2001 [Samuel A. Rebelsky]

Sunday, 15 September 2002 [Samuel A. Rebelsky]

Monday, 16 September 2002 [Samuel A. Rebelsky]

Tuesday, 4 February 2003 [Samuel A. Rebelsky]

 

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Tue May 6 09:31:12 2003.
The source to the document was last modified on Tue Feb 4 23:08:48 2003.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS151/2003S/Readings/strings.html.

Samuel A. Rebelsky, rebelsky@grinnell.edu