Toward the end of the reading on hash tables, we explored ways to deal with tables with multiple columns. We considered two approaches, one in which we used a different table for each column, which seemed inefficient, and one in which we stored a list of phone number, user name, and address in the table, using the first name as the key.
Let’s apply this same idea to a different domain. Suppose we want to keep track of a collection of Web pages. What information might we want to store? Here’s a possible list.
If we chose to store that information in a list, we might use the following procedures to provide a bit of abstraction, so that our clients can focus on how they use the information and not on how we’ve stored the information. (That would, for example, allow us to rearrange the list, if we decided that it was appropriate to do so.)
Here’s a possible set of procedures that meet those guidelines.
;;; Procedure:
;;; web-page
;;; Parameters:
;;; identifier, a symbol
;;; url, a string
;;; title, a string
;;; author, a string
;;; date, a list of three integers
;;; Purpose:
;;; Creates a structure that stores information on a Web page.
;;; Produces:
;;; page-info, a structure that stores information on a Web page.
;;; Preconditions:
;;; [No additional]
;;; Postconditions:
;;; page-info is a structure amenable to the other web-page
;;; procedures.
(define web-page
(lambda (identifier url title author date)
(cond
[(not (symbol? identifier))
(error "web-page: invalid identifier; expected a symbol, received" identifier)]
[(not (string? url))
(error "web-page: invalid url; expected a string, received" url)]
[(not (or (regexp-match? #px"^http://" url)
(regexp-match? #px"^https://" url)))
(error "web-page: invalid url; does not begin with http or https" url)]
[(not (string? title))
(error "web-page: invalid title; expected a string, received" title)]
[(not (string? author))
(error "web-page: invalid author; expected a string, received" author)]
[(not (and (list? date) (all integer? date) (= 3 (length date))))
(error "web-page: invalid date (not three integers)" date)]
[else
(list identifier url title author date)])))
;;; Procedure:
;;; web-page-identifier
;;; Parameters:
;;; page-info, information on a Web page
;;; Purpose:
;;; Extract the identifier from a Web page
;;; Produces:
;;; identifier, a symbol
;;; Preconditions:
;;; page-info was created by the web-page procedure.
;;; Postconditions:
;;; (web-page-identifier (web-page ID URL TITLE AUTHOR DATE)) = ID
(define web-page-identifier (section list-ref <> 0))
;;; Procedure:
;;; web-page-url
;;; Parameters:
;;; page-info, information on a Web page
;;; Purpose:
;;; Extract the url from a Web page
;;; Produces:
;;; url, a string
;;; Preconditions:
;;; page-info was created by the web-page procedure.
;;; Postconditions:
;;; (web-page-url (web-page ID URL TITLE AUTHOR DATE)) = URL
(define web-page-url (section list-ref <> 1))
;;; Procedure:
;;; web-page-title
;;; Parameters:
;;; page-info, information on a Web page
;;; Purpose:
;;; Extract the title from a Web page
;;; Produces:
;;; title, a string
;;; Preconditions:
;;; page-info was created by the web-page procedure.
;;; Postconditions:
;;; (web-page-title (web-page ID URL TITLE AUTHOR DATE)) = TITLE
(define web-page-title (section list-ref <> 2))
;;; Procedure:
;;; web-page-author
;;; Parameters:
;;; page-info, information on a Web page
;;; Purpose:
;;; Extract the author from a Web page
;;; Produces:
;;; author, a string
;;; Preconditions:
;;; page-info was created by the web-page procedure.
;;; Postconditions:
;;; (web-page-author (web-page ID URL TITLE AUTHOR DATE)) = AUTHOR
(define web-page-author (section list-ref <> 3))
;;; Procedure:
;;; web-page-date
;;; Parameters:
;;; page-info, information on a Web page
;;; Purpose:
;;; Extract the date from a Web page
;;; Produces:
;;; date, a list of three integers
;;; Preconditions:
;;; page-info was created by the web-page procedure.
;;; Postconditions:
;;; (web-page-date (web-page ID URL TITLE AUTHOR DATE)) = DATE
(define web-page-date (section list-ref <> 4))
Let’s explore how these procedures work.
> (define page1
(web-page "grinnell-cs" "www.cs.grinnell.edu" "Grinnell CS Front Door" "The CS Department" '(2019 2 24)))
Error! . . web-page: invalid identifier; expected a symbol, received "grinnell-cs"
> (define page1
(web-page 'grinnell-cs "https://www.cs.grinnell.edu" "Grinnell CS Front Door" "The CS Department" "2017-02-24"))
Error! . . web-page: invalid date (not three integers) "2017-02-24"
> (define page1
(web-page 'grinnell-cs "www.cs.grinnell.edu" "Grinnell CS Front Door" "The CS Department" '(2019 2 24)))
Error! . . web-page: invalid url; does not begin with http or https "www.cs.grinnell.edu"
> (define page1
(web-page 'grinnell-cs "https://www.cs.grinnell.edu" "Grinnell CS Front Door" "The CS Department" '(2019 2 24)))
> (web-page-date page1)
'(2019 2 24)
> (web-page-url page1)
"https://www.cs.grinnell.edu"
But what if we give these procedures non-pages?
> (web-page-identifier (list "Whee"))
"Whee"
> (web-page-date (range 20))
4
That’s not very satisfying, is it? Hence, we might create a predicate for Web pages. Here’s one possibility.
;;; Procedure:
;;; web-page?
;;; Parameters:
;;; val, a Scheme value
;;; Purpose:
;;; Determine if val appears to represent a Web page
;;; Produces:
;;; is-web-page?, a Boolean value
;;; Preconditions:
;;; [No additional]
;;; Postconditions:
;;; * If val can reasonably be interpreted as a Web page
;;; (e.g., it contains an identifier, title, etc.)
;;; then is-web-page? is true (#t).
;;; * Otherwise, is-web-page? is false (#f).
(define web-page?
(lambda (val)
(and (list? val)
(= (length val) 5)
(symbol? (list-ref val 0))
(string? (list-ref val 1))
(or (regexp-match? #px"^http://" (list-ref val 1))
(regexp-match? #px"^https://" (list-ref val 1)))
(string? (list-ref val 2))
(string? (list-ref val 3))
(and (list? (list-ref val 4))
(= 3 (length (list-ref val 4)))
(all integer? (list-ref val 4))))))
We can now use this to update the various procedures so that they verify their preconditions. For example,
;;; Procedure:
;;; web-page-date
;;; Parameters:
;;; page-info, information on a Web page
;;; Purpose:
;;; Extract the date from a Web page
;;; Produces:
;;; date, a string
;;; Preconditions:
;;; page-info was created by the web-page procedure.
;;; Postconditions:
;;; (web-page-date (web-page ID URL TITLE AUTHOR DATE)) = DATE
(define web-page-date
(lambda (page-info)
(cond
[(not (web-page? page-info))
(error "web-page-date: Expected a web page, received" page-info)]
[else
(list-ref page-info 4)])))
And it makes some difference in our results.
> (web-page-identifier (list 'grinnell-cs "http://www"))
Error! . . web-page-identifier: Expected a web page, received (grinnell-cs "http://www")
> (web-page-date (range 5))
Error! . . web-page-date: Expected a web page, received (0 1 2 3 4)
> (web-page-date (list 'grinnell-cs "http://www"))
Error! . . web-page-date: Expected a web page, received (grinnell-cs "http://www")
Many Scheme programmers have used a strategy like that when they want to create collections of data. (Some use something called a “vector” instead of a list, but the approach is generally the same.)
The design of such structures is so common that it makes much more sense to develop a common way to create them. Over the years, a number of programmers have designed different libraries that make it easier to create structures.
The designers of the Racket language decided that it made more sense to include structures with named components as a set of built in operations, which we generally refer to as “structs”.
struct
sOne creates a new struct type in Racket with the struct
keyword.
At its simplest, one just provides struct
the name of the
structure and the names of its fields.
> (struct date (year month day))
Now, we can easily create one of these structures using the
automatically defined date
procedure. (More generally,
whatever name you give to your struct becomes a procedure that
creates the struct.)
> (define someday (date 2018 02 15))
> someday
#<date>
That’s interesting. It won’t let us see inside the date structure.
(Most programmers consider that a benefit. There’s no need for the
client to know how we’ve implemented our structure.) But how do
we get the fields? Using automatically-generated procedures
that get names of the form structname-fieldname
. For our date
example, those procedures are called date-year
, date-month
,
and date-day
.
> (date-year someday)
2018
> (date-month someday)
2
> (date-day someday)
15
These procedures verify that their parameter is a date (in that they
were created by the date
procedure).
> (date-day '(2018 02 15))
Error! . . date-day: contract violation
Error! expected: date?
Error! given: '(2018 2 15)
The date
procedure does some basic checking, primarily on the
number of parameters.
> (date 2018)
Error! . . date: arity mismatch;
Error! the expected number of arguments does not match the given number
Error! expected: 3
Error! given: 1
Error! arguments...:
We even get an automatically generated structure predicate,
named structname?
(date?
, in this case).
> (date? someday)
#t
> (date? (date 2000 01 01))
#t
> (date? (list 2000 01 01))
#f
struct
fieldsIn our Web page example, we were careful to verify that the individual fields had the right form (a symbol for the identifier, a string that begins with “http” or “https” for the URL, and so on and so forth). It does not appear that we’ve done so for structs. Why not? It turns out that automatic checking of such patterns is difficult.
If you care about such issues (and you might), it is often better to write your own “husks” for these underlying types.
(struct date-kernel (year month day))
(define date
(lambda (year month day)
(cond
[(not (integer? year))
(error "date: invalid year" year)]
[(or (not (integer? month))
(< month 1)
(> month 12))
(error "date: invalid month" month)]
[(or (not (integer? day))
(< day 1)
(> day 31))]
[else
(date-kernel year month day)])))
(define date?
(lambda (val)
(and (date-kernel? val)
(integer? (date-kernel-year val))
(integer? (date-kernel-month val))
(integer? (date-kernel-day val))
(<= 1 (date-kernel-month val) 12)
(<= 1 (date-kernel-day val) 31))))
(define date-year date-kernel-year)
(define date-month date-kernel-month)
(define date-day date-kernel-day)
Here are a few quick explorations of these procedures.
> (date 2019 01 31)
#<date-kernel>
> (date 2019 31 01)
. . date: invalid month 31
> (define sample (date 2019 01 31))
> (date? sample)
#t
> (date? (date-kernel 2019 31 01))
#f
> (date-year sample)
2019
> (date-month sample)
1
> (date-day sample)
31
With some careful work, we can require that our clients use the
date
husks, rather than the date-kernel
procedures. (For now,
we’ll just assume that our clients will not use the kernels.)
You may have noted that many types provide procedures that convert to
and from strings. For example, there are string->numbre
and
number->string
procedures. It can be equally useful to provide
procedures that convert from structured types to strings and
back again. Let’s consider how we might do that with our date
type.
To convert a date to a string, we need to convert each of the three
parts (year, month, and day) to a string, which we can do with
number->string
. Let’s sketch some code to do that.
> (define sample (date 1999 12 31))
> (string-append (number->string (date-year sample))
"-"
(number->string (date-month sample))
"-"
(number->string (date-day sample)))
"1999-12-31"
That looks pretty good so far. But what happens if the month or day is fewer than two digits?
> (string-append (number->string (date-year sample))
"-"
(number->string (date-month sample))
"-"
(number->string (date-day sample)))
"2000-1-1"
Whoops! It looks like we may need to pad with leading zeros. Let’s write a procedure to do that.
;;; Procedure:
;;; pad-with-zeros
;;; Parameters:
;;; str, a string
;;; n, a desired length
;;; Purpose:
;;; Puts 0's at the front of str so that the result is
;;; n characters long.
;;; Produces:
;;; padded, a string
;;; Preconditions:
;;; string-length str <= n
;;; Postconditions:
;;; * (string-length padded) = n
;;; * (regexp-match-exact? (regexp (string-append "0*" str)) padded)
(define pad-with-zeros
(lambda (str n)
(string-append (make-string (- n (string-length str)) #\0) str)))
Let’s check it out.
> (pad-with-zeros "5" 2)
"05"
> (pad-with-zeros "5" 7)
"0000005"
> (pad-with-zeros "12" 2)
"12"
We can now use this to write date->string
.
;;; Procedure:
;;; date->string
;;; Parameters:
;;; d, a date
;;; Purpose:
;;; Convert d to a string in YYYY-MM-DD format.
;;; Produces:
;;; datestring, a string
;;; Preconditions:
;;; [No additional]
;;; Postconditionds:
;;; (string->date datestring) = d
(define date->string
(lambda (d)
(string-append (pad-with-zeros (number->string (date-year d)) 4)
"-"
(pad-with-zeros (number->string (date-month d)) 2)
"-"
(pad-with-zeros (number->string (date-day d)) 2))))
Let’s see how well it works.
> (date->string (date 1999 12 31))
"1999-12-31"
> (date->string (date 2000 01 01))
"2000-01-01"
> (date->string (date 0 1 1))
"0000-01-01"
Yes, that seems to cover the bases. We may need to add a requirement that years not be negative, but that’s a task for another day.
What about the other direction? If we have a date in YYYY-MM-DD format, we can convert it back to a string by splitting at the dashes and then converting each part to a number. We won’t do precondition checking here (although we probably should).
;;; Procedure:
;;; string->date
;;; Parameters:
;;; str, a string
;;; Purpose:
;;; Convert str to a date.
;;; Produces:
;;; d, a date
;;; Preconditions:
;;; str has the form YYYY-MM-DD [unverified]
;;; Postconditions:
;;; (date->string d) = str
(define string->date
(lambda (str)
(let ([parts (string-split str "-")])
(date (string->number (list-ref parts 0))
(string->number (list-ref parts 1))
(string->number (list-ref parts 2))))))
Let’s check it out.
> (define d (string->date "2019-01-15"))
> d
#<date-kernel>
> (date-year d)
2019
> (date-month d)
1
> (date-day d)
15
> (date->string d)
"2019-01-15"
That seems reasonable. Perhaps we’ll explore it more in the lab.
Verify that the first set of date examples work as described in the reading.
Design a struct that is intended to hold a person’s name (e.g., first, middle, last, title, surname).
What do you see as the advantages of structs over lists for representing structured data types?
What do you see as the advantages of lists over structs for representing structured data types?
This reading was newly written for spring 2019. I referred to section 5 of The Racket Reference and section 5 of The Racket Guide while writing this reading.