Overview
How do we get the loudhum libraries on our personal computers?
Start DrRacket
File > Install Package
Enter “https://github.com/grinnell-cs/loudhum.git” (without the quotation marks).
Click Install or Update (whichever appears).
When the Close button becomes available, click it.
Could you explain #px"a[a-z]*a" and how it differs from #px"a[a-z]+a"?
Sure. (or at least I can try)
Three parts: a, [a-z]*, and a.
The a’s match the letter a (and nothing else). So we are looking for strings that start with a and end with a.
We’ll pull apart the [a-z]*.
[a-z] is shorthand for a or b or c or d or e or … or z. “Lowercase letters”.
An expression followed by a star means “zero or more copies”.
[a-z]* means “0 or more lowercase letters”.
#px"a[a-z]*a" means “sequences of characters that start with a, end with a, and have only lowercase letters in between.”
> (regexp-match* #px"a[a-z]*a" "alphabet aardvark aardwolf samr")
'("alpha" "aardva" "aa")
An expression followed by a plus means “one or more copies”.
[a-z]+ means “1 or more lowercase letters”.
#px"a[a-z]+a" means “sequences of characters that start with a, end with a, and have at least one lowercase letter in between.”
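To make the difference concrete, here is a small sketch that runs both patterns on the same string as the example above. The names animals, star-matches, and plus-matches are made up for illustration. With the plus, the “aa” from “aardwolf” should drop out, because nothing sits between its two a’s.

```racket
#lang racket

; The same input string as in the example above.
(define animals "alphabet aardvark aardwolf samr")

; * allows zero letters between the two a's, so "aa" counts.
(define star-matches (regexp-match* #px"a[a-z]*a" animals))

; + requires at least one letter between the two a's, so "aa" does not.
(define plus-matches (regexp-match* #px"a[a-z]+a" animals))

star-matches ; '("alpha" "aardva" "aa")
plus-matches ; '("alpha" "aardva")
```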
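Why did we get “aardva” from “aardvark”, rather than “aa”?
In general, Racket’s * and + are greedy: they match as many characters as they can while still letting the rest of the pattern match. Here, [a-z]* runs all the way to the last a it can reach.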
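If you would rather have the shortest match, one option (not covered above, so treat it as an aside) is to put a question mark after the star, which asks for as few copies as possible. A minimal sketch, with greedy-matches and reluctant-matches as made-up names:

```racket
#lang racket

; Default behavior: [a-z]* takes as many letters as it can.
(define greedy-matches (regexp-match* #px"a[a-z]*a" "aardvark"))

; With *?, [a-z] takes as few letters as it can.
(define reluctant-matches (regexp-match* #px"a[a-z]*?a" "aardvark"))

greedy-matches    ; '("aardva")
reluctant-matches ; '("aa")
```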
Do we need the #px?
Often, but not always. It’s safer to include it.
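Here is one case where it seems to matter, sketched under the assumption that #rx reads a backslash followed by a letter as that literal letter rather than as a shorthand. The names px-words and rx-words are made up for illustration.

```racket
#lang racket

; With #px, \\S is shorthand for "any non-whitespace character".
(define px-words (regexp-match* #px"\\S+" "pb and j"))

; With #rx, \\S is not that shorthand (assumption: it is read as a
; literal S), so nothing in this string matches.
(define rx-words (regexp-match* #rx"\\S+" "pb and j"))

px-words ; '("pb" "and" "j")
rx-words ; '()
```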
Could you explain the "\\1\\2" replacement?
Once again, I can try. Apologies for limited creativity.
We can parenthesize parts of an expression. Sometimes for clarity. Sometimes to deal with precedence issues.
#px"ab+" means “a followed by at least one b.”
#px"(ab)+" means “one or more repetitions of ab, such as abababab.”
> (regexp-match* #px"ab+" "abbbba ababab")
'("abbbb" "ab" "ab" "ab")
> (regexp-match* #px"(ab)+" "abbbba ababab")
'("ab" "ababab")
We may want to refer to things from the pattern when we do a replacement. For example, I may want to replace “X and Y” with “Y and X”.
The pattern is #px"(\\S+) and (\\S+)".
> (regexp-match* #px"(\\S+) and (\\S+)" "pb and j, rock and roll, foo and bar")
'("pb and j," "rock and roll," "foo and bar")
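To see what each parenthesized part captured, it may help to use regexp-match (no star) on a single phrase. This is only a sketch, and parts is a made-up name. The first element is the whole match; the remaining elements are what the first and second pairs of parentheses matched, which is what \1 and \2 stand for in a replacement.

```racket
#lang racket

; regexp-match (no star) returns the whole match followed by
; what each parenthesized group matched.
(define parts (regexp-match #px"(\\S+) and (\\S+)" "rock and roll"))

parts          ; '("rock and roll" "rock" "roll")
(second parts) ; "rock", the text that \\1 refers to
(third parts)  ; "roll", the text that \\2 refers to
```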
The replacement is “\2 and \1”, which we write as "\\2 and \\1" in a Racket string.
> (regexp-replace* #px"(\\S+) and (\\S+)"
"pb and j, rock and roll, foo and bar"
"\\2 and \\1")
"j, and pb roll, and rock bar and foo"
Did you really want the comma?
No. Sometimes I write bad regular expressions.
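One possible repair, offered as a sketch rather than the intended answer, is to capture word characters with \\w+ instead of non-whitespace with \\S+, so the commas stay outside the groups. The second definition shows a "\\1\\2" replacement on the same pattern: it keeps both groups in order and drops the “ and ” between them. The names phrases, swapped, and joined are made up, and this may not be the same example the "\\1\\2" question was about.

```racket
#lang racket

(define phrases "pb and j, rock and roll, foo and bar")

; \\w matches letters, digits, and underscores, so commas are not captured.
(define swapped
  (regexp-replace* #px"(\\w+) and (\\w+)" phrases "\\2 and \\1"))

; "\\1\\2" puts the two groups side by side, with nothing in between.
(define joined
  (regexp-replace* #px"(\\w+) and (\\w+)" phrases "\\1\\2"))

swapped ; "j and pb, roll and rock, bar and foo"
joined  ; "pbj, rockroll, foobar"
```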
What if I want aa or ee or ii or oo or uu?
#px"(aa|ee|ii|oo|uu)"
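A quick check of that pattern, plus an alternative that captures one vowel and then asks for the same letter again with \\1. This is a sketch; words, by-alternation, and by-backreference are made-up names.

```racket
#lang racket

(define words "aardvark book skiing vacuum")

; Spell out each doubled vowel as an alternative.
(define by-alternation (regexp-match* #px"(aa|ee|ii|oo|uu)" words))

; Or capture one vowel and require the same letter again with \\1.
(define by-backreference (regexp-match* #px"([aeiou])\\1" words))

by-alternation   ; '("aa" "oo" "ii" "uu")
by-backreference ; '("aa" "oo" "ii" "uu")
```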
How do I get the first twenty characters?
(take book-letters 20)
How do I figure out how many letters?
(length book-letters) or (string-length book-contents)
How do I get rid of letters or words or lines from a list?
(drop lst num)
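Putting take, drop, and length together: the sketch below uses a short made-up string in place of the real book, and assumes book-letters is a list of characters built from book-contents (the actual definitions from class are not shown here). take keeps the first n elements and drop removes them; both signal an error if the list has fewer than n elements.

```racket
#lang racket

; Hypothetical stand-ins for the book data used in class.
(define book-contents "It was the best of times, it was the worst of times")
(define book-letters (string->list book-contents))

(define first-twenty (take book-letters 20)) ; the first 20 characters
(define the-rest (drop book-letters 20))     ; everything after the first 20

(length book-letters)         ; number of elements in the list
(string-length book-contents) ; number of characters in the string
(list->string first-twenty)   ; "It was the best of t"
```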