# EBoard 11: Files and Regular Expressions
_Approximate overview_
* Administrative stuff and Q&A [10ish min]
* Lab [Remaining time]
Administrative stuff
--------------------
### Introductory Notes
* I hope you had wonderful weekends!
* Yes, I remembered my hearing aids.
* Today's lab continues on Friday.
* If you can't meet with me during normal office hours, try proposing
another time (within 8am-5pm) using the Outlook "schedule a meeting"
feature. I generally keep my calendar up to date.
* Congrats to our American Football team on a win.
* NO Mentor session Wednesday; you'll be working on the SoLA.
### Upcoming activities
Events
* Mentor session Tuesday at 8pm - Going over sample SoLA questions
* Convocation Thursday at 11am.
### Upcoming work
* [Mini Project 2](../assignments/mp02) redo probably due Sunday at 10:30pm!
* To be returned soon!
* SoLA 1 starts Wednesday at 4 p.m., due Thursday at 10:30 p.m.
* No readings for Wednesday or Friday.
* Today's lab writeup is due *next Monday*.
* MP3 will be released Wednesday or Friday.
### Q&A
Can I ask questions on the sample SoLA?
> Not today. Ask on Wednesday.
How does a SoLA differ from an exam?
> On an exam, if you do poorly on a few problems, you likely do poorly
on the whole exam. And you rarely have a chance to make it up.
> On a SoLA, if you do poorly on a few problems, you still do fine on
the other problems (and are generally tested on those topics) and
have a chance to make up the other problems in a few weeks.
How do I save files into the Racket directory?
> You can just move them to the Racket directory after saving.
Lab
---
### Notes entering lab
* Don't forget to discuss working habits or anything else relevant.
### Notes during lab
Don't attempt to print out the whole file in any form. Doing so
will lock up DrRacket. That's why we ask you to take parts of
those lists.
Remember that we learned ways to work with lists, including ...
* `length`
* `take`
* `drop`
* `filter`
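
As a quick refresher, here is a minimal sketch of those four procedures on a short list of characters. The name `sample-chars` and the sample string are assumptions for illustration; in the lab, the list would hold the characters of the book.

```racket
#lang racket
;; A small stand-in list; in the lab, the list would hold the book's characters.
(define sample-chars (string->list "Hi, Sam! 42"))

;; How many characters are there in total?
(define total (length sample-chars))

;; The first three characters.
(define first-three (take sample-chars 3))

;; Everything after the first eight characters.
(define after-eight (drop sample-chars 8))

;; Only the characters that are digits.
(define digits (filter char-numeric? sample-chars))
```

Note that `take` and `drop` let you look at a manageable piece of a long list, which is why they are useful with a whole book.
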
There are at least two ways to get all the letters in the text.
* `filter` with the list of characters and the appropriate predicate.
* `rex-find-matches` with the book as a string and the appropriate
regular expression.
Defining whitespace characters
`(define rex-whitespace (rex-char-set (string #\space #\tab #\newline)))`
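
The same three characters can also serve as a predicate for `filter`. The sketch below uses only standard Racket rather than the course's `rex-` procedures; `whitespace-chars`, `simple-whitespace?`, and the sample string are hypothetical names chosen for illustration.

```racket
#lang racket
;; The same three characters as in rex-whitespace, kept in a plain list.
(define whitespace-chars (list #\space #\tab #\newline))

;; Hypothetical helper: is ch one of the three whitespace characters?
(define (simple-whitespace? ch)
  (and (memv ch whitespace-chars) #t))

;; Count the whitespace characters in a short sample string.
(define sample "one two\tthree\nfour")
(define whitespace-count
  (length (filter simple-whitespace? (string->list sample))))
```
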
### Questions and comments at the end of class ...
Don't forget that Sam adds notes to the eboard during class. You might
see them on the screen, but you can also check the online eboard.
Why doesn't Sam like the following?
```
(define book-characters (file->chars "pg1260.txt"))
(define first-20-characters (take (file->chars "pg1260.txt") 20))
```
> We could use `book-characters` in the `take` rather than reading the
file a second time.
> It turns out that `file->chars` is slow; try not to call it more than
once for any particular file in a program.
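
A minimal sketch of the read-once pattern follows. It assumes standard Racket, so `string->list` and `file->string` stand in for the course's `file->chars`, and a small temporary file stands in for `pg1260.txt`; `sample-path` is a hypothetical name.

```racket
#lang racket
;; Create a small temporary file so the sketch runs anywhere;
;; in the lab, the file would be "pg1260.txt".
(define sample-path (make-temporary-file))
(display-to-file "A short stand-in for the text of a much longer book."
                 sample-path
                 #:exists 'truncate)

;; Read the file once and name the result ...
(define book-characters (string->list (file->string sample-path)))

;; ... then reuse that name rather than reading the file again.
(define first-20-characters (take book-characters 20))

;; Clean up the temporary file.
(delete-file sample-path)
```
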
What is the first character of most files downloaded from Project
Gutenberg?
> This ugly `#\uFEFF`, which is an invisible character (the byte order
mark) that signifies something about the text file. It signifies that
the file is UTF-8, rather than ASCII. (Don't worry about the difference.)
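
If that first character gets in the way, one option is to drop it before doing anything else. This is only a sketch in standard Racket; `strip-bom` is a hypothetical helper, not part of the course library, and `raw-contents` is a stand-in for the contents of a downloaded file.

```racket
#lang racket
;; A stand-in for file contents that begin with the invisible #\uFEFF.
(define raw-contents (string-append (string #\uFEFF) "The Project"))

;; Hypothetical helper: drop a leading #\uFEFF if one is present.
(define (strip-bom str)
  (if (and (positive? (string-length str))
           (char=? (string-ref str 0) #\uFEFF))
      (substring str 1)
      str))

(define contents (strip-bom raw-contents))
```
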
How did you find all the letters?
* Using `filter`: `(length (filter char-alphabetic? book-characters))`
* Using `rex-find-matches`: `(length (rex-find-matches (rex-any-of (rex-char-range #\a #\z) (rex-char-range #\A #\Z)) book-contents))`
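
For comparison, here is a sketch of both approaches using only standard Racket, with `regexp-match*` swapped in for the course's `rex-find-matches`. The names `sample-contents` and `sample-characters` and the sample string are assumptions for illustration.

```racket
#lang racket
(define sample-contents "Hello, World 2024!")
(define sample-characters (string->list sample-contents))

;; Approach 1: filter the list of characters with a predicate.
(define letters-by-filter
  (length (filter char-alphabetic? sample-characters)))

;; Approach 2: find every match of a regular expression in the string.
;; The character class mirrors the two ranges a-z and A-Z.
(define letters-by-regexp
  (length (regexp-match* #rx"[a-zA-Z]" sample-contents)))
```

On this sample the two counts agree. On a real book they may differ, since `char-alphabetic?` also accepts letters outside a-z and A-Z (such as accented letters), while the two ranges do not.
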