# EBoard 11: Files and Regular Expressions

_Approximate overview_

* Administrative stuff and Q&A [10ish min]
* Lab [Remaining time]

Administrative stuff
--------------------

### Introductory Notes

* I hope you had wonderful weekends!
* Yes, I remembered my hearing aids.
* Today's lab continues on Friday.
* If you can't meet with me during normal office hours, try proposing another time (within 8am-5pm) using the Outlook "schedule a meeting" feature. I generally keep my calendar up to date.
* Congrats to our American Football team on a win.
* NO Mentor session Wednesday; you'll be working on the SoLA.

### Upcoming activities

Events

* Mentor session Tuesday at 8pm - Going over sample SoLA questions
* Convocation Thursday at 11am.

### Upcoming work

* [Mini Project 2](../assignments/mp02) redo probably due Sunday at 10:30pm!
    * To be returned soon!
* SoLA 1 starts Wednesday at 4 p.m., due Thursday at 10:30 p.m.
* No readings for Wednesday or Friday.
* Today's lab writeup is due *next Monday*.
* MP3 will be released Wednesday or Friday.

### Q&A

Can I ask questions on the sample SoLA?

> Not today. Ask on Wednesday.

How does a SoLA differ from an exam?

> On an exam, if you do poorly on a few problems, you likely do poorly on the whole exam. And you rarely have a chance to make it up.

> On a SoLA, if you do poorly on a few problems, you still do fine on the other problems (and are generally testing on those topics) and have a chance to make up the other problems in a few weeks.

How do I save files into the Racket directory?

> You can just move them to the Racket directory after saving.

Lab
---

### Notes entering lab

* Don't forget to discuss working habits or anything else relevant.

### Notes during lab

Don't attempt to print out the whole file in whatever form. It will lock your DrRacket. That's why we ask you to take parts of those lists.

Remember that we learned ways to work with lists, including ...

* `length`
* `take`
* `drop`
* `filter`

There are at least two ways to get all the letters in the text.

* `filter` with the list of characters and the appropriate predicate.
* `rex-find-matches` with the book as a string and the appropriate regular expression.

Defining whitespace characters

`(define rex-whitespace (rex-char-set (string #\space #\tab #\newline)))`

### Questions and comments at the end of class

...

Don't forget that Sam adds notes to the eboard during class. You might see them on the screen, but you can also check the online eboard.

Why doesn't Sam like the following?

```
(define book-characters (file->chars "pg1260.txt"))
(define first-20-characters (take (file->chars "pg1260.txt") 20))
```

> We could use `book-characters` in the `take`.

> It turns out that `file->chars` is slow; try not to do it more than once for any particular file in a program.

What is the first character of most files downloaded from Project Gutenberg?

> This ugly `#\uFEFF`, which is an invisible character (the byte-order mark) that signifies something about the text file. It signifies that it's UTF-8, rather than ASCII. (Don't worry about the difference.)

How did you find all the letters?

* Using `filter`: `(length (filter char-alphabetic? book-characters))`
* Using `rex-find-matches`: `(length (rex-find-matches (rex-any-of (rex-char-range #\a #\z) (rex-char-range #\A #\Z)) book-contents))`
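
The points above (build the character list once and reuse it, then count letters two ways) can be tried out in plain Racket without the course library. What follows is only a minimal sketch under stated assumptions: it uses a short made-up string rather than `pg1260.txt`, `string->list` in place of `file->chars`, and the standard `regexp-match*` with `#rx` patterns in place of the `rex-` procedures. All of the `sample-` names are hypothetical.

```racket
#lang racket

;; A tiny stand-in for the book text (made-up sample, NOT pg1260.txt).
;; It starts with U+FEFF, the invisible character discussed above.
(define sample-contents "\uFEFFJane Eyre\tby Charlotte Bronte\n")

;; Build the list of characters once and reuse it everywhere below.
;; (Assumption: string->list stands in for the course's file->chars.)
(define sample-characters (string->list sample-contents))

;; Take a prefix from the list we already built, rather than rebuilding it.
(define first-5-characters (take sample-characters 5))

;; Way 1: filter the character list with a predicate.
(define letters-by-filter
  (length (filter char-alphabetic? sample-characters)))

;; Way 2: find every match of a regular expression in the string.
;; (Assumption: regexp-match* stands in for rex-find-matches.)
(define letters-by-regexp
  (length (regexp-match* #rx"[a-zA-Z]" sample-contents)))

;; Whitespace as a character set: space, tab, newline.
(define whitespace-count
  (length (regexp-match* #rx"[ \t\n]" sample-contents)))
```

On this sample, both letter counts come out to 25. On real text the two approaches can disagree, because `char-alphabetic?` also accepts non-ASCII letters such as `é`, while `[a-zA-Z]` matches only the unaccented Latin letters.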