#lang racket

(require csc151)
(require racket/undefined)
(require rackunit)
(require rackunit/text-ui)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; CSC 151 (Fall 2020, Term 1)           ;;;
;;; Regular Expressions and File Handling ;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;; SIDE A

#|
In this lab, we will practice writing regular expressions and
employing them to analyze texts.
|#

;;; A{{{

;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Exercise 1: Basics ;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;

#|
While regular expressions are a library feature of Racket (as well as
virtually every other programming language on the planet), they are
complicated enough that we should think of them as forming their own
little programming language. As such, let's first gain some basic
fluency with the different constructs of regular expressions as
introduced in the reading.

For each of the following descriptions of string patterns, define a
regex that recognizes that pattern. In addition to the regular
expression, write a few rackunit tests (using `check-equal?`) to
verify that your regular expression works.  Make sure to test your
regular expression on both positive examples (cases where the regex
should match) and negative examples (cases where the regex should not
match).
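
As a sketch of what such tests can look like, here is a made-up
pattern that is *not* one of the exercises: strings consisting of
exactly two lowercase letters. The name `two-letters-regex` is
hypothetical, and we assume the `#px` literal syntax and the standard
`regexp-match?` function; adapt this to the constructs from the
reading if they differ.

```racket
#lang racket
(require rackunit)

; Hypothetical pattern (an assumption for illustration only):
; exactly two lowercase letters, from start (^) to end ($).
(define two-letters-regex #px"^[a-z]{2}$")

; Positive example: the regex should match.
(check-equal? (regexp-match? two-letters-regex "hi") #t)

; Negative examples: too short, too long, wrong kind of character.
(check-equal? (regexp-match? two-letters-regex "h") #f)
(check-equal? (regexp-match? two-letters-regex "hey") #f)
(check-equal? (regexp-match? two-letters-regex "h1") #f)
```

Note that the negative examples each fail for a different reason,
which makes it easier to tell which part of a regex is wrong when a
test fails.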
|#

; Strings that start with an "a", end in a "b", and have any three
; characters in the middle.
(define ex-1-1-regex undefined)

; Strings that contain the substring "aaa".
(define ex-1-2-regex undefined)

; Strings that contain *either* the substring "aaa" or the substring
; "bbb".
(define ex-1-3-regex undefined)

; Strings that contain *both* the substring "aaa" and the substring
; "bbb". (*Hint*: make sure your regexp works for both "zzzaaazzzbbb"
; and "zzzbbbzzzaaa"!)
(define ex-1-4-regex undefined)

;;; }}}A

;;; A{{{

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Exercise 3: The Joys of Special Characters ;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

#|
(a) Now let's tackle a potentially frustrating aspect of regular
expressions: special characters and escape sequences. First, recall
that strings are data that represent sequences of characters. We write
*string literals*, values of the string type, using quotes, *e.g.*,
`"hello world!"`

Suppose that we wanted to write a string literal that contains quotes.
For example, we want to manipulate this line of text in a Racket
program:

    They said "we should really consider getting out of here!"

In the space below, briefly describe the problem with the following
string literal.  Complete the definition of sample-text-1 below,
fixing the problem that you identified.

    (define sample-text-1
      "They said "we should really consider getting out of here!"")

<TODO: write your description of the problem here>

|#

(define sample-text-1 undefined)

#|
(b) Now suppose that we wanted to define a string that contains
backslashes, *e.g.*, the following text:

    In the LaTeX typesetting system, we italicize text with the
      \emph{...} command!

Uncomment the attempt at defining this text as sample-text-2 below and
try to run the program.  In the space below, report what happens and,
if possible, what the value of `sample-text-2` is.  Describe any
errors that you encounter or, if you encounter no errors, say so!

<TODO: write the reported value of sample-text-2 here>

|#

; (define sample-text-2 "In the LaTeX typesetting system, we italicize text with the \emph{...} command!")

#|
(c) You might notice that when you escape quotes, Racket reports the
escaped values in the interactions pane, *e.g.*,

    > "\"hello world\""
    "\"hello world\""

But we know that `\"` is really a single character that is a quote.
This presentation makes it a bit difficult to read what the actual
text is, *e.g.*, for debugging purposes. If we want to see the text
without these escapes, we can use the `display` function.
`(display str)` *prints* the string to the interactions pane,
rendering all of the special characters as needed.
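
One way to convince yourself that `\"` is a single character is
sketched below. The names `quoted-hi`, `shown-by-display`, and
`shown-by-write` are hypothetical. We assume that `write` shows a
string in the same escaped form that the interactions pane uses, and
we use `with-output-to-string` to capture printed output as a string
so that we can inspect it.

```racket
#lang racket
(require rackunit)

; Hypothetical sample string: a quote, h, i, and a quote.
(define quoted-hi "\"hi\"")

; Capture what each printing function would show, as a string.
(define shown-by-display
  (with-output-to-string (lambda () (display quoted-hi))))
(define shown-by-write
  (with-output-to-string (lambda () (write quoted-hi))))

; The string has four characters; each \" counts once.
(check-equal? (string-length quoted-hi) 4)

; display emits exactly the characters of the string.
(check-equal? shown-by-display quoted-hi)

; write adds the surrounding quotes and a backslash per inner quote.
(check-equal? (string-length shown-by-write) 8)
```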

Consider the following string printed directly in the interactions
pane and with `display`:

    > "\"hello\nworld\""
    "\"hello\nworld\""
    > (display "\"hello\nworld\"")
    "hello
    world"

In the space below, in a few sentences, explain why the two produce
different output.  In particular, what does the `\n` special character
represent?

<TODO: write your explanation here>

|#

#|
(d) Now, take `sample-text-2` and use `display` to print it with its
escape sequences rendered. Do you see a problem with the definition now?
Complete `sample-text-2-correct` with an updated, correct version of
the string literal.
|#

(define sample-text-2-correct undefined)

#|
(e) Escape sequences are particularly difficult to deal with in
regular expression values because backslashes are *also* used to escape
special characters that appear in them!

With this in mind, try writing regular expressions that recognize the
following patterns of strings. Like before, write rackunit tests to
verify the correctness of your constructions.
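
To illustrate the two layers of escaping, here is a sketch using a
made-up pattern that is *not* one of the exercises: a dollar sign
followed by one or more digits. The name `price-regex` is
hypothetical, and we assume the `#px` literal syntax, in which `\d`
stands for a digit.

```racket
#lang racket
(require rackunit)

; Hypothetical pattern: a dollar sign followed by one or more digits.
; In the regex language, $ is special, so the regex itself needs \$.
; In the Racket literal, each of those backslashes is written as \\.
(define price-regex #px"^\\$\\d+$")

(check-equal? (regexp-match? price-regex "$42") #t)
(check-equal? (regexp-match? price-regex "42") #f)
(check-equal? (regexp-match? price-regex "$") #f)
```

When a regex misbehaves, it can help to first write down the regex
you want on paper, and only then translate each backslash into the
form that the Racket literal requires.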
|#


; Any string that starts with a left square bracket `[`, ends with a
; right square bracket `]`, and has any number of digits in between.
(define ex-3-1-regex undefined)

; Any string that starts and ends with a backslash and contains digits
; in between.
(define ex-3-2-regex undefined)

;;; }}}A
