In an earlier assignment, you explored the computation of a Dale-Chall score that measures the complexity of a document. The Dale-Chall score requires a count of words in the document, a count of easy words, a count of difficult words, and a number of sentences. The Dale-Chall score also provides an expected grade level.
Write a procedure, (add-dc-info infile outfile), that takes two file names as parameters, reads an HTML file from the infile, adds some preliminary information to the start of the body, and writes the result to outfile.
The preliminary information should look something like the following:
Number of words: ### (### easy, ### difficult)
Number of sentences: ###
Given a score of ###, this document is appropriate for a LEVEL audience.
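As a minimal sketch, the three lines above might be assembled as SXML paragraphs once the counts are known. The name `dc-info` and its six parameters are hypothetical; the sketch assumes the counts, score, and level come from your code for the earlier assignment rather than computing them here.

```racket
#lang racket

;;; (dc-info words easy difficult sentences score level) -> list of SXML
;;; Hypothetical helper: build the preliminary information as three
;;; SXML paragraphs. All six inputs are assumed to be computed elsewhere.
(define dc-info
  (lambda (words easy difficult sentences score level)
    (list
     (list 'p (format "Number of words: ~a (~a easy, ~a difficult)"
                      words easy difficult))
     (list 'p (format "Number of sentences: ~a" sentences))
     (list 'p (format "Given a score of ~a, this document is appropriate for a ~a audience."
                      score level)))))
```

The resulting list still has to be spliced in at the start of the body element; how you do that depends on how you read and write the file.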
You will likely want to
As you may recall, a well-designed Web page uses section headings of the form h1, h2, h3, and so on.
In your experience with print culture, you likely learned that printed documents often have something called a “table of contents”, a list of the chapters, sections, and subsections, along with their page numbers.
On the Web, we often prefer “active” tables of contents.
That is, each entry in the table of contents has a link to the appropriate section of the document.
Write a procedure, (add-toc infile outfile), that takes two file names as parameters, reads an HTML file from the infile, adds a table of contents at the start of the body, and writes the result to outfile.
For example, if our document has the following form,
<h1>My awesome document</h1>
...
<h2>Defining awesomeness</h2>
<p>Of course, if we call something <q>awesome</q>, we should
identify what we mean by that word.</p>
...
<h3>What is awe?</h3>
...
<h3>Summing Values</h3>
...
<h2>On the role of Racket in awesomeness</h2>
...
<h2 id="conclusion">Conclusion</h2>
...
Our revised document will have a form something like the following.
<div class="toc">
<ul>
<li><a href="#element425431">My awesome document</a></li>
<ul>
<li><a href="#element247523">Defining awesomeness</a></li>
<ul>
<li><a href="#element608455">What is awe?</a></li>
<li><a href="#element529052">Summing Values</a></li>
</ul>
<li><a href="#element912653">On the role of Racket in awesomeness</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
</ul>
</div>
<h1 id="element425431">My awesome document</h1>
...
<h2 id="element247523">Defining awesomeness</h2>
...
<h3 id="element608455">What is awe?</h3>
...
<h3 id="element529052">Summing Values</h3>
...
<h2 id="element912653">On the role of Racket in awesomeness</h2>
...
<h2 id="conclusion">Conclusion</h2>
...
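In the example above, each heading that lacks an id receives one of the form element######, and each heading yields one linked entry in the table of contents. The following is a rough sketch of those two pieces, assuming headings are SXML elements such as `(h2 (@ (id "conclusion")) "Conclusion")`. All of the procedure names are hypothetical, and the nesting of entries by heading level is not handled here.

```racket
#lang racket

;;; (random-id) -> string
;;; The word "element" followed by a random six-digit number.
(define random-id
  (lambda ()
    (string-append "element"
                   (number->string (+ 100000 (random 900000))))))

;;; (has-attributes? elt) -> boolean
;;; Does the SXML element have an (@ ...) attribute list?
(define has-attributes?
  (lambda (elt)
    (and (pair? (cdr elt))
         (pair? (cadr elt))
         (eq? '@ (car (cadr elt))))))

;;; (element-attributes elt) -> list of (name value) lists
(define element-attributes
  (lambda (elt)
    (if (has-attributes? elt) (cdr (cadr elt)) '())))

;;; (element-children elt) -> list of SXML nodes
(define element-children
  (lambda (elt)
    (if (has-attributes? elt) (cddr elt) (cdr elt))))

;;; (ensure-id heading) -> SXML element
;;; Return the heading unchanged if it has an id; otherwise
;;; return a copy with a random id added to its attributes.
(define ensure-id
  (lambda (heading)
    (let ([attrs (element-attributes heading)])
      (if (assq 'id attrs)
          heading
          (append (list (car heading)
                        (append (list '@ (list 'id (random-id))) attrs))
                  (element-children heading))))))

;;; (heading->toc-entry heading) -> SXML li element
;;; Build a list item that links to a heading that already has an id.
(define heading->toc-entry
  (lambda (heading)
    (let ([id (cadr (assq 'id (element-attributes heading)))])
      (list 'li
            (append (list 'a (list '@ (list 'href (string-append "#" id))))
                    (element-children heading))))))
```

For instance, `(heading->toc-entry '(h2 (@ (id "conclusion")) "Conclusion"))` produces `(li (a (@ (href "#conclusion")) "Conclusion"))`. Because the ids are random, two headings could in principle receive the same id; a fuller solution might check for that.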
In case it’s not clear from the example, you’ll likely have to do the following steps.
Add an id to each h1, h2, or h3 element that lacks an id. We would suggest adding a random id, created with the word “element” and a random six-digit number.
Find all the h1, h2, and h3 elements and turn them into a list with links.
As you should know, Web page authors have a responsibility to make their pages accessible to those with physical differences. Everyone has a moral responsibility to consider others; some organizations, such as educational institutions, also have a legal responsibility. The most basic issues to consider relate to making pages more accessible to the blind, those with limited vision, or those who rely on screen readers for a variety of reasons.
What kinds of things should a page author attend to? Here are three basic ones.
Every image (img) should have a nontrivial set of “alt text” (given by the alt attribute).
Every link (the <a href="..."> tag) should have “meaningful” text. At minimum, “meaningful” should mean “more than one word” but not “click here”.
Write a procedure, (assess-accessibility url), that reads a page from the given URL and creates an SXML structure that reports on any accessibility flaws in the page (of the forms given above).
You may use (fetch-page url) to grab a page from the Web.
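The two checks described above can be sketched as predicates on a single piece of link text and a single img element. This is only a starting point under stated assumptions: the names `meaningful-link-text?` and `img-has-alt?` are hypothetical, “nontrivial” alt text is interpreted here as “present and not blank”, and finding the links and images within the fetched page is left out.

```racket
#lang racket

;;; (meaningful-link-text? text) -> boolean
;;; Link text counts as meaningful when it has more than one word
;;; and is not "click here" (ignoring case and extra whitespace).
(define meaningful-link-text?
  (lambda (text)
    (let ([words (string-split (string-downcase text))])
      (and (> (length words) 1)
           (not (equal? words '("click" "here")))))))

;;; (img-has-alt? img) -> boolean
;;; Assumption: an img element passes when it has an alt attribute
;;; whose value is a string containing something other than whitespace.
(define img-has-alt?
  (lambda (img)
    (let* ([attrs (if (and (pair? (cdr img))
                           (pair? (cadr img))
                           (eq? '@ (car (cadr img))))
                      (cdr (cadr img))
                      '())]
           [alt (assq 'alt attrs)])
      (and alt
           (pair? (cdr alt))
           (string? (cadr alt))
           (non-empty-string? (string-trim (cadr alt)))))))
```

Note that this version of `meaningful-link-text?` does not strip punctuation, so text such as “Click here!” would still be accepted; you may want a stricter test.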
This problem is intended as satire. It is not intended to favor any particular political perspective or party.
As you likely know, there are a host of linguistic differences across the US. For example, most of the US calls a “frappe” a “milkshake”; a certain kind of sandwich might be called a “sub”, a “grinder”, or a “poorboy”; and carbonated non-alcoholic beverages are called “soft drinks”, “soda”, “pop”, or even “Coke”.
The residents of Middle Slobovia have some particularly strange differences. For example,
Write a procedure, (slobovicize sxml), that takes as input an SXML structure and returns the structure in the Middle Slobovian dialect.
For example, consider the following structure.
(define press-release
  '(div (@ (class "release"))
        (ul
         (li "Dateline " (span (@ (class "date")) "2021-11-01"))
         (li "Location " (span (@ (class "college")) "Grinnell")))
        (p "This week, President Anne Harris of "
           (span (@ (class "college")) "Grinnell College")
           ", a midwest liberal-arts college known for its progressive values,"
           " announced today that all students must receive the vaccine"
           " for the corona virus.")
        (p "Professor and noted scientist Lindsey Smith, commenting"
           " on President Harris' announcement said, "
           (q "We want to keep our students alive to vote for"
              " Democrats in the next election."))
        (p "In response to President Harris' announcement, the Republican"
           " Iowa Legislature voted to require "
           (span (@ (class "medicine")) "horse parasite tablets")
           " of all students at state schools, at the urging of"
           " Senator Iam Old, who cited the clear science for"
           " such medicine.")))
After slobovicize-ing that text, we would end up with something like the following.
'(div (@ (class "release"))
      (ul
       (li "Dateline " (span (@ (class "date")) "2021-11-01"))
       (li "Location " (span (@ (class "college")) "Grinnell")))
      (p "This week, Supreme Leader Anne Harris of "
         (span (@ (class "college")) "Grinnell Cult")
         ", a heartland Marxist-arts cult known for its Anti-American values,"
         " announced today that all students must receive the tracing serum"
         " for the corona myth.")
      (p "Indoctrinator and noted liar Lindsey Smith, commenting"
         " on Supreme Leader Harris' announcement said, "
         (q "We want to keep our cultists alive to vote for"
            " Traitors in the next election."))
      (p "In response to Supreme Leader Harris' announcement, the Authoritarian"
         " Iowa Legislature voted to require "
         (span (@ (class "medicine")) "horse parasite tablets")
         " of all cultists at state schools, at the urging of"
         " Senator Iam Old, who cited the clear lies for"
         " such medicine."))
Note: If you'd like to explore real linguistic differences, The New York Times has a fun interactive quiz about the issue.
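In the slobovicize example above, the text “Grinnell College” changes while the attribute value "college" does not, so the translation evidently applies to text but not to attributes. One possible shape for that, offered as a sketch rather than a full solution: a whole-word replacement on strings plus a recursive walk that skips attribute lists. The names `replace-word` and `map-text` are hypothetical, the matching is case-sensitive, and only a trailing “s” is treated as a plural.

```racket
#lang racket

;;; (replace-word str old new) -> string
;;; Replace old with new wherever old appears as a whole word,
;;; optionally followed by a plural "s". Longer words that merely
;;; start with old (e.g., "professorial") are left alone.
(define replace-word
  (lambda (str old new)
    (regexp-replace*
     (pregexp (string-append "\\b" (regexp-quote old) "(s?)\\b"))
     str
     (string-append (regexp-replace-quote new) "\\1"))))

;;; (map-text proc sxml) -> SXML
;;; Apply proc to every string in an SXML element, leaving tag
;;; names and (@ ...) attribute lists untouched.
(define map-text
  (lambda (proc sxml)
    (cond
      [(string? sxml)
       (proc sxml)]
      [(and (pair? sxml) (eq? '@ (car sxml)))
       sxml]
      [(pair? sxml)
       (cons (car sxml)
             (map (lambda (child) (map-text proc child))
                  (cdr sxml)))]
      [else
       sxml])))
```

For instance, `(map-text (lambda (s) (replace-word s "College" "Cult")) '(span (@ (class "college")) "Grinnell College"))` gives `(span (@ (class "college")) "Grinnell Cult")`. A full solution would apply many such replacements and would need to decide how to handle capitalization, as in “Professor” versus “professors”.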
Choose one other thing to do to/with existing Web pages. You might explore a more significant version of one of the exercises above. You might do something more like the Mad Libs exercise you did in a recent mini-project. You might do something a bit more useful. Check with your instructor or class mentors if you’d like some ideas.
Please name your file web.rkt and submit it, along with any other files you used, on Gradescope.
[ ] Code file has appropriate headers.
[ ] Code file runs.
[ ] Part one: `add-dc-info` reads the input file, adds something, and writes to the output file.
[ ] Part two: `add-toc` reads the input file, adds something, and writes to the output file.
[ ] Part three: `assess-accessibility` generates SXML.
[ ] Part four: `slobovicize` works with strings.
[ ] Part five: Includes something.
[ ] All primary procedures documented.
[ ] Part one: `add-dc-info` reads the input file, adds the correct text in most cases, and writes to the output file.
[ ] Part two: `add-toc` reads the input file, adds something based on each header, and writes to the output file.
[ ] Part three: `assess-accessibility` generates SXML that reports counts of issues; numbers need not be exactly correct.
[ ] Part four: `slobovicize` works with SXML structures.
[ ] Part four: Examples of `slobovicize` provided.
[ ] All procedures documented, including helpers.
[ ] Part two: `add-toc` adds id attributes to any header that lacks them.
[ ] Part two: `add-toc` adds internal links to the headers.
[ ] Part three: `assess-accessibility` generates SXML that reports both counts of issues and individual issues; numbers are correct.
[ ] Part four: `slobovicize` deals appropriately with subwords (e.g., it should not change professorial to indoctrinatorial, but it should change professors to indoctrinators).
[ ] Part five: Uses recursion.