Fundamentals of Computer Science 1 (CS151 2003S)

SamR's Quick HTML Reference

Since I teach a variety of people about HTML, I find it appropriate to keep a simple reference to HTML handy, as much of the HTML documentation is either unwieldy or outdated. For example, the HTML 3.0 documentation runs over 190 pages, and many of the traditional HTML references available at the time that I first wrote this (ca. 1995) included deprecated elements, such as <menu>. These days, there are good references, but I still find it convenient to have my own.

Unfortunately, the time pressures of academic life have not given me sufficient opportunity to flesh out all of this document (e.g., HTML's relationship to SGML). Nonetheless, both my students and I find it of some use, particularly in its electronic form.

Introduction

HTML, the hypertext markup language, is a common language for building hypertext documents for the World-Wide Web. Originally, authors had to build their documents in "raw HTML" because no tools were available. Now that such tools are available, many people no longer write HTML directly. Nonetheless, there are good reasons to learn HTML, particularly because it helps you understand what is and is not possible on the Web.

More recently, other markup languages have been developed and extensions to HTML have been added. The extensions are often platform specific. The other languages are often standards and provide more facilities. One key successor to HTML is XML (extensible markup language). HTML has also been extended with CSS (cascading style sheets) to give page authors more control over appearance. Neither additional languages nor extensions are covered in this document.

Note that HTML is a markup language, not a programming language. What's the difference? A markup language indicates information about the structure or purpose of pieces of information; a programming language indicates information about the execution of a process (more or less).

HTML Tags

In HTML, as in most markup languages, a page author marks up the document, indicating the roles of various parts of the page. One might indicate that something is the title of the document, the beginning of a section, an item in a list, and so on and so forth.

In HTML, textual elements are traditionally surrounded by tags, although there are some tags that act as text elements. A piece of marked-up text looks something like

<TAG>some text</TAG>

Note that the TAG indicates something about the text. The TAG might be p for a paragraph or em for emphasized text.

For example, one might indicate the title of a document with

<title>SamR's Quick HTML Reference</title>

Attributes

In addition, certain tags may have attributes (additional characteristics). For example, in Netscape's version of HTML, items in a list may indicate the type of mark that accompanies the item. In such cases, a piece of marked-up text looks something like

<tag attribute=value>some text</tag>

For example, one might describe a table with a larger-than-normal border with

<table border=10> ... </table>
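The list-marker attribute mentioned above can be sketched in the same way. This is only an illustration: it assumes the type attribute on a bulleted list, which HTML 4.01 marks as deprecated in favor of style sheets, and the list items are made up.

```html
<!-- A bulleted list whose items are marked with squares
     rather than the browser's default bullet. -->
<ul type="square">
  <li>First item</li>
  <li>Second item</li>
</ul>
```

The attribute value is quoted here. Simple values such as 10 may be left unquoted, but quoting is always safe.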

Logical and physical markup

Note that there are two basic kinds of markup: logical markup, in which one describes the roles of pieces of text (e.g., this is a section heading), and physical markup, in which one describes the appearance of text (e.g., this is Times, twelve point, bold, centered). Logical markup supports better information retrieval and permits readers to select the appearances they find most appropriate. HTML provides a mixture of logical and physical markup tags, with some bias towards logical markup. More recently, HTML has added style sheets, which provide both a way to define your own logical elements and a way to assign a physical appearance to each logical element.
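As a small sketch of the difference, the same two sentences can be marked up both ways. The sentences themselves are invented for illustration.

```html
<!-- Logical markup: says what role the text plays. -->
<p>You should <em>always</em> check your pages.</p>
<p><strong>Warning:</strong> this page is under construction.</p>

<!-- Physical markup: says only how the text should look. -->
<p>You should <i>always</i> check your pages.</p>
<p><b>Warning:</b> this page is under construction.</p>
```

Graphical browsers typically render the two versions alike by default, but only the logical version tells a non-visual browser, or a search tool, that the text is emphasized.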

Structure of HTML documents

Each HTML document is broken into two pieces: the head, which describes the document, and the body, which holds the content that readers see.

As one might expect, the head is surrounded by <head> and </head> tags, and the body is surrounded by <body> and </body> tags. Netscape extends the body tag with a background attribute. I feel that this makes documents unreadable, but your mileage may vary.

In addition, the whole document should be surrounded by <html> and </html> tags.

A basic HTML document might therefore appear as follows:

<html>
<head>
<title>A basic HTML document</title>
</head>
<body>
<p>
This is the only line in the document.
</p>
</body>
</html>

Components of the head

There are a number of different things you can put in the head of the document. You'll find that many documents include only a title. For now, all that you really need to understand is the title tag.

Components of the body

Paragraphs in HTML

Section Headings

HTML uses a hierarchical heading system, with labels from <h1> to <h6>. Much of the documentation suggests that you only use them in order. For example, you should always begin your documents with an h1 tag and you should never use an h3 tag unless an h2 tag precedes it.
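A sketch of headings used in order might look like the following. The heading text is borrowed from this document, but the levels shown are an assumption made for illustration, not a record of how the page is actually marked up.

```html
<h1>SamR's Quick HTML Reference</h1>

<h2>HTML Tags</h2>
<h3>Attributes</h3>
<h3>Logical and physical markup</h3>

<!-- Returning to h2 starts a new top-level section. -->
<h2>Structure of HTML documents</h2>
<h3>Components of the head</h3>
```

Note that headings are not nested inside one another. The hierarchy comes only from the order and level of the tags.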

Text Styles

Graphic elements

Links and anchors

So far, the tags we've seen have described parts of almost any text, whether on the Web or in printed form. Now let us consider the things that make hypertext hypertext: the links from page to page, and the anchors that let us link to particular parts of a page.
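A minimal sketch of both ideas follows, using the a tag with its href attribute for links and its name attribute for anchors. The anchor name tables is hypothetical, and the file name is taken from the address of this document.

```html
<!-- An anchor: names a spot in the page so that links can point to it. -->
<h2><a name="tables">Tables in HTML</a></h2>

<!-- A link to another page. -->
<p>Check your page with the <a href="http://validator.w3.org">HTML Validator</a>.</p>

<!-- A link to the anchor above, from elsewhere in the same page. -->
<p>See the <a href="#tables">section on tables</a>.</p>

<!-- A link to the same anchor from a different page. -->
<p>See the <a href="html-quick.html#tables">section on tables</a>.</p>
```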

Lists in HTML

Tables in HTML

While HTML tables were originally developed to support better presentation of tabular data (that is, data that you'd like to organize into columns and rows), tables are often used to provide more precise layout. When you write HTML for me, you should only use tables for presenting tabular data and not for layout.
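For tabular data, a small table might be sketched as follows. The rows are made up for illustration; tr marks a row, th a heading cell, and td a data cell.

```html
<table border="1">
  <!-- Heading row -->
  <tr>
    <th>Tag</th>
    <th>Meaning</th>
  </tr>
  <!-- Data rows -->
  <tr>
    <td>p</td>
    <td>paragraph</td>
  </tr>
  <tr>
    <td>em</td>
    <td>emphasized text</td>
  </tr>
</table>
```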

Miscellaneous

Preparing a Web Site or Page

You know how to make an HTML document. So, how do you get it on the Web? It all depends on your system and your Web server. I'm a Unix (and Linux) person, so my answers will be biased towards typical Unix installations.

First, you must create a Web directory. Typically, this directory is called public_html and must be in your home directory. The directory must be appropriately accessible. On Unix systems, this means that the directory must be readable and executable by the Web server, which usually means by everyone.

Next, you put the file in that directory. If you're working on a machine that shares a filesystem with the Web server, you can edit the file in that directory. If you're working on another machine, you'll need to transfer the file (typically with an ftp program). Make sure the file is readable.

Finally, you should check the page in a Web browser.

Not much to it, is there?

Checking Your HTML

As you might guess, HTML has a formal specification. That means that there are HTML documents that are grammatically correct, and documents that are grammatically incorrect. (Content can also be correct or incorrect, but that's outside of the scope of this document.) While most WWW browsers are fairly nice, and will display grammatically incorrect documents, you should make it a point to write correct documents.

The World Wide Web Consortium (W3C) provides an HTML Validator, which is available at http://validator.w3.org. You should use it to check your pages. Because there are now many different versions of HTML and documents are written in many languages, the most recent incarnation of the validator requires that you include a bit of extra information at the top of your document. Here's what the start of your document should look like:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
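Putting that preamble together with the basic document shown earlier gives a complete page, sketched below. It is meant as a starting point that should be acceptable to the validator, not as a guarantee; the title and body text are simply reused from the earlier example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<!-- Tells the browser and the validator which character set the page uses. -->
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>A basic HTML document</title>
</head>
<body>
<p>
This is the only line in the document.
</p>
</body>
</html>
```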

As the Web has grown, it has become increasingly important for Web authors to support a wider variety of users. A modern Web page should support the variety of different devices people use to browse the Web (large-screen displays, cell phones, auditory browsers, text-only browsers, and much, much more); the different skill sets people bring to the Web (particularly with regard to language); and even different physical abilities (e.g., those with limited sight, hearing, or mobility).

The W3C has created a set of guidelines for making your content accessible to a wide variety of users (particularly those with limited sight, hearing, or mobility). These guidelines are available at http://www.w3.org/WAI/. It is also possible to have a program check for many basic accessibility issues. The most popular checker is Bobby, which is available at http://bobby.watchfire.com.
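One of the simplest accessibility habits is to give every image a text alternative with the alt attribute, which HTML 4.01 requires on img. The sketch below assumes a hypothetical image file and description.

```html
<!-- The alt text stands in for the image in text-only and
     auditory browsers. The file name is hypothetical. -->
<p>
<img src="campus-map.gif"
     alt="Map of campus showing the science building">
</p>
```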

Index of HTML Tags

 

History

Summer 1995

1995-1999

Monday, 25 August 1999

Wednesday, 1 September 1999

Tuesday, 25 January 2000

Thursday, 24 August 2000

Friday, 25 August 2000

Monday, 22 January 2001

Wednesday, 24 January 2001

Monday, 2 September 2002

Tuesday, 21 January 2003

 

Disclaimer: I usually create these pages on the fly, which means that I rarely proofread them and they may contain bad grammar and incorrect details. It also means that I tend to update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This document was generated by Siteweaver on Tue May 6 09:30:42 2003.
The source to the document was last modified on Tue Jan 21 12:05:46 2003.
This document may be found at http://www.cs.grinnell.edu/~rebelsky/Courses/CS151/2003S/Readings/html-quick.html.


Samuel A. Rebelsky, rebelsky@grinnell.edu