HTML Lesson Two

Linking

Hypertext Anchors

An anchor is a piece of text or some other object (for example an image) which marks the beginning and/or the end of a hypertext link. The <A> element is used to mark that piece of text (or inline image), and to give its hypertextual relationship to other documents. The text between the opening and closing tags, <A attributes> ...text... </A> can be the start or destination (or both) of a link.

HREF

<A HREF="http://www.edu/st/file.html">bla bla</A>
The string `bla bla' is a hypertext link to the document `file.html' located at the indicated URL.

<A HREF="http://www.edu/st/file.html"><IMG SRC="icon.gif"></A>
The image `icon.gif' is a hypertext link to the file "http://www.edu/st/file.html". The image acts like an icon button for the indicated HTML document.

Relative Pathnames Versus Absolute Pathnames

You can link to documents in other directories by specifying the relative path from the current document to the linked document. For example, a link to a file NYStats.html located in the subdirectory AtlanticStates would be:

    <A HREF="AtlanticStates/NYStats.html">New York</A>

These are called relative links because you are specifying the path to the linked file relative to the location of the current file. You can also use the absolute pathname (the complete URL) of the file, but relative links are more efficient in accessing a server.

Pathnames use the standard UNIX syntax. The UNIX syntax for the parent directory (the directory that contains the current directory) is "..".

If you were in the NYStats.html file and were referring to the original document US.html, your link would look like this:

    <A HREF="../US.html">United States</A>
In general, you should use relative links because:
  1. it's easier to move a group of documents to another location (because the relative path names will still be valid)
  2. it's more efficient connecting to the server
  3. there is less to type

However, use absolute pathnames when linking to documents that are not directly related. For example, consider a group of documents that make up a user manual. Links within this group should be relative links. Links to other documents (perhaps a reference to related software) should use absolute pathnames. This way, if you move the user manual to a different directory, none of the links have to be updated.
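The way a relative link is combined with the location of the current document can be sketched in Python with the standard library's urljoin. The server name and the /st/ and /archive/ directories below are assumptions made up for illustration; only the file names come from the examples above.

```python
from urllib.parse import urljoin

# Hypothetical location of the original document (an assumption).
us_page = "http://www.edu/st/US.html"

# A relative link is resolved against the document that contains it.
ny_page = urljoin(us_page, "AtlanticStates/NYStats.html")
print(ny_page)

# ".." climbs to the parent directory of the current document.
back_to_us = urljoin(ny_page, "../US.html")
print(back_to_us)

# If the whole group of documents moves, the same relative link still works.
moved_us_page = "http://www.edu/archive/US.html"  # hypothetical new location
moved_ny_page = urljoin(moved_us_page, "AtlanticStates/NYStats.html")
print(moved_ny_page)
```

The last call shows why relative links survive a move: nothing in the link itself names the directory that changed.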

BASE

Found in the <HEAD> section of the code, this element is used to record the URL of the document itself. If BASE is specified, partial URLs within the document are treated as relative to the BASE address.

If the BASE element is absent, the document viewer assumes the base URL to be the one used to access the document.

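<HEAD>
  <BASE HREF="http://www.edu/stuff/dir/">
  ...
</HEAD>
<BODY>
  <A HREF="junk.html">...</A>
</BODY>
Here junk.html's full URL is "http://www.edu/stuff/dir/junk.html". Note the trailing slash in the BASE URL: without it, "dir" is treated as a file name and the link resolves to "http://www.edu/stuff/junk.html" instead.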
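As a rough check of how a partial URL is combined with a base address, the following Python sketch uses the standard library's urljoin, which follows the usual relative-URL resolution rules. It is an illustration only, not a description of any particular browser's code.

```python
from urllib.parse import urljoin

# Base address ending in a slash: "dir" is treated as a directory.
with_slash = urljoin("http://www.edu/stuff/dir/", "junk.html")
print(with_slash)

# Base address without the slash: the last segment is replaced.
without_slash = urljoin("http://www.edu/stuff/dir", "junk.html")
print(without_slash)
```

The two results differ, so it is worth checking that a BASE address naming a directory ends with a slash.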

NAME

Particular places in an HTML document can be marked as specific destinations of hypertext links via the NAME attribute. For example, suppose a place in a document is marked via the anchor
<A NAME="proj1">Project 1</A>
From within this document we can create a hypertext link to this place by specifying the anchor:
<A HREF="#proj1">(see Project 1)</A>
If we wanted to reference this place from another document in the same directory we would put
<A HREF="document.html#proj1">(see Project 1)</A>
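The part after the "#" is handled separately from the rest of the URL: the document is located first, and the named place is then looked up inside it. A small Python sketch of that split, assuming a hypothetical current document at http://www.edu/st/file.html:

```python
from urllib.parse import urldefrag, urljoin

current = "http://www.edu/st/file.html"  # hypothetical current document

# A link that is only "#name" points into the current document.
same_doc = urljoin(current, "#proj1")
print(same_doc)

# A link with a file name and "#name" points into another document.
other_doc = urljoin(current, "document.html#proj1")
print(other_doc)

# Separating the document URL from the named place inside it.
document_url, place = urldefrag(other_doc)
print(document_url, place)
```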

URLs

HyperText Transfer Protocol (HTTP)

HTTP is the Internet protocol specifically designed for use with the World Wide Web, and thus will be the most common scheme you are likely to use. Its syntax is:

http://<host>:<port>/<path>?<searchpart>

The host is the Internet address of the WWW server, and the port is the port number to connect to. In most cases, the port can be omitted (along with the preceding colon), and it defaults to the standard "80". The path tells the WWW server which file you want, and if omitted, indicates that you want the "home page" for the system. The searchpart may be used to pass information to the server, often to an executable CGI script, but for most WWW documents is not used. Generally, this part of the URL is omitted, along with the preceding question-mark.
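To see the four parts of the syntax in practice, here is a minimal Python sketch that splits an HTTP URL with the standard library's urlsplit. The helper name describe_http_url and the second URL (with a port and a searchpart) are assumptions for illustration; the defaults for a missing port and path follow the description above.

```python
from urllib.parse import urlsplit

def describe_http_url(url: str) -> dict:
    """Split an HTTP URL into host, port, path and searchpart."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        # An omitted port means the standard port 80.
        "port": parts.port if parts.port is not None else 80,
        # An omitted path means the server's home page.
        "path": parts.path or "/",
        "searchpart": parts.query,
    }

print(describe_http_url("http://www.cs.buffalo.edu/~kempton"))
# Hypothetical URL showing every part of the syntax:
print(describe_http_url("http://www.edu:8080/cgi-bin/find?word=anchor"))
```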

Example:

   <A HREF="http://www.cs.buffalo.edu/~kempton">Jill's homepage</a>

Gopher Protocol (Gopher)

The Gopher protocol syntax is very similar to FTP and HTTP:

gopher://<host>:<port>/<gopher-path>

The host indicates the Internet address of the Gopher server, while the port, as in the previous cases, can generally be omitted along with its preceding colon. The gopher-path specifies the type of Gopher resource, a selector string, and perhaps other information. A detailed discussion of Gopher queries is not within the scope of this document, but generally you can determine a document's gopher-path from information provided by your browser.

Example:
   <A HREF="gopher://ubunix.acsu.buffalo.edu">Gopher at UB</a>

Mailto

You can make it easy for a reader to send electronic mail to a specific person or mail alias by using a mailto URL in a hyperlink. The format is:

<A HREF="mailto:emailinfo@host">Name</a>
For example, enter:

<A HREF="mailto:kempton@acsu.buffalo.edu">Mail Me </a>

to create a link that opens a mail window already configured to send mail to Jill Marie Kempton.
(You, of course, will enter another mail address!)
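If you generate pages from a list of addresses, the same anchor can be built by a small helper. This is a sketch only: the function name mailto_link is made up for illustration, and it simply fills in the format shown above, escaping the link text so that characters such as "<" or "&" do not break the markup.

```python
from html import escape
from urllib.parse import quote

def mailto_link(address: str, text: str) -> str:
    """Build an anchor whose HREF is a mailto URL for the given address."""
    # Keep "@" readable; percent-encode anything unsafe in a URL.
    href = "mailto:" + quote(address, safe="@")
    return f'<A HREF="{href}">{escape(text)}</A>'

print(mailto_link("kempton@acsu.buffalo.edu", "Mail Me"))
```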

File Transfer Protocol (FTP)

FTP is a well-used means for transmitting files over the Internet. While there are many advantages to using HTTP instead, many systems don't offer full support of HTTP and clients are not as well developed as they are for FTP. Thus, many times files are distributed via FTP. Its syntax is:

ftp://<user>:<password>@<host>:<port>/<cwd1>/<cwd2>/.../<cwdN>/<name>

When contacting a site that provides general FTP access, the user and password can be omitted, along with the colon between them and the at-symbol afterwards. The host is the Internet address of the FTP site. The port and its preceding colon can be omitted as well. The portion "<cwd1>/<cwd2>/.../<cwdN>" refers to the series of "change directory" commands a client must use to move to the directory in which the desired file resides. The name is the filename of the desired file.
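The pieces of an FTP URL can be pulled apart the same way. In this Python sketch the helper name describe_ftp_url, the user name, password, port number and directory names are all invented for illustration; only the host name comes from this lesson.

```python
from urllib.parse import urlsplit

def describe_ftp_url(url: str) -> dict:
    """Split an FTP URL into login, host, directory steps and file name."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    return {
        "user": parts.username,      # None when omitted
        "password": parts.password,  # None when omitted
        "host": parts.hostname,
        "port": parts.port,          # None means the client's default port
        "cwd": segments[:-1],        # one "change directory" step per entry
        "name": segments[-1] if segments else None,
    }

# Hypothetical URL using every part of the syntax:
print(describe_ftp_url("ftp://guest:secret@ftp.cs.buffalo.edu:2121/pub/docs/readme.txt"))
# General access: user, password, port and path all omitted.
print(describe_ftp_url("ftp://ftp.cs.buffalo.edu"))
```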

Example:

   <A HREF="ftp://ftp.cs.buffalo.edu">FTP armstrong </a>

Usenet News (News)

The News URL scheme allows for the referencing of Usenet newsgroups or specific articles. The syntax is either of the following:

news:<newsgroup-name>
news:<message-id>

The newsgroup-name is the Usenet newsgroup name (e.g. comp.infosystems.www.providers) and generally will tell the browser to retrieve the titles of all the available articles within that newsgroup. If the newsgroup-name is "*", the URL refers to "all available newsgroups." The message-id corresponds to the Message-ID of the specific article to obtain, and can be found within the article's header information.

Note that the News URL does not specify how a client is to obtain this information. A client must be properly configured to know where to obtain Usenet newsgroups and articles, generally from a specific NNTP server.

Example:
   <A HREF="news:sunyab.cs.305">News for cs305</a>

Telnet to Remote Host (Telnet)

The Telnet URL designates an interactive session to a remote host on the Internet via the Telnet protocol. Its syntax is:

telnet://<user>:<password>@<host>:<port>/

The user and password tokens can be omitted, and are included only for advisory purposes. The host refers to the site to connect to, and port can be omitted, defaulting to the standard "23".

Example:
   <A HREF="telnet://armstrong.cs.buffalo.edu">Telnet to armstrong </a>

Inline Images

Most Web browsers can display inline images (that is, images next to text) that are in X Bitmap (XBM), GIF, or JPEG format. Other image formats are being incorporated into Web browsers [e.g., the Portable Network Graphic (PNG) format]. Each image takes time to process and slows down the initial display of a document. Carefully select your images and the number of images in a document.

To include an inline image, enter:

    <IMG SRC=ImageName>

where ImageName is the URL of the image file.

The syntax for <IMG SRC> URLs is identical to that used in an anchor HREF. If the image file is a GIF file, then the filename part of ImageName must end with .gif. Filenames of X Bitmap images must end with .xbm; JPEG image files must end with .jpg or .jpeg; and Portable Network Graphic files must end with .png.

Image Size Attributes

You should include two other attributes on <IMG> tags to tell your browser the size of the images it is downloading with the text. The HEIGHT and WIDTH attributes let your browser set aside the appropriate space (in pixels) for the images as it downloads the rest of the file. (Get the pixel size from your image-processing software, such as Adobe Photoshop.)

For example, to include a self-portrait image in a file along with the portrait's dimensions, enter:

    <IMG SRC="goofy.gif" HEIGHT=500 WIDTH=150> or
    <IMG SRC="goofy.gif" HEIGHT=200 WIDTH=650>

in comparison to:

    <IMG SRC="goofy.gif"> or
    <IMG SRC="goofy.gif" HEIGHT="100%" WIDTH="100%">

NOTE: Some browsers use the HEIGHT and WIDTH attributes to stretch or shrink an image to fit into the allotted space when the image does not exactly match the attribute numbers. Not all browser developers think stretching/shrinking is a good idea. So don't plan on your readers having access to this feature. Check your dimensions and use the correct ones.
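If you have many GIF images, the pixel size can also be read from the file itself instead of from an image editor. The Python sketch below relies on the GIF file layout (a six-byte signature followed by the width and height as 16-bit little-endian numbers); the function names gif_size and img_tag are made up for illustration, and the sample bytes stand in for the start of a real file.

```python
import struct

def gif_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) in pixels from the first bytes of a GIF file."""
    if len(data) < 10 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Width and height follow the signature as little-endian 16-bit values.
    width, height = struct.unpack("<HH", data[6:10])
    return width, height

def img_tag(src: str, data: bytes) -> str:
    """Build an IMG tag that carries the image's real dimensions."""
    width, height = gif_size(data)
    return f'<IMG SRC="{src}" HEIGHT={height} WIDTH={width}>'

# Stand-in for the first ten bytes of a 150x500 GIF file.
sample_header = b"GIF89a" + struct.pack("<HH", 150, 500)
print(img_tag("goofy.gif", sample_header))
```

For a real file you would pass the bytes read from disk, for example the result of open("goofy.gif", "rb").read(10).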

Aligning Images

You have some flexibility when displaying images. You can have images separated from text and aligned to the left or right or centered. Or you can have an image aligned with text. Try several possibilities to see how your information looks best.

Aligning Text with an Image
By default the bottom of an image is aligned with the text that follows it. You can align text with the top or middle of an image using the ALIGN attribute values TOP and MIDDLE.

This text is aligned with the top of the image (<IMG SRC="home.gif" ALIGN=TOP>). Notice how the browser aligns only one line and then jumps to the bottom of the image for the rest of the text.

And this text is centered on the image (<IMG SRC="home.gif" ALIGN=MIDDLE>). Again, only one line of text is centered; the rest is below the image.

Images without Text
To display an image without any associated text (e.g., your organization's logo), make it a separate paragraph. Use the paragraph ALIGN= attribute to center the image or adjust it to the right side of the window as shown below:

<p ALIGN=CENTER>
<IMG SRC = "home.gif">
</p>
which results in:

The image is centered; this paragraph starts below it and is left-justified.

Alternate Text for Images

Some World Wide Web browsers--primarily those that run on VT100 terminals--cannot display images. Some users turn off image loading even if their software can display images (especially if they are using a modem or have a slow connection). HTML provides a mechanism to tell readers what they are missing on your pages.

The ALT attribute lets you specify text to be displayed instead of an image. For example:

    <IMG SRC="UpArrow.gif" ALT="Up">

where UpArrow.gif is the picture of an upward pointing arrow. With graphics-capable viewers that have image-loading turned on, you see the up arrow graphic. With a VT100 browser or if image-loading is turned off, the word Up is shown in your window.

You should try to include alternate text for each image you use in your document, which is a courtesy for your readers.

Background Graphics

Newer versions of Web browsers can load an image and use it as a background when displaying a page. Some people like background images and some don't. In general, if you want to include a background, make sure your text can be read easily when displayed on top of the image.

Background images can be a texture (linen finished paper, for example) or an image of an object (a logo possibly). You create the background image as you do any image.

However, you only have to create a small piece of the image. Using a feature called tiling, a browser takes the image and repeats it across and down to fill your browser window. In short, you generate one image, and the browser replicates it enough times to fill your window. This happens automatically when you use the BACKGROUND attribute shown below.

A background image is specified with the BACKGROUND attribute of the <BODY> tag:

<BODY BACKGROUND="filename.gif">
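The arithmetic behind tiling is simple, and it shows how small a background image can be. The following Python sketch counts how many copies are needed to cover a window; the window and tile sizes are example numbers only, and partial tiles at the right and bottom edges are counted as whole copies.

```python
import math

def tiles_needed(window_w: int, window_h: int, tile_w: int, tile_h: int):
    """Return (copies across, copies down, total copies) to cover a window."""
    across = math.ceil(window_w / tile_w)
    down = math.ceil(window_h / tile_h)
    return across, down, across * down

# Example: a 64x64 pixel texture in a 640x480 pixel window.
print(tiles_needed(640, 480, 64, 64))
```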

Comments

The definition of an HTML comment is basically as follows:

A comment declaration starts with <!, followed by zero or more comments, followed by >. A comment starts and ends with "--", and does not contain any occurrence of "--".
This means that the following are all legal comments:
  1. <!-- Hello -->
  2. <!-- Hello -- -- Hello-->
  3. <!---->
  4. <!------ Hello -->
  5. <!>
Note that an "empty" comment tag, with just "--" characters, should always have a multiple of four "-" characters to be legal. (And yes, <!> is also a legal comment - it's the empty comment).

Not all HTML parsers get this right. For example, "<!------> hello-->" is a legal comment, as you can verify with the rule above. It is a comment tag with two comments; the first is empty and the second one contains "> hello". If you try it in a browser, you will find that the text is displayed on screen.

There are two possible reasons for this:

  1. The browser sees the ">" character and thinks the comment ends there.
  2. The browser sees the "-->" text and thinks the comment ends there.
There is also a problem with the "--" sequence. Some people have a habit of using things like "<!-------------->" as separators in their source. Unfortunately, in most cases, the number of "-" characters is not a multiple of four. This means that a browser that tries to get it right will get it wrong here and hide the rest of the document.

For this reason, use the following simple rule to compose valid and accepted comments:

An HTML comment begins with "<!--", ends with "-->" and does not contain "--" or ">" anywhere in the comment.
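Both rules can be written down as short checks. The Python sketch below is one reading of the definitions in this section, not a full SGML parser: is_legal_comment follows the formal definition (allowing white space after each comment, as in the second legal example above), and is_safe_comment follows the simple rule. Both function names are made up for illustration.

```python
def is_legal_comment(text: str) -> bool:
    """Formal rule: "<!", then zero or more "--...--" comments, then ">"."""
    if not (text.startswith("<!") and text.endswith(">")):
        return False
    body = text[2:-1]
    i = 0
    while i < len(body):
        if not body.startswith("--", i):
            return False              # something other than a comment
        end = body.find("--", i + 2)  # the next "--" closes this comment
        if end == -1:
            return False              # comment opened but never closed
        i = end + 2
        while i < len(body) and body[i].isspace():
            i += 1                    # white space between comments
    return True

def is_safe_comment(text: str) -> bool:
    """Simple rule: "<!--" ... "-->" with no "--" or ">" in between."""
    if len(text) < 7 or not (text.startswith("<!--") and text.endswith("-->")):
        return False
    inner = text[4:-3]
    return "--" not in inner and ">" not in inner

for sample in ["<!-- Hello -->", "<!------> hello-->", "<!" + "-" * 14 + ">"]:
    print(repr(sample), is_legal_comment(sample), is_safe_comment(sample))
```

A comment that passes is_safe_comment also passes is_legal_comment, which is why the simple rule is the safer one to follow.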
