Links in HTML documents

12 Links

HTML offers many of the conventional publishing idioms for rich text and
structured documents, but what separates it from most other markup languages is
its features for hypertext and interactive documents. This section introduces
the link (or hyperlink, or Web link), the basic hypertext construct. A
link is a connection from one Web resource to another. Although a simple
concept, the link has been one of the primary forces driving the success of the
Web.

A
link has two ends — called anchors — and a
direction. The link starts at the “source” anchor and points to the
“destination” anchor, which may be any Web resource (e.g., an image, a video
clip, a sound bite, a program, an HTML document, an element within an HTML
document, etc.).

12.1.1 Visiting a linked resource

The default behavior associated with a link is the retrieval of
another Web resource. This behavior is commonly and implicitly
obtained by selecting the link (e.g., by clicking, through keyboard input,
etc.).

The following HTML excerpt contains , one
whose destination anchor is an HTML document named “chapter2.html” and the
other whose destination anchor is a GIF image in the file “forest.gif”:

<BODY>
...some text...
<P>You'll find a lot more in  <A href="chapter2.html">chapter two</A>. 
See also this <A href="../images/forest.gif">map of the enchanted forest.</A>
</BODY>

By activating these links (by clicking with the mouse, through keyboard
input, voice commands, etc.), users may visit these resources. Note that the href
attribute in each source anchor specifies the address of the destination anchor
with a URI.

The destination anchor of a link may be an element within an HTML document.
The destination anchor must be given an anchor name and any URI addressing this
anchor must include the name as its fragment identifier.

Destination anchors in HTML documents may be specified either by the A
element (naming it with the
name attribute), or by any other element
(naming with the id attribute).

Thus, for example, an author might create a table of contents whose entries
link to header elements H2, H3, etc., in the same document. Using the A element to
create destination anchors, we would write:

<H1>Table of Contents</H1>
<P><A href="#section1">Introduction</A><BR>
<A href="#section2">Some background</A><BR>
<A href="#section2.1">On a more personal note</A><BR>
...the rest of the table of contents...
...the document body...
<H2><A name="section1">Introduction</A></H2>
...section 1...
<H2><A name="section2">Some background</A></H2>
...section 2...
<H3><A name="section2.1">On a more personal note</A></H3>
...section 2.1...

We may achieve the same effect by making the header elements themselves the
anchors:

<H1>Table of Contents</H1>
<P><A href="#section1">Introduction</A><BR>
<A href="#section2">Some background</A><BR>
<A href="#section2.1">On a more personal note</A><BR>
...the rest of the table of contents...
...the document body...
<H2 id="section1">Introduction</H2>
...section 1...
<H2 id="section2">Some background</H2>
...section 2...
<H3 id="section2.1">On a more personal note</H3>
...section 2.1...

By far the most common use of a link is to retrieve another Web
resource, as illustrated in the previous examples. However, authors may insert
links in their documents that express other relationships between resources
than simply “activate this link to visit that related resource”. Links that
express other types of relationships have one or more link types specified in their source anchor.

The roles of a link defined by A or LINK are specified via the rel
and
rev attributes.

For instance, links defined by the LINK element may describe the position
of a document within a series of documents. In the following excerpt, links
within the document entitled “Chapter 5” point to the previous and next
chapters:

<HEAD>
...other head information...
<TITLE>Chapter 5</TITLE>
<LINK rel="prev" href="chapter4.html">
<LINK rel="next" href="chapter6.html">
</HEAD>

The link type of the first link is “prev” and that of the second is “next”
(two of several recognized link types).
Links specified by LINK are not rendered with the document’s
contents, although user agents may render them in other ways (e.g., as
navigation tools).

Even if they are not used for navigation, these links may be interpreted in
interesting ways. For example, a user agent that prints a series of HTML
documents as a single document may use this link information as the basis of
forming a coherent linear document. Further information is given below on using
links for the benefit of search engines.

12.1.3 Specifying anchors and links

Although several HTML elements and attributes create links to other
resources (e.g., the IMG element, the
FORM element, etc.), this chapter discusses links and anchors
created by the LINK and A elements. The LINK element may only appear in the
head of a document. The A element may only appear in the body.

When the
A element’s href attribute is set, the element defines a source
anchor for a link that may be activated by the user to retrieve a Web resource.
The source anchor is the location of the A instance and the destination anchor
is the Web resource.

The retrieved resource may be handled by the user agent in several ways: by
opening a new HTML document in the same user agent window, opening a new HTML
document in a different window, starting a new program to handle the resource,
etc. Since the
A element has content (text, images, etc.), user agents may render
this content in such a way as to indicate the presence of a link (e.g., by
underlining the content).

When the name or id attributes of the A element are set, the element
defines an anchor that may be the destination of other links.

Authors may set the name and href attributes simultaneously in the
same
A instance.

The
LINK element defines a relationship between the current document and
another resource. Although LINK has no content, the relationships it defines may
be rendered by some user agents.

The
title attribute may be set for both A and LINK to
add information about the nature of a link. This information may be spoken by a
user agent, rendered as a tool tip, cause a change in cursor image, etc.

Thus, we may augment a previous example by
supplying a title for each link:

<BODY>
...some text...
<P>You'll find a lot more in <A href="chapter2.html"
       title="Go to chapter two">chapter two</A>.
<A href="./chapter2.html"
       title="Get chapter two.">chapter two</A>. 
See also this <A href="../images/forest.gif"
       title="GIF image of enchanted forest">map of
the enchanted forest.</A>
</BODY>

Since links may point to documents encoded with different character encodings, the A and LINK
elements support the charset attribute. This attribute allows authors to
advise user agents about the encoding of data at the other end of the link.

The hreflang attribute provides user agents with
information about the language of a resource at the end of a link, just as the

lang attribute provides information about the language of an
element’s content or attribute values.

Armed with this additional knowledge, user agents should be able to avoid
presenting “garbage” to the user. Instead, they may either locate resources
necessary for the correct presentation of the document or, if they cannot
locate the resources, they should at least warn the user that the document will
be unreadable and explain the cause.

12.2 The
A element

<!ELEMENT A - - (%inline;)* -(A)       -- anchor -->
<!ATTLIST A
  %attrs;                              -- %coreattrs, %i18n, %events --
  charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  type        %ContentType;  #IMPLIED  -- advisory content type --
  name        CDATA          #IMPLIED  -- named link end --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  hreflang    %LanguageCode; #IMPLIED  -- language code --
  rel         %LinkTypes;    #IMPLIED  -- forward link types --
  rev         %LinkTypes;    #IMPLIED  -- reverse link types --
  accesskey   %Character;    #IMPLIED  -- accessibility key character --
  shape       %Shape;        rect      -- for use with client-side image maps --
  coords      %Coords;       #IMPLIED  -- for use with client-side image maps --
  tabindex    NUMBER         #IMPLIED  -- position in tabbing order --
  onfocus     %Script;       #IMPLIED  -- the element got the focus --
  onblur      %Script;       #IMPLIED  -- the element lost the focus --
  >

Start tag: required, End tag:
required

Attribute definitions

name = cdata [CS]
This attribute names the current anchor so that it may be the destination
of another link. The value of this attribute must be a unique anchor name. The
scope of this name is the current document. Note that this attribute shares the
same name space as the id attribute.
href = uri [CT]
This attribute specifies the location of a Web resource, thus defining a
link between the current element (the source anchor) and the destination anchor
defined by this attribute.
hreflang = langcode [CI]
This attribute specifies the base language of the resource designated by href
and may only be used when href is specified.
type = content-type [CI]
This attribute gives an advisory hint as to the content type of the content
available at the link target address. It allows user agents to opt to use a
fallback mechanism rather than fetch the content if they are advised that they
will get content in a content type they do not support.
Authors who use this attribute take responsibility to manage the risk that
it may become inconsistent with the content available at the link target
address.
For the current list of registered content types, please consult
[MIMETYPES].
rel = link-types [CI]
This attribute describes the relationship from the current document to the
anchor specified by the href attribute. The value of this attribute is a
space-separated list of link types.
rev = link-types [CI]
This attribute is used to describe a reverse link
from the anchor specified by the
href attribute to the current document. The
value of this attribute is a space-separated list of link types.
charset = charset [CI]
This attribute specifies the character encoding of the resource designated
by the link. Please consult the section on character
encodings for more details.

Each
A element defines an anchor

  1. The
    A element’s content defines the position of the anchor.
  2. The
    name attribute names the anchor so that it may be the destination of
    zero or more links (see also anchors with
    id).
  3. The
    href attribute makes this anchor the source anchor of exactly one
    link.

Authors may also create an A element that specifies no anchors, i.e., that
doesn’t specify href,
name, or id. Values for these attributes may be
set at a later time through scripts.

In the example that follows, the
A element defines a link. The source anchor is
the text “W3C Web site” and the destination anchor is “http://www.w3.org/”:

For more information about W3C, please consult the 
<A href="http://www.w3.org/">W3C Web site</A>. 

This link designates the home page of the World Wide Web Consortium. When a
user activates this link in a user agent, the user agent will retrieve the
resource, in this case, an HTML document.

User agents generally render links in such a way as to make them
obvious to users (underlining, reverse video, etc.). The exact rendering
depends on the user agent. Rendering may vary according to whether the user has
already visited the link or not. A possible visual rendering of the previous
link might be:

For more information about W3C, please consult the W3C Web site.
                                                   ~~~~~~~~~~~~

To tell user agents explicitly what the character encoding of the
destination page is, set the
charset attribute:

For more information about W3C, please consult the 
<A href="http://www.w3.org/" charset="ISO-8859-1">W3C Web site</A> 

Suppose we define an anchor named “anchor-one” in the file “one.html”.

...text before the anchor...
<A name="anchor-one">This is the location of anchor one.</A>
...text after the anchor...

This creates an anchor around the text “This is the location of anchor
one.”. Usually, the contents of
A are not rendered in any special way when A
defines an anchor only.

Having defined the anchor, we may link to it from the same or another
document. URIs that designate anchors contain a “#” character followed by the
anchor name (the fragment
identifier). Here are some examples of such URIs:

  • An absolute URI: http://www.mycompany.com/one.html#anchor-one
  • A relative URI: ./one.html#anchor-one or
    one.html#anchor-one
  • When the link is defined in the same document: #anchor-one

Thus, a link defined in the file “two.html” in the same directory as
“one.html” would refer to the anchor as follows:

...text before the link...
For more information, please consult <A href="./one.html#anchor-one"> anchor one</A>.
...text after the link...

The
A element in the following example specifies a link (with href)
and creates a named anchor (with name) simultaneously:

I just returned from vacation! Here's a
<A name="anchor-two" 
   href="http://www.somecompany.com/People/Ian/vacation/family.png">
photo of my family at the lake.</A>.

This example contains a link to a different type of Web resource (a PNG
image). Activating the link should cause the image resource to be retrieved
from the Web (and possibly displayed if the system has been configured to do
so).

Note. User agents should be able to find anchors
created by empty A elements, but some fail to do so. For example, some user
agents may not find the “empty-anchor” in the following HTML fragment:

<A name="empty-anchor"></A>
<EM>...some HTML...</EM>
<A href="#empty-anchor">Link to empty anchor</A>

An anchor name is the value of either the name or id
attribute when used in the context of anchors. Anchor names must observe the
following rules:

  • Uniqueness:
  • String matching:fragment identifiers and
    anchor names must be done by exact (case-sensitive) match.

Thus, the following example is correct with respect to string matching and
must be considered a match by user agents:

<P><A href="#xxx">...</A>
...more document...
<P><A name="xxx">...</A>

ILLEGAL EXAMPLE:

The following example is illegal with respect to uniqueness since the two names
are the same except for case:

<P><A name="xxx">...</A>
<P><A name="XXX">...</A>

Although the following excerpt is legal HTML, the behavior of the user agent
is not defined; some user agents may (incorrectly) consider this a match and
others may not.

<P><A href="#xxx">...</A>
...more document...
<P><A name="XXX">...</A>

Anchor names should be restricted to ASCII
characters. Please consult the appendix for more information about
non-ASCII characters in URI
attribute values.

Links and anchors defined by the
A element must not be nested; an A element
must not contain any other A elements.

Since the DTD defines the
LINK element to be empty, LINK
elements may not be nested either.

The
id attribute may be used to create an anchor at the start tag of any
element (including the A element).

This example illustrates the use of the id attribute to position an anchor in
an
H2 element. The anchor is linked to via the A
element.

You may read more about this in <A href="#section2">Section Two</A>.
...later in the document
<H2 id="section2">Section Two</H2>
...later in the document
<P>Please refer to <A href="#section2">Section Two</A> above
for more details.

The following example names a destination anchor with the id
attribute:

I just returned from vacation! Here's a
<A id="anchor-two">photo of my family at the lake.</A>.

The
id and name attributes share the same
name space. This means that they cannot both
define an anchor with the same name in the same document. It is permissible to
use both attributes to specify an element’s unique identifier for the following
elements:
A, APPLET, FORM,
FRAME,
IFRAME, IMG, and MAP. When
both attributes are used on a single element, their values must be
identical.

ILLEGAL EXAMPLE:

The following excerpt is illegal HTML since these attributes declare the same
name twice in the same document.

<A href="#a1">...</A>
...
<H1 id="a1">
...pages and pages...
<A name="a1"></A>

The following example illustrates that id and name
must be the same when both appear in an element’s start tag:

<P><A name="a1" id="a1" href="#a1">...</A>

Because of its specification in the HTML DTD, the name
attribute may contain
character references. Thus, the value Dürst is a
valid
name attribute value, as is D&uuml;rst . The id
attribute, on the other hand, may not contain character references.

Use id or name? Authors should consider the following
issues when deciding whether to use id or
name for an anchor name:

  • The
    id attribute can act as more than just an anchor name (e.g., style
    sheet selector, processing identifier, etc.).
  • Some older user agents don’t support anchors created with the id
    attribute.
  • The name attribute allows richer anchor names (with entities).

A reference to an unavailable or unidentifiable resource is an error.
Although user agents may vary in how they handle such an error, we recommend
the following behavior:

  • If a user agent cannot locate a linked resource, it should alert the
    user.
  • If a user agent cannot identify the type of a linked resource, it should
    still attempt to process it. It should alert the user and may allow the user to
    intervene and identify the document type.

12.3 Document relationships: the
LINK element

<!ELEMENT LINK - O EMPTY               -- a media-independent link -->
<!ATTLIST LINK
  %attrs;                              -- %coreattrs, %i18n, %events --
  charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  hreflang    %LanguageCode; #IMPLIED  -- language code --
  type        %ContentType;  #IMPLIED  -- advisory content type --
  rel         %LinkTypes;    #IMPLIED  -- forward link types --
  rev         %LinkTypes;    #IMPLIED  -- reverse link types --
  media       %MediaDesc;    #IMPLIED  -- for rendering on these media --
  >

Start tag: required, End tag:
forbidden

This element defines a link. Unlike A, it may only appear in the HEAD
section of a document, although it may appear any number of times. Although LINK
has no content, it conveys relationship information that may be rendered by
user agents in a variety of ways (e.g., a tool-bar with a drop-down menu of
links).

This example illustrates how several LINK definitions may appear in the HEAD
section of a document. The current document is “Chapter2.html”. The rel
attribute specifies the relationship of the linked document with the current
document. The values “Index”, “Next”, and “Prev” are explained in the section
on link types.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
  <TITLE>Chapter 2</TITLE>
  <LINK rel="Index" href="../index.html">
  <LINK rel="Next"  href="Chapter3.html">
  <LINK rel="Prev"  href="Chapter1.html">
</HEAD>
...the rest of the document...

The
rel and rev attributes play complementary roles — the rel
attribute specifies a forward link and the rev attribute specifies a reverse
link.

Consider two documents A and B.

Document A:       <LINK href="docB" rel="foo">

Has exactly the same meaning as:

Document B:       <LINK href="docA" rev="foo">

Both attributes may be specified simultaneously.

When the
LINK element links an external style sheet to a document, the
type attribute specifies the style sheet language and the
media attribute specifies the intended rendering medium or media.
User agents may save time by retrieving from the network only those style
sheets that apply to the current device.

Media types are further
discussed in the section on style sheets.

Authors may use the LINK element to provide a variety of information to
search engines, including:

  • Links to alternate versions of a document, written in another human
    language.
  • Links to alternate versions of a document, designed for different media,
    for instance a version especially suited for printing.
  • Links to the starting page of a collection of documents.

The examples below illustrate how language information, media types, and
link types may be combined to improve document handling by search engines.

In the following example, we use the hreflang attribute to tell search
engines where to find Dutch, Portuguese, and Arabic versions of a document.
Note the use of the charset attribute for the Arabic manual. Note also the
use of the
lang attribute to indicate that the value of the title
attribute for the LINK element designating the French manual is in French.

<HEAD>
<TITLE>The manual in English</TITLE>
<LINK title="The manual in Dutch"
      type="text/html"
      rel="alternate"
      hreflang="nl" 
      href="http://someplace.com/manual/dutch.html">
<LINK title="The manual in Portuguese"
      type="text/html"
      rel="alternate"
      hreflang="pt" 
      href="http://someplace.com/manual/portuguese.html">
<LINK title="The manual in Arabic"
      type="text/html"
      rel="alternate"
      charset="ISO-8859-6"
      hreflang="ar" 
      href="http://someplace.com/manual/arabic.html">
<LINK lang="fr" title="La documentation en Fran&ccedil;ais"
      type="text/html"
      rel="alternate"
      hreflang="fr"
      href="http://someplace.com/manual/french.html">
</HEAD>

In the following example, we tell search engines where to find the printed
version of a manual.

<HEAD>
<TITLE>Reference manual</TITLE>
<LINK media="print" title="The manual in postscript"
      type="application/postscript"
      rel="alternate"
      href="http://someplace.com/manual/postscript.ps">
</HEAD>

In the following example, we tell search engines where to find the front
page of a collection of documents.

<HEAD>
<TITLE>Reference manual -- Page 5</TITLE>
<LINK rel="Start" title="The first page of the manual"
      type="text/html"
      href="http://someplace.com/manual/start.html">
</HEAD>

Further information is given in the notes in the appendix on helping search engines index your Web
site.

12.4 Path information: the BASE element

<!ELEMENT BASE - O EMPTY               -- document base URI -->
<!ATTLIST BASE
  href        %URI;          #REQUIRED -- URI that acts as base URI --
  >

Start tag: required, End tag:
forbidden

Attribute definitions

href = uri [CT]
This attribute specifies an absolute URI that acts as the base URI for
resolving relative URIs.

Attributes defined elsewhere

  • target (target
    frame information)

In HTML, links and references to external images, applets, form-processing
programs, style sheets, etc. are always specified by a URI. Relative URIs are
resolved according to a base URI, which
may come from a variety of sources. The BASE element allows authors to specify a document’s base URI explicitly.

When present, the BASE element must appear in the HEAD
section of an HTML document, before any element that refers to an external
source. The path information specified by the BASE element only affects URIs in
the document where the element appears.

For example, given the following BASE declaration and A
declaration:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
 <HEAD>
   <TITLE>Our Products</TITLE>
   <BASE href="http://www.aviary.com/products/intro.html">
 </HEAD>

 <BODY>
   <P>Have you seen our <A href="../cages/birds.gif">Bird Cages</A>?
 </BODY>
</HTML>

the relative URI “../cages/birds.gif” would resolve to:

http://www.aviary.com/cages/birds.gif

User agents must calculate the base URI for resolving relative URIs
according to [RFC1808], section 3. The following describes how
[RFC1808] applies specifically to HTML.

User agents must calculate the base URI according to the following
precedences (highest priority to lowest):

  1. The base URI is set by the
    BASE element.
  2. The base URI is given by meta data discovered during a protocol
    interaction, such as an HTTP header (see [RFC2616]).
  3. By default, the base URI is that of the current document. Not all HTML
    documents have a base URI (e.g., a valid HTML document may appear in an email
    and may not be designated by a URI). Such HTML documents are considered
    erroneous if they contain relative URIs and rely on a default base URI.

Additionally, the OBJECT and APPLET elements define attributes that
take precedence over the value set by the BASE element. Please consult the
definitions of these elements for more information about URI issues specific to
them.

Note. For versions of HTTP that define a Link header,
user agents should handle these headers exactly as LINK
elements in the document. HTTP 1.1 as defined by [RFC2616] does not
include a Link header field (refer to section 19.6.3).