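The World Wide Web (Web) is a network of information resources. The Web relies on three mechanisms to make these resources readily available to the widest possible audience:
1. A uniform naming scheme for locating resources on the Web (e.g., URIs).
2. Protocols, for access to named resources over the Web (e.g., HTTP).
3. Hypertext, for easy navigation among resources (e.g., HTML).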
The ties between the three mechanisms are apparent throughout this specification.
Every resource available on the Web -- HTML document, image, video clip, program, etc. -- has an address that may be encoded by a Uniform Resource Identifier, or "URI".
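URIs typically consist of three pieces:
1. The naming scheme of the mechanism used to access the resource.
2. The name of the machine hosting the resource.
3. The name of the resource itself, given as a path.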
Consider the URI that designates the W3C Technical Reports page:
http://www.w3.org/TR
This URI may be read as follows: There is a document available via the HTTP protocol (see [RFC2616]), residing on the machine www.w3.org, accessible via the path "/TR". Other schemes you may see in HTML documents include "mailto" for email and "ftp" for FTP.
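The reading above can be sketched in code. The following is a minimal illustration, assuming Python's standard `urllib.parse` module; the helper name `uri_pieces` is hypothetical and not part of this specification.

```python
from urllib.parse import urlsplit


def uri_pieces(uri):
    """Hypothetical helper: return (scheme, machine, path) for a URI string."""
    parts = urlsplit(uri)
    # hostname is None when the URI names no machine (for example, "mailto").
    return parts.scheme, parts.hostname, parts.path


if __name__ == "__main__":
    print(uri_pieces("http://www.w3.org/TR"))
    print(uri_pieces("mailto:joe@someplace.com"))
```

For the Technical Reports URI this yields the scheme "http", the machine "www.w3.org", and the path "/TR". A "mailto" URI has a scheme and an address but no machine name, so not every URI carries all three pieces.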
Here is another example of a URI. This one refers to a user's mailbox:
...this is text...
For all comments, please send email to
<A href="mailto:joe@someplace.com">Joe Cool</A>.
Note. Most readers may be familiar with the term "URL" and not the term "URI". URLs form a subset of the more general URI naming scheme.
This section of the document introduces SGML and discusses its relationship to HTML. A complete discussion of SGML is left to the standard (see [ISO8879]).
SGML is a system for defining markup languages. Authors mark up their documents by representing structural, presentational, and semantic information alongside content. HTML is one example of a markup language. Here is an example of an HTML document:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>My first HTML document</TITLE>
</HEAD>
<BODY>
<P>Hello world!
</BODY>
</HTML>
An HTML document is divided into a head section (here, between <HEAD> and </HEAD>) and a body (here, between <BODY> and </BODY>). The title of the document appears in the head (along with other information about the document), and the content of the document appears in the body. The body in this example contains just one paragraph, marked up with <P>.
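The head/body division described above can be observed with a small program. This is a rough sketch, assuming Python's standard `html.parser` module; the class `HeadBodySplitter` and the function `split_head_body` are hypothetical names introduced only for illustration, and the parser is a lenient event-based one, not a validating SGML parser.

```python
from html.parser import HTMLParser

DOCUMENT = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>My first HTML document</TITLE>
</HEAD>
<BODY>
<P>Hello world!
</BODY>
</HTML>"""


class HeadBodySplitter(HTMLParser):
    """Hypothetical helper: collects the text seen inside HEAD and BODY."""

    def __init__(self):
        super().__init__()
        self.section = None  # "head", "body", or None when outside both
        self.text = {"head": [], "body": []}

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        if tag in ("head", "body"):
            self.section = tag

    def handle_endtag(self, tag):
        if tag == self.section:
            self.section = None

    def handle_data(self, data):
        # Ignore whitespace-only runs between tags.
        if self.section and data.strip():
            self.text[self.section].append(data.strip())


def split_head_body(source):
    parser = HeadBodySplitter()
    parser.feed(source)
    parser.close()
    return parser.text


if __name__ == "__main__":
    print(split_head_body(DOCUMENT))
```

Run on the example document, the title text is collected under the head and the single paragraph under the body, even though the P element has no end tag.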
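Each markup language defined in SGML is called an SGML application. An SGML application is generally characterized by:
1. An SGML declaration.
2. A document type definition (DTD).
3. A specification that describes the semantics to be ascribed to the markup.
4. Document instances containing data (content) and markup.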
This specification includes an SGML declaration, three document type definitions (see the section on HTML version information for a description of the three), and a list of character references.
[...] XML is an application profile or restricted form of SGML, the Standard Generalized Markup Language [ISO 8879]. By construction, XML documents are conforming SGML documents.
XML is designed to be a subset of SGML, in that every XML document should also be a conforming SGML document. For a detailed comparison of the additional restrictions that XML places on documents beyond those of SGML, see [Clark].
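One such restriction can be shown concretely: XML requires every element to be closed explicitly, whereas the HTML example earlier omits the end tag of its P element. The sketch below assumes Python's standard `xml.etree.ElementTree` module; the function name `is_well_formed_xml` is hypothetical, and the check covers XML well-formedness only, not SGML conformance.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(fragment):
    """Hypothetical helper: True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # End tag for P present: acceptable to an XML parser.
    print(is_well_formed_xml("<BODY><P>Hello world!</P></BODY>"))
    # End tag for P omitted, as in the HTML example: rejected by an XML parser.
    print(is_well_formed_xml("<BODY><P>Hello world!</BODY>"))
```

The first fragment is accepted and the second is rejected, which illustrates how XML narrows what SGML-based HTML permits.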