http://www.w3.org/TR/html4/intro/intro.html

2 Introduction to HTML 4

[...]  



2.1 What is the World Wide Web?

The World Wide Web (Web) is a network of information resources. The Web relies on three mechanisms to make these resources readily available to the widest possible audience:

  1. A uniform naming scheme for locating resources on the Web (e.g., URIs).
  2. Protocols, for access to named resources over the Web (e.g., HTTP).
  3. Hypertext, for easy navigation among resources (e.g., HTML).

The ties between the three mechanisms are apparent throughout this specification.

2.1.1 Introduction to URIs

Every resource available on the Web -- HTML document, image, video clip, program, etc. -- has an address that may be encoded by a Uniform Resource Identifier, or "URI".

URIs typically consist of three pieces:

  1. The naming scheme of the mechanism used to access the resource.
  2. The name of the machine hosting the resource.
  3. The name of the resource itself, given as a path.

Consider the URI that designates the W3C Technical Reports page:

   http://www.w3.org/TR

This URI may be read as follows: There is a document available via the HTTP protocol (see [RFC2616]), residing on the machine www.w3.org, accessible via the path "/TR". Other schemes you may see in HTML documents include "mailto" for email and "ftp" for FTP.
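The three-piece reading above can be checked mechanically. The following is a minimal sketch, assuming Python's standard urllib.parse module as the splitting tool; the variable names are illustrative and not part of the specification.

```python
from urllib.parse import urlsplit

# The URI of the W3C Technical Reports page, as given in the text.
parts = urlsplit("http://www.w3.org/TR")

scheme = parts.scheme      # naming scheme of the access mechanism
machine = parts.hostname   # machine hosting the resource
path = parts.path          # name of the resource, given as a path

print(scheme, machine, path)
```

Note that urlsplit also exposes query and fragment components, which are empty for this URI.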

Here is another example of a URI. This one refers to a user's mailbox:

   ...this is text...
   For all comments, please send email to
   <A href="mailto:joe@someplace.com">Joe Cool</A>.
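As a rough illustration of how a program might recover the mailbox URI from this fragment, the sketch below uses Python's standard html.parser and urllib.parse modules. The class name LinkCollector is hypothetical, and the approach is only one of several possible ones.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

FRAGMENT = """...this is text...
For all comments, please send email to
<A href="mailto:joe@someplace.com">Joe Cool</A>.
"""

class LinkCollector(HTMLParser):
    """Collects the href value of every A element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)

collector = LinkCollector()
collector.feed(FRAGMENT)
collector.close()

mailbox = urlsplit(collector.hrefs[0])
print(mailbox.scheme, mailbox.path)
```

A "mailto" URI names no host machine, so the network-location component is empty and the address appears in the path.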

Note. Most readers may be familiar with the term "URL" and not the term "URI". URLs form a subset of the more general URI naming scheme.




http://www.w3.org/TR/html4/intro/sgmltut.html

3 On SGML and HTML

[...]   

This section of the document introduces SGML and discusses its relationship to HTML. A complete discussion of SGML is left to the standard (see [ISO8879]).

3.1 Introduction to SGML

SGML is a system for defining markup languages. Authors mark up their documents by representing structural, presentational, and semantic information alongside content. HTML is one example of a markup language. Here is an example of an HTML document:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>My first HTML document</TITLE>
</HEAD>
<BODY>
<P>Hello world!
</BODY>
</HTML>

An HTML document is divided into a head section (here, between <HEAD> and </HEAD>) and a body (here, between <BODY> and </BODY>). The title of the document appears in the head (along with other information about the document), and the content of the document appears in the body. The body in this example contains just one paragraph, marked up with <P>.
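The head/body division described above can be observed with a small parser. This is a sketch only, assuming Python's standard html.parser module; the class SectionCollector and its attributes are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser

DOCUMENT = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>My first HTML document</TITLE>
</HEAD>
<BODY>
<P>Hello world!
</BODY>
</HTML>
"""

class SectionCollector(HTMLParser):
    """Groups non-blank text by the section (head or body) it appears in."""

    def __init__(self):
        super().__init__()
        self.section = None
        self.text = {"head": [], "body": []}

    def handle_starttag(self, tag, attrs):
        # Tag names arrive in lower case, even though the source uses upper case.
        if tag in self.text:
            self.section = tag

    def handle_endtag(self, tag):
        if tag == self.section:
            self.section = None

    def handle_data(self, data):
        if self.section and data.strip():
            self.text[self.section].append(data.strip())

collector = SectionCollector()
collector.feed(DOCUMENT)
collector.close()
print(collector.text)
```

The paragraph text is collected even though the P element has no end tag in the source, since the sketch keys only on the HEAD and BODY boundaries.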

Each markup language defined in SGML is called an SGML application. An SGML application is generally characterized by:

  1. An SGML declaration. The SGML declaration specifies which characters and delimiters may appear in the application.
  2. A document type definition (DTD). The DTD defines the syntax of markup constructs. The DTD may include additional definitions such as character entity references.
  3. A specification that describes the semantics to be ascribed to the markup. This specification also imposes syntax restrictions that cannot be expressed within the DTD.
  4. Document instances containing data (content) and markup. Each instance contains a reference to the DTD to be used to interpret it.
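Item 4 can be made concrete with the earlier example document, whose first two lines carry the reference to the DTD. The sketch below, assuming Python's standard html.parser and re modules, pulls the two quoted identifiers out of that declaration; DoctypeReader is a hypothetical helper, and the regular expression is a simplification that only handles double-quoted identifiers.

```python
import re
from html.parser import HTMLParser

INSTANCE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>My first HTML document</TITLE>
</HEAD>
<BODY>
<P>Hello world!
</BODY>
</HTML>
"""

class DoctypeReader(HTMLParser):
    """Remembers the text of the document type declaration."""

    def __init__(self):
        super().__init__()
        self.declaration = None

    def handle_decl(self, decl):
        # decl is the declaration text without the surrounding "<!" and ">".
        self.declaration = decl

reader = DoctypeReader()
reader.feed(INSTANCE)
reader.close()

# Simplification: take every double-quoted string in the declaration.
public_id, system_id = re.findall(r'"([^"]*)"', reader.declaration)
print(public_id)
print(system_id)
```

The first identifier is the public name of the DTD and the second is the URI from which it can be retrieved.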

This specification includes an SGML declaration, three document type definitions (see the section on HTML version information for a description of the three), and a list of character references.



http://www.w3.org/XML/

Extensible Markup Language (XML)

Introduction

Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879).


http://www.w3.org/TR/2006/REC-xml-20060816/

Extensible Markup Language (XML) 1.0 (Fourth Edition)

[...]  

Abstract

The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.   

[...]  

1 Introduction

[...]  XML is an application profile or restricted form of SGML, the Standard Generalized Markup Language [ISO 8879]. By construction, XML documents are conforming SGML documents. 

[...]    

C XML and SGML (Non-Normative)

XML is designed to be a subset of SGML, in that every XML document should also be a conforming SGML document. For a detailed comparison of the additional restrictions that XML places on documents beyond those of SGML, see [Clark].
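One of the additional restrictions is easy to demonstrate: SGML applications such as HTML 4 may allow certain end tags to be omitted (the earlier example leaves out the end tag for P), whereas XML requires every element to be closed explicitly. The sketch below, assuming Python's standard xml.etree.ElementTree module, checks two markup strings for XML well-formedness; the function name and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET

def is_well_formed_xml(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# End tag for P omitted, as HTML 4 permits.
sgml_style = "<HTML><BODY><P>Hello world!</BODY></HTML>"

# Same content with every element explicitly closed.
xml_style = "<HTML><BODY><P>Hello world!</P></BODY></HTML>"

print(is_well_formed_xml(sgml_style), is_well_formed_xml(xml_style))
```

This checks well-formedness only; it does not validate either string against a DTD.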