
    <h3 class="no-num no-toc">Stability</h3>

    <p>Different parts of this specification are at different levels of

    maturity.</p>

    <div id="stability"></div>

    <p class="big-issue">Some of the more major known issues are marked

    like this. There are many other issues that have been raised as

    well; the issues given in this document are not the only known

    issues! Also, firing of events needs to be unified (right now some

    bubble, some don't, they all use different text to fire events, we

    don't have an official queueing mechanism, etc).</p>

    <h2 class="no-num no-toc" id="contents">Table of contents</h2>

    <!--toc-->

    <hr>

    <h2 id="introduction">Introduction</h2>

    <h3>Background</h3>

    <p><em>This section is non-normative.</em></p>

    <p>The World Wide Web's markup language has always been HTML. HTML

    was primarily designed as a language for semantically describing

    scientific documents, although its general design and adaptations

    over the years have enabled it to be used to describe a number of

    other types of documents.</p>

    <p>The main area that has not been adequately addressed by HTML is a

    vague subject referred to as Web Applications. This specification

    attempts to rectify this, while at the same time updating the HTML

    specifications to address issues raised in the past few years.</p>

    <h3>Scope</h3>

    <p><em>This section is non-normative.</em></p>

    <p>This specification is limited to providing a semantic-level

    markup language and associated semantic-level scripting APIs for

    authoring accessible pages on the Web ranging from static documents

    to dynamic applications.</p>

    <p>The scope of this specification does not include providing

    mechanisms for media-specific customization of presentation

    (although default rendering rules for Web browsers are included at

    the end of this specification, and several mechanisms for hooking

    into CSS are provided as part of the language).</p>

    <p>The scope of this specification does not include documenting

    every HTML or DOM feature supported by Web browsers. Browsers

    support many features that are considered to be very bad for

    accessibility or that are otherwise inappropriate. For example, the

    <code>blink</code> element is clearly presentational and authors

    wishing to cause text to blink should instead use CSS.</p>

    <p>The scope of this specification is not to describe an entire

    operating system. In particular, hardware configuration software,

    image manipulation tools, and applications that users would be

    expected to use with high-end workstations on a daily basis are out

    of scope. In terms of applications, this specification is targeted

    specifically at applications that would be expected to be used by

    users on an occasional basis, or regularly but from disparate

    locations, with low CPU requirements. Examples include online

    purchasing systems, searching systems, games (especially multiplayer

    online games), public telephone books or address books,

    communications software (e-mail clients, instant messaging clients,

    discussion software), and document editing software.</p>

    <p>For sophisticated cross-platform applications, there already

    exist several proprietary solutions (such as Mozilla's XUL, Adobe's

    Flash, or Microsoft's Silverlight). These solutions are evolving

    faster than any standards process could follow, and the requirements

    are evolving even faster. These systems are also significantly more

    complicated to specify, and are orders of magnitude more difficult

    to achieve interoperability with, than the solutions described in

    this document. Platform-specific solutions for such sophisticated

    applications (for example the Mac OS X Core APIs) are even further

    ahead.</p>

    <h3>Relationships to other specifications</h3>

    <h4>Relationship to HTML 4.01 and DOM2 HTML</h4>

    <p><em>This section is non-normative.</em></p>

    <p>This specification represents a new version of HTML4, along with

    a new version of the associated DOM2 HTML API. Migration from HTML4

    to the format and APIs described in this specification should in

    most cases be straightforward, as care has been taken to ensure that

    backwards-compatibility is retained. <a

    href="#refsHTML4">[HTML4]</a> <a

    href="#refsDOM2HTML">[DOM2HTML]</a></p>

    <h4>Relationship to XHTML 1.x</h4>

    <p><em>This section is non-normative.</em></p>

    <p>This specification is intended to replace XHTML 1.0 as the

    normative definition of the XML serialization of the HTML

    vocabulary. <a href="#refsXHTML10">[XHTML10]</a></p>

    <p>While this specification updates the semantics and requirements

    of the vocabulary defined by XHTML Modularization 1.1 and used by

    XHTML 1.1, it does not attempt to provide a replacement for the

    modularization scheme defined and used by those (and other)

    specifications, and therefore cannot be considered a complete

    replacement for them. <a href="#refsXHTMLMOD">[XHTMLMOD]</a> <a

    href="#refsXHTML11">[XHTML11]</a></p>

    <p>Thus, authors and implementors who do not need such a

    modularization scheme can consider this specification a replacement

    for XHTML 1.x, but those who do need such a mechanism are encouraged

    to continue using the XHTML 1.1 line of specifications.</p>

    <h4>Relationship to XHTML2</h4>

    <p><em>This section is non-normative.</em></p>

    <p>XHTML2 <a href="#refsXHTML2">[XHTML2]</a> defines a new HTML

    vocabulary with better features for hyperlinks, multimedia content,

    annotating document edits, rich metadata, declarative interactive

    forms, and describing the semantics of human literary works such as

    poems and scientific papers.</p>

    <p>However, it lacks elements to express the semantics of many of

    the non-document types of content often seen on the Web. For

    instance, forum sites, auction sites, search engines, online shops,

    and the like, do not fit the document metaphor well, and are not

    covered by XHTML2.</p>

    <p><em>This</em> specification aims to extend HTML so that it is

    also suitable in these contexts.</p>

    <p>XHTML2 and this specification use different namespaces and

    therefore can both be implemented in the same XML processor.</p>

    <h4>Relationship to Web Forms 2.0 and XForms</h4>

    <p><em>This section is non-normative.</em></p>

    <p>This specification will eventually supplant Web Forms 2.0. The

    current Web Forms 2.0 draft can be considered part of this

    specification for the time being; its features will eventually be

    merged into this specification. <a href="#refsWF2">[WF2]</a></p>

    <p>As it stands today, this specification is unrelated and

    orthogonal to XForms. When the forms features defined in HTML4 and

    Web Forms 2.0 are merged into this specification, then the

    relationship to XForms described in the Web Forms 2.0 draft will

    apply to this specification. <a href="#refsXForms">[XForms]</a></p>

    <h4>Relationship to XUL, Flash, Silverlight, and other proprietary UI languages</h4>

    <p><em>This section is non-normative.</em></p>

    <p>This specification is independent of the various proprietary UI

    languages that various vendors provide. As an open, vendor-neutral

    language, HTML provides a solution to the same problems without

    the risk of vendor lock-in.</p>

    <h3>HTML vs XHTML</h3>

    <p><em>This section is non-normative.</em></p>

    <p>This specification defines an abstract language for describing

    documents and applications, and some APIs for interacting with

    in-memory representations of resources that use this language.</p>

    <p>The in-memory representation is known as "DOM5 HTML", or "the

    DOM" for short.</p>

    <p>There are various concrete syntaxes that can be used to transmit

    resources that use this abstract language, two of which are defined

    in this specification.</p>

    <p>The first such concrete syntax is "HTML5". This is the format

    recommended for most authors. It is compatible with all legacy Web

    browsers. If a document is transmitted with the MIME type <code

    title="">text/html</code>, then it will be processed as an "HTML5"

    document by Web browsers.</p>

    <p>The second concrete syntax uses XML, and is known as

    "XHTML5". When a document is transmitted with an XML MIME type, such

    as <code title="">application/xhtml+xml</code>, then Web browsers

    process it with an XML processor and treat it as an "XHTML5"

    document. Authors are reminded that the processing for XML and HTML

    differs; in particular, even minor syntax errors will prevent an XML

    document from being rendered fully, whereas they would be ignored in

    the "HTML5" syntax.</p>

    <p>The "DOM5 HTML", "HTML5", and "XHTML5" representations cannot all

    represent the same content. For example, namespaces cannot be

    represented using "HTML5", but they are supported in "DOM5 HTML" and

    "XHTML5". Similarly, documents that use the <code>noscript</code>

    feature can be represented using "HTML5", but cannot be represented

    with "XHTML5" and "DOM5 HTML". Comments that contain the string

    "<code title="">--&gt;</code>" can be represented in "DOM5 HTML" but

    not in "HTML5" and "XHTML5". And so forth.</p>

    <h3>Structure of this specification</h3>

    <p><em>This section is non-normative.</em></p>

    <p>This specification is divided into the following major

    sections:</p>

    <dl>

     <dt><a href="#infrastructure">Common Infrastructure</a></dt>

     <dd>The conformance classes, algorithms, definitions, and the

     common underpinnings of the rest of the specification.</dd>

     <dt><a href="#dom">The DOM</a></dt>

     <dd>Documents are built from elements. These elements form a tree

     using the DOM. This section defines the features of this DOM, as

     well as introducing the features common to all elements, and the

     concepts used in defining elements.</dd>

     <dt><a href="#semantics">Elements</a></dt>

     <dd>Each element has a predefined meaning, which is explained in

     this section. User agent requirements for how to handle each

     element are also given, along with rules for authors on how to use

     the element.</dd>

     <dt><a href="#browsers">Web Browsers</a></dt>

     <dd>HTML documents do not exist in a vacuum &mdash; this section

     defines many of the features that affect environments that deal

     with multiple pages, links between pages, and running scripts.</dd>

     <dt><a href="#editing">User Interaction</a></dt>

     <dd>HTML documents can provide a number of mechanisms for users to

     interact with and modify content, which are described in this

     section.</dd>

     <dt><a href="#comms">The Communication APIs</a></dt>

     <dd>Applications written in HTML often require mechanisms to

     communicate with remote servers, as well as communicating with

     other applications from different domains running on the same

     client.</dd>

     <dt><a href="#repetition">Repetition Templates</a></dt>

     <dd>A mechanism to support repeating sections in forms.</dd>

     <dt><a href="#syntax">The Language Syntax</a></dt>

     <dd>All of these features would be for naught if they couldn't be

     represented in a serialized form and sent to other people, and so

     this section defines the syntax of HTML, along with rules for how

     to parse HTML.</dd>

    </dl>

    <p>There are also a couple of appendices, defining <a

    href="#rendering">rendering rules</a> for Web browsers and listing

    <a href="#no">areas that are out of scope</a> for this

    specification.</p>

    <h4>How to read this specification</h4>

    <p>This specification should be read like all other specifications.

    First, it should be read cover-to-cover, multiple times. Then, it

    should be read backwards at least once. Then it should be read by

    picking random sections from the contents list and following all the

    cross-references.</p>

    <h4>Typographic conventions</h4>

    <p>This is a definition, requirement, or explanation.</p>

    <p class="note">This is a note.</p>

    <p class="example">This is an example.</p>

    <p class="big-issue">This is an open issue.</p>

    <p class="warning">This is a warning.</p>

    <p>The defining instance of a term is marked up like <dfn

    title="x-this">this</dfn>. Uses of that term are marked up like

    <span title="x-this">this</span> or like <i

    title="x-this">this</i>.</p>

    <p>The defining instance of an element, attribute, or API is marked

    up like <dfn title="x-that"><code>this</code></dfn>. References to

    that element, attribute, or API are marked up like <code

    title="x-that">this</code>.</p>

    <p>Other code fragments are marked up <code title="">like

    this</code>.</p>

    <p>Variables are marked up like <var title="">this</var>.</p>

    <pre class="idl">interface <dfn title="">Example</dfn> {

    // this is an IDL definition

  };</pre>

    <h2 id="infrastructure">Common infrastructure</h2>

    <h3>Terminology</h3>

    <p>This specification refers to both HTML and XML attributes and DOM

    attributes, often in the same context. When it is not clear which is

    being referred to, they are referred to as <dfn>content

    attributes</dfn> for HTML and XML attributes, and <dfn>DOM

    attributes</dfn> for those from the DOM. Similarly, the term

    "properties" is used for both ECMAScript object properties and CSS

    properties. When these are ambiguous they are qualified as object

    properties and CSS properties respectively.</p>

    <p>The term <span>HTML documents</span> is sometimes used in

    contrast with <span>XML documents</span> to specifically mean

    documents that were parsed using an <span>HTML parser</span> (as

    opposed to using an XML parser or created purely through the

    DOM).</p>

    <p>Generally, when the specification states that a feature applies

    to HTML or XHTML, it also includes the other. When a feature

    specifically only applies to one of the two languages, it is called

    out by explicitly stating that it does not apply to the other

    format, as in "for HTML, ... (this does not apply to XHTML)".</p>

    <p>This specification uses the term <em>document</em> to

    refer to any use of HTML, ranging from short static documents to

    long essays or reports with rich multimedia, as well as to

    fully-fledged interactive applications.</p>

    <p>For simplicity, terms such as <em>shown</em>, <em>displayed</em>,

    and <em>visible</em> might sometimes be used when referring to the

    way a document is rendered to the user. These terms are not meant to

    imply a visual medium; they must be considered to apply to other

    media in equivalent ways.</p>

    <p>Some of the algorithms in this specification, for historical

    reasons, require the user agent to <dfn>pause</dfn> until some

    condition has been met. While a user agent is paused, it must ensure

    that no scripts execute (e.g. no event handlers, no timers,

    etc). User agents should remain responsive to user input while

    paused, however, albeit without letting the user interact with Web

    pages where that would involve invoking any script.</p>

    <h4>XML</h4>

    <p id="html-namespace">To ease migration from HTML to XHTML, UAs

    conforming to this specification will place elements in HTML in the

    <code>http://www.w3.org/1999/xhtml</code> namespace, at least for

    the purposes of the DOM and CSS. The term "<dfn>elements in the HTML

    namespace</dfn>", or "<dfn>HTML elements</dfn>" for short, when used

    in this specification, thus refers to both HTML and XHTML

    elements.</p>

    <p>Unless otherwise stated, all elements defined or mentioned in

    this specification are in the

    <code>http://www.w3.org/1999/xhtml</code> namespace, and all

    attributes defined or mentioned in this specification have no

    namespace (they are in the per-element partition).</p>

    <p>When an XML name, such as an attribute or element name, is

    referred to in the form <code><var title="">prefix</var>:<var

    title="">localName</var></code>, as in <code>xml:id</code> or

    <code>svg:rect</code>, it refers to a name with the local name <var

    title="">localName</var> and the namespace given by the prefix, as

    defined by the following table:</p>

    <dl>

     <dt><code title="">xml</code></dt>

     <dd><code>http://www.w3.org/XML/1998/namespace</code></dd>

     <dt><code title="">html</code></dt>

     <dd><code>http://www.w3.org/1999/xhtml</code></dd>

     <dt><code title="">svg</code></dt>

     <dd><code>http://www.w3.org/2000/svg</code></dd>

    </dl>
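
    <p class="example">As a rough, non-normative sketch of how the table above might be applied, the following ECMAScript splits a prefixed name and looks up its namespace. The names <code title="">PREFIX_NAMESPACES</code> and <code title="">resolvePrefixedName</code> are hypothetical and are not defined by this specification.</p>

```javascript
// Prefix table copied from the list above.
const PREFIX_NAMESPACES = new Map([
  ["xml", "http://www.w3.org/XML/1998/namespace"],
  ["html", "http://www.w3.org/1999/xhtml"],
  ["svg", "http://www.w3.org/2000/svg"],
]);

// Hypothetical helper: resolves "prefix:localName" to a namespace and local name.
// Returns null when the name has no prefix or the prefix is not in the table.
function resolvePrefixedName(name) {
  const colon = name.indexOf(":");
  if (colon === -1) return null;
  const namespace = PREFIX_NAMESPACES.get(name.slice(0, colon));
  if (namespace === undefined) return null;
  return { namespace, localName: name.slice(colon + 1) };
}
```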

    <p>Attribute names are said to be <dfn>XML-compatible</dfn> if they

    match the <a href="http://www.w3.org/TR/REC-xml/#NT-Name"><code

    title="">Name</code></a> production defined in XML, they contain no

    U+003A COLON (:) characters, and their first three characters are

    not an <span>ASCII case-insensitive</span> match for the string

    "<code title="">xml</code>". <a href="#refsXML">[XML]</a></p>

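
    <p class="example">The three conditions in the definition of XML-compatible attribute names can be sketched in ECMAScript as follows. This is an illustration only: it assumes the <code title="">Name</code> production as given in XML 1.0 (Fifth Edition), and the function name <code title="">isXmlCompatible</code> is hypothetical.</p>

```javascript
// Character ranges for NameStartChar, assuming XML 1.0 (Fifth Edition).
const NAME_START_CHARS = String.raw`:A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\u{10000}-\u{EFFFF}`;
// Additional characters allowed by NameChar (the trailing "-" is a literal hyphen).
const NAME_EXTRA_CHARS = String.raw`.0-9\u00B7\u0300-\u036F\u203F\u2040-`;
const XML_NAME = new RegExp(
  `^[${NAME_START_CHARS}][${NAME_START_CHARS}${NAME_EXTRA_CHARS}]*$`,
  "u"
);

// Hypothetical helper implementing the three conditions from the definition.
function isXmlCompatible(name) {
  if (!XML_NAME.test(name)) return false;      // must match the Name production
  if (name.includes(":")) return false;        // must contain no U+003A COLON
  return !/^[xX][mM][lL]/.test(name);          // must not start with "xml" (ASCII case-insensitive)
}
```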
    <h4>DOM trees</h4>

    <p>The term <dfn>root element</dfn>, when not explicitly qualified

    as referring to the document's root element, means the furthest

    ancestor element node of whatever node is being discussed, or the

    node itself if it has no ancestors. When the node is a part of the

    document, then that is indeed the document's root element; however,

    if the node is not currently part of the document tree, the root

    element will be an orphaned node.</p>
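
    <p class="example">One possible reading of this definition, sketched in ECMAScript: walk up the <code title="">parentNode</code> chain for as long as the parent is an element. The function name <code title="">rootElementOf</code> is hypothetical, and the sketch assumes that a node whose only ancestor is a non-element node (such as the document itself) is its own root.</p>

```javascript
const ELEMENT_NODE = 1; // DOM Core node type constant for elements

// Hypothetical helper: returns the furthest ancestor element of a node,
// or the node itself if it has no element ancestors.
function rootElementOf(node) {
  let current = node;
  while (true) {
    const parent = current.parentNode;
    // Stop at the top of the tree, or when the parent is not an element
    // (for example, when the parent is the Document node).
    if (!parent || parent.nodeType !== ELEMENT_NODE) break;
    current = parent;
  }
  return current;
}
```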

    <p>An element is said to have been <dfn title="insert an element

    into a document">inserted into a document</dfn> when its <span>root

    element</span> changes and is now the document's <span>root

    element</span>.</p>

    <p>The term <dfn>tree order</dfn> means a pre-order, depth-first

    traversal of DOM nodes involved (through the <code

    title="">parentNode</code>/<code title="">childNodes</code>

    relationship).</p>
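
    <p class="example">As an illustration, a pre-order, depth-first traversal can be written as a short recursive generator. The sketch below uses plain objects with a <code title="">childNodes</code> array as stand-ins for DOM nodes; the names <code title="">treeOrder</code>, <code title="">tree</code>, and <code title="">visited</code> are hypothetical.</p>

```javascript
// Yields a node, then each of its subtrees in child order
// (pre-order, depth-first).
function* treeOrder(node) {
  yield node;
  for (const child of node.childNodes) {
    yield* treeOrder(child);
  }
}

// Plain-object stand-ins for DOM nodes (an assumption for illustration).
const tree = {
  name: "html",
  childNodes: [
    { name: "head", childNodes: [{ name: "title", childNodes: [] }] },
    { name: "body", childNodes: [{ name: "p", childNodes: [] }] },
  ],
};

const visited = Array.from(treeOrder(tree), (n) => n.name);
// visited is ["html", "head", "title", "body", "p"]
```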

    <p>When it is stated that some element or attribute is <dfn

    title="ignore">ignored</dfn>, or treated as some other value, or

    handled as if it was something else, this refers only to the

    processing of the node after it is in the DOM. A user agent must not

    mutate the DOM in such situations.</p>

    <p>The term <dfn>text node</dfn> refers to any <code>Text</code>

    node, including <code>CDATASection</code> nodes; specifically, any

    <code>Node</code> with node type <code title="">TEXT_NODE</code> (3)

    or <code title="">CDATA_SECTION_NODE</code> (4). <a

    href="#refsDOM3CORE">[DOM3CORE]</a></p>

    <h4>Scripting</h4>

    <p>The construction "a <code>Foo</code> object", where

    <code>Foo</code> is actually an interface, is sometimes used instead

    of the more accurate "an object implementing the interface

    <code>Foo</code>".</p>

    <p>A DOM attribute is said to be <em>getting</em> when its value is

    being retrieved (e.g. by author script), and is said to be

    <em>setting</em> when a new value is assigned to it.</p>

    <p>If a DOM object is said to be <dfn>live</dfn>, then that means

    that any attributes returning that object must always return the

    same object (not a new object each time), and the attributes and

    methods on that object must operate on the actual underlying data,

    not a snapshot of the data.</p>
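
    <p class="example">The difference between a live object and a snapshot can be illustrated without a real DOM. In the sketch below, the classes <code title="">LiveList</code> and <code title="">Owner</code> are hypothetical stand-ins: the getter always returns the same object, and that object reads through to the underlying data.</p>

```javascript
// Hypothetical live view over an array owned by some other object.
class LiveList {
  constructor(backing) {
    this.backing = backing; // shared reference, not a copy
  }
  get length() {
    return this.backing.length;
  }
  item(index) {
    return index in this.backing ? this.backing[index] : null;
  }
}

// Hypothetical owner whose "children" attribute is live.
class Owner {
  constructor() {
    this.data = [];
    this.view = new LiveList(this.data);
  }
  get children() {
    return this.view; // the same object on every get
  }
  append(value) {
    this.data.push(value);
  }
}

const owner = new Owner();
const list = owner.children;
const snapshot = Array.from(owner.data); // a copy, for contrast
owner.append("a");
// list.length is now 1 and owner.children === list,
// while snapshot.length stays 0.
```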

    <p>The terms <em>fire</em> and <em>dispatch</em> are used

    interchangeably in the context of events, as in the DOM Events

    specifications. <a href="#refsDOM3EVENTS">[DOM3EVENTS]</a></p>

    <h4>Plugins</h4>

    <p>The term <dfn>plugin</dfn> is used to mean any content handler,

    typically a third-party content handler, for Web content types that

    are not supported by the user agent natively, or for content types

    that do not expose a DOM, that supports rendering the content as

    part of the user agent's interface.</p>

    <p class="example">One example of a plugin would be a PDF viewer

    that is instantiated in a <span>browsing context</span> when the

    user navigates to a PDF file. This would count as a plugin

    regardless of whether the party that implemented the PDF viewer

    component was the same as that which implemented the user agent

    itself. However, a PDF viewer application that launches separately

    from the user agent (as opposed to using the same interface) is not

    a plugin by this definition.</p>

    <p class="note">This specification does not define a mechanism for

    interacting with plugins, as it is expected to be user-agent- and

    platform-specific. Some UAs might opt to support a plugin mechanism

    such as the Netscape Plugin API; others might use remote content

    converters or have built-in support for certain types. <a

    href="#refsNPAPI">[NPAPI]</a></p>

    <p class="warning">Browsers should take extreme care when

    interacting with external content intended for <span

    title="plugin">plugins</span>. When third-party software is run with

    the same privileges as the user agent itself, vulnerabilities in the

    third-party software become as dangerous as those in the user

    agent.</p>

    <h3>Conformance requirements</h3>

    <p>All diagrams, examples, and notes in this specification are

    non-normative, as are all sections explicitly marked non-normative.

    Everything else in this specification is normative.</p>

    <p>The key words "MUST", "MUST NOT", "REQUIRED", <!--"SHALL", "SHALL

    NOT",--> "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and

    "OPTIONAL" in the normative parts of this document are to be

    interpreted as described in RFC2119. For readability, these words do

    not appear in all uppercase letters in this specification. <a

    href="#refsRFC2119">[RFC2119]</a></p> <!-- XXX but they should be

    marked up -->

    <p>Requirements phrased in the imperative as part of algorithms

    (such as "strip any leading space characters" or "return false and

    abort these steps") are to be interpreted with the meaning of the

    key word ("must", "should", "may", etc) used in introducing the

    algorithm.</p>

    <p>This specification describes the conformance criteria for user

    agents (relevant to implementors) and documents (relevant to

    authors and authoring tool implementors).</p>

    <p class="note">There is no implied relationship between document

    conformance requirements and implementation conformance

    requirements. User agents are not free to handle non-conformant

    documents as they please; the processing model described in this

    specification applies to implementations regardless of the

    conformity of the input documents.</p>

    <!-- put this list into its own section -->

    <p>User agents fall into several (overlapping) categories with

    different conformance requirements.</p>

    <dl>

     <dt id="interactive">Web browsers and other interactive user agents</dt>

     <dd>

      <p>Web browsers that support <span>XHTML</span> must process

      elements and attributes from the <span>HTML namespace</span> found

      in <span>XML documents</span> as described in this specification,

      so that users can interact with them, unless the semantics of

      those elements have been overridden by other specifications.</p>

      <p class="example">A conforming XHTML processor would, upon

      finding an XHTML <code>script</code> element in an XML document,

      execute the script contained in that element. However, if the

      element is found within an XSLT transformation sheet (assuming the

      UA also supports XSLT), then the processor would instead treat the

      <code>script</code> element as an opaque element that forms part

      of the transform.</p>

      <p>Web browsers that support <span title="HTML5">HTML</span> must

      process documents labeled as <code>text/html</code> as described

      in this specification, so that users can interact with them.</p>

     </dd>

     <dt id="non-interactive">Non-interactive presentation user agents</dt>

     <dd>

      <p>User agents that process HTML and XHTML documents purely to

      render non-interactive versions of them must comply with the same

      conformance criteria as Web browsers, except that they are exempt

      from requirements regarding user interaction.</p>

      <p class="note">Typical examples of non-interactive presentation

      user agents are printers (static UAs) and overhead displays

      (dynamic UAs). It is expected that most static non-interactive

      presentation user agents will also opt to <a

      href="#non-scripted">lack scripting support</a>.</p>

      <p class="example">A non-interactive but dynamic presentation UA

      would still execute scripts, allowing forms to be dynamically

      submitted, and so forth. However, since the concept of "focus" is

      irrelevant when the user cannot interact with the document, the UA

      would not need to support any of the focus-related DOM APIs.</p>

     </dd>

     <dt><dfn id="non-scripted">User agents with no scripting support</dfn></dt>

     <dd>

      <p>Implementations that do not support scripting (or which have

      their scripting features disabled entirely) are exempt from

      supporting the events and DOM interfaces mentioned in this

      specification. For the parts of this specification that are

      defined in terms of an events model or in terms of the DOM, such

      user agents must still act as if events and the DOM were

      supported.</p>

      <p class="note">Scripting can form an integral part of an

      application. Web browsers that do not support scripting, or that

      have scripting disabled, might be unable to fully convey the

      author's intent.</p>

     </dd>

     <dt>Conformance checkers</dt>

     <dd id="conformance-checkers">

      <p>Conformance checkers must verify that a document conforms to

      the applicable conformance criteria described in this

      specification. Automated conformance checkers are exempt from

      detecting errors that require interpretation of the author's

      intent (for example, while a document is non-conforming if the

      content of a <code>blockquote</code> element is not a quote,

      conformance checkers running without the input of human judgement

      do not have to check that <code>blockquote</code> elements only

      contain quoted material).</p>

      <p>Conformance checkers must check that the input document

      conforms when parsed without a <span>browsing context</span>

      (meaning that no scripts are run, and that the parser's

      <span>scripting flag</span> is disabled), and should also check

      that the input document conforms when parsed with a <span>browsing

      context</span> in which scripts execute, and that the scripts

      never cause non-conforming states to occur other than transiently

      during script execution itself. (This is only a "SHOULD" and not a

      "MUST" requirement because it has been proven to be impossible. <a

      href="#refsHALTINGPROBLEM">[HALTINGPROBLEM]</a>)</p> <!-- XXX

      [Computable] On computable numbers, with an application to the

      Entscheidungsproblem. Alan M. Turing. In Proceedings of the London

      Mathematical Society, series 2, volume 42, pages 230-265. London

      Mathematical Society,

      1937. http://www.turingarchive.org/browse.php/B/12 (referenced:

      2007-03-03) -->

      <p>The term "HTML5 validator" can be used to refer to a

      conformance checker that itself conforms to the applicable

      requirements of this specification.</p>

      <div class="note">

       <p>XML DTDs cannot express all the conformance requirements of

       this specification. Therefore, a validating XML processor and a

       DTD cannot constitute a conformance checker. Also, since neither

       of the two authoring formats defined in this specification is

       an application of SGML, a validating SGML system cannot constitute

       a conformance checker either.</p>

       <p>To put it another way, there are three types of conformance

       criteria:</p>

       <ol>

        <li>Criteria that can be expressed in a DTD.</li>

        <li>Criteria that cannot be expressed by a DTD, but can still be

        checked by a machine.</li>

        <li>Criteria that can only be checked by a human.</li>

       </ol>

       <p>A conformance checker must check for the first two. A simple

       DTD-based validator only checks for the first class of errors and

       is therefore not a conforming conformance checker according to

       this specification.</p>

      </div>

     </dd>

     <dt>Data mining tools</dt>

     <dd id="data-mining">

      <p>Applications and tools that process HTML and XHTML documents

      for reasons other than to either render the documents or check

      them for conformance should act in accordance with the semantics of

      the documents that they process.</p>

      <p class="example">A tool that generates <span

      title="outline">document outlines</span> but increases the nesting

      level for each paragraph and does not increase the nesting level

      for each section would not be conforming.</p>

     </dd>

     <dt id="editors">Authoring tools and markup generators</dt>

     <dd>

      <p>Authoring tools and markup generators must generate conforming

      documents. Conformance criteria that apply to authors also apply

      to authoring tools, where appropriate.</p>

      <p>Authoring tools are exempt from the strict requirements of

      using elements only for their specified purpose, but only to the

      extent that authoring tools are not yet able to determine author

      intent.</p>

      <p class="example">For example, it is not conforming to use an

      <code>address</code> element for arbitrary contact information;

      that element can only be used for marking up contact information

      for the author of the document or section. However, since an

      authoring tool is likely unable to determine the difference, an

      authoring tool is exempt from that requirement.</p>

      <p class="note">In terms of conformance checking, an editor is

      therefore required to output documents that conform to the same

      extent that a conformance checker will verify.</p>

      <p>When an authoring tool is used to edit a non-conforming

      document, it may preserve the conformance errors in sections of

      the document that were not edited during the editing session

      (i.e. an editing tool is allowed to round-trip erroneous

      content). However, an authoring tool must not claim that the

      output is conformant if errors have been so preserved.</p>

      <p>Authoring tools are expected to come in two broad varieties:

      tools that work from structure or semantic data, and tools that

      work on a What-You-See-Is-What-You-Get media-specific editing

      basis (WYSIWYG).</p>

      <p>The former is the preferred mechanism for tools that author

      HTML, since the structure in the source information can be used to

      make informed choices regarding which HTML elements and attributes

      are most appropriate.</p>

      <p>However, WYSIWYG tools are legitimate. WYSIWYG tools should use

      elements they know are appropriate, and should not use elements

      that they do not know to be appropriate. This might in certain

      extreme cases mean limiting the use of flow elements to just a few

      elements, like <code>div</code>, <code>b</code>, <code>i</code>,

      and <code>span</code> and making liberal use of the <code

      title="attr-style">style</code> attribute.</p>

      <p>All authoring tools, whether WYSIWYG or not, should make a

      best-effort attempt at enabling users to create well-structured,

      semantically rich, media-independent content.</p>

     </dd>

    </dl>

    <p>Some conformance requirements are phrased as requirements on

    elements, attributes, methods or objects. Such requirements fall

    into two categories: those describing content model restrictions,

    and those describing implementation behavior. Requirements in the

    former category are requirements on documents and authoring

    tools. Requirements in the latter category are requirements on user agents.</p>

    <p>Conformance requirements phrased as algorithms or specific steps

    may be implemented in any manner, so long as the end result is

    equivalent. (In particular, the algorithms defined in this

    specification are intended to be easy to follow, and not intended to

    be performant.)</p>

    <p id="hardwareLimitations">User agents may impose

    implementation-specific limits on otherwise unconstrained inputs,

    e.g. to prevent denial of service attacks, to guard against running

    out of memory, or to work around platform-specific limitations.</p>

    <p>For compatibility with existing content and prior specifications,

    this specification describes two authoring formats: one based on XML

    (referred to as <dfn id="xhtml5" title="XHTML">XHTML5</dfn>), and

    one using a <a href="#parsing">custom format</a> inspired by SGML

    (referred to as <dfn id="html5">HTML5</dfn>). Implementations may

    support only one of these two formats, although supporting both is

    encouraged.</p>

    <p id="authors-using-xhtml"><span>XHTML</span> documents (<span>XML

    documents</span> using elements from the <span>HTML

    namespace</span>) that use the new features described in this

    specification and that are served over the wire (e.g. by HTTP) must

    be sent using an XML MIME type such as <code>application/xml</code>

    or <code>application/xhtml+xml</code> and must not be served as

    <code>text/html</code>. <a href="#refsRFC3023">[RFC3023]</a></p>

    <p>Such XML documents may contain a <code>DOCTYPE</code> if desired,

    but this is not required to conform to this specification.</p>

    <p class="note">According to the XML specification, XML processors

    are not guaranteed to process the external DTD subset referenced in

    the DOCTYPE. This means, for example, that using entity references

    for characters in XHTML documents is unsafe (except for <code

    title="">&amp;lt;</code>, <code title="">&amp;gt;</code>, <code

    title="">&amp;amp;</code>, <code title="">&amp;quot;</code> and

    <code title="">&amp;apos;</code>).</p>

    <p id="authors-using-html"><span title="HTML5">HTML

    documents</span>, if they are served over the wire (e.g. by HTTP)

    must be labeled with the <code>text/html</code> MIME type.</p> <!--

    XXX update RFC 2854 -->

    <p id="entity-references">The language in this specification assumes

    that the user agent expands all entity references, and therefore

    does not include entity reference nodes in the DOM. If user agents

    do include entity reference nodes in the DOM, then user agents must

    handle them as if they were fully expanded when implementing this

    specification. For example, if a requirement talks about an

    element's child text nodes, then any text nodes that are children of

    an entity reference that is a child of that element would be used as

    well. Entity references to unknown entities must be treated as if

    they contained just an empty text node for the purposes of the

    algorithms defined in this specification.</p>

    <h4>Dependencies</h4>

    <p>This specification relies on several other underlying

    specifications.</p>

    <dl>

     <dt>XML</dt>

     <dd>

      <p>Implementations that support XHTML5 must support some version

      of XML, as well as its corresponding namespaces specification,

      because XHTML5 uses an XML serialization with namespaces. <a

      href="#refsXML">[XML]</a> <a

      href="#refsXMLNAMES">[XMLNAMES]</a></p>

     </dd>

     <dt>DOM</dt>

     <dd>

      <p>The Document Object Model (DOM) is a representation &mdash; a

      model &mdash; of a document and its content. The DOM is not just

      an API; the conformance criteria of HTML implementations are

      defined, in this specification, in terms of operations on the DOM.

      <a href="#refsDOM3CORE">[DOM3CORE]</a></p>

      <p>Implementations must support some version of DOM Core and DOM

      Events, because this specification is defined in terms of the DOM,

      and some of the features are defined as extensions to the DOM Core

      interfaces. <a href="#refsDOM3CORE">[DOM3CORE]</a> <a

      href="#refsDOM3EVENTS">[DOM3EVENTS]</a></p>

     </dd>

     <dt>ECMAScript</dt>

     <dd>

      <p>Implementations that use ECMAScript to implement the APIs

      defined in this specification must implement them in a manner

      consistent with the ECMAScript Bindings defined in the Web IDL

      specification, as this specification uses that specification's

      terminology. <a href="#refsWebIDL">[WebIDL]</a></p>

     </dd>

     <dt id="mq">Media Queries</dt>

     <dd>

      <p>Implementations must support some version of the Media Queries

      language. <a href="#refsMQ">[MQ]</a></p>

     </dd>

    </dl>

    <p>This specification does not require support of any particular

    network transport protocols, style sheet language, scripting

    language, or any of the DOM and WebAPI specifications beyond those

    described above. However, the language described by this

    specification is biased towards CSS as the styling language,

    ECMAScript as the scripting language, and HTTP as the network

    protocol, and several features assume that those languages and

    protocols are in use.</p>

    <p class="note">This specification might have certain additional

    requirements on character encodings, image formats, audio formats,

    and video formats in the respective sections.</p>

    <h4>Features defined in other specifications</h4>

    <p class="big-issue">This section will be removed at some point.</p>

    <p>Some elements are defined in terms of their DOM

    <dfn><code>textContent</code></dfn> attribute. This is an attribute

    defined on the <code>Node</code> interface in DOM3 Core. <a

    href="#refsDOM3CORE">[DOM3CORE]</a></p>

    <!-- This section is currently here exclusively so that we crossref

    to textContent. XXX also add event-click, event-change,

    event-DOMActivate, etc, here, once DOM3 Events is ready for that,

    and just have the section be a general "defined in other

    specifications" section -->

    <p>The interface <dfn><code>DOMTimeStamp</code></dfn> is defined in

    DOM3 Core. <a href="#refsDOM3CORE">[DOM3CORE]</a></p>

    <p>The term <dfn>activation behavior</dfn> is used as defined in the

    DOM3 Events specification. <a

    href="#refsDOM3EVENTS">[DOM3EVENTS]</a> <span class="big-issue">At

    the time of writing, DOM3 Events hadn't yet been updated to define

    that phrase.</span></p>

    <p id="alternate-style-sheets">The rules for handling alternative

    style sheets are defined in the CSS object model specification. <a

    href="#refsCSSOM">[CSSOM]</a></p>

    <p class="big-issue">See <a

    href="http://dev.w3.org/cvsweb/~checkout~/csswg/cssom/Overview.html?content-type=text/html;%20charset=utf-8">http://dev.w3.org/cvsweb/~checkout~/csswg/cssom/Overview.html?content-type=text/html;%20charset=utf-8</a></p>

    <h4>Common conformance requirements for APIs exposed to

    JavaScript</h4>

    <p class="big-issue">This section will eventually be removed in favor of WebIDL.</p>

    <p class="big-issue">A lot of arrays/lists/<span title="collections">collection</span>s

    in this spec assume zero-based indexes but use the term "<var

    title="">index</var>th" liberally. We should define those to be

    zero-based and be clearer about this.</p>

    <p>Unless otherwise specified, if a DOM attribute that is a floating

    point number type (<code title="">float</code>) is assigned an

    Infinity or Not-a-Number value, a <code

    title="big-issue">NOT_SUPPORTED_ERR</code> exception must be

    raised.</p>

    <p>Unless otherwise specified, if a method with an argument that is a

    floating point number type (<code title="">float</code>) is passed

    an Infinity or Not-a-Number value, a <code

    title="big-issue">NOT_SUPPORTED_ERR</code> exception must be

    raised.</p>

    <!-- XXX DOMB -->

    <p>Unless otherwise specified, if a method is passed fewer

    arguments than is defined for that method in its IDL definition,

    a <code title="big-issue">NOT_SUPPORTED_ERR</code> exception must be

    raised.</p>

    <!-- XXX DOMB -->

    <p>Unless otherwise specified, if a method is passed more arguments than

    is defined for that method in its IDL definition, the excess

    arguments must be ignored.</p>

    <h3>Case-sensitivity</h3>

    <p>This specification defines several comparison operators for

    strings.</p>

    <p>Comparing two strings in a <dfn>case-sensitive</dfn> manner means

    comparing them exactly, codepoint for codepoint.</p>

    <p>Comparing two strings in an <dfn>ASCII case-insensitive</dfn>

    manner means comparing them exactly, codepoint for codepoint, except

    that the characters in the range U+0041 .. U+005A (i.e. LATIN

    CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding

    characters in the range U+0061 .. U+007A (i.e. LATIN SMALL LETTER A

    to LATIN SMALL LETTER Z) are considered to also match.</p>

    <p>Comparing two strings in a <dfn>compatibility caseless</dfn>

    manner means using the Unicode <i>compatibility caseless match</i>

    operation to compare the two strings. <a

    href="#refsUNICODECASE">[UNICODECASE]</a></p> <!-- XXX refs to

    Unicode Standard Annex #21, Case Mappings -->

    <p><dfn title="converted to uppercase">Converting a string to

    uppercase</dfn> means replacing all characters in the range U+0061

    .. U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) with

    the corresponding characters in the range U+0041 .. U+005A

    (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z).</p>

    <p><dfn title="converted to lowercase">Converting a string to

    lowercase</dfn> means replacing all characters in the range U+0041

    .. U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z)

    with the corresponding characters in the range U+0061 .. U+007A

    (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z).</p>
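
    <p class="example">The ASCII-only operations defined above can be
    sketched as follows. This is an illustrative, non-normative sketch
    in Python; the function names <code title="">ascii_lower</code>,
    <code title="">ascii_upper</code> and
    <code title="">ascii_case_insensitive_match</code> are assumptions
    made for this example and are not defined by this specification.
    Only the 26 letters of the basic Latin alphabet are affected; all
    other characters are compared or copied unchanged.</p>

```python
def ascii_lower(s: str) -> str:
    """Replace U+0041..U+005A with U+0061..U+007A; leave everything else alone."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def ascii_upper(s: str) -> str:
    """Replace U+0061..U+007A with U+0041..U+005A; leave everything else alone."""
    return "".join(chr(ord(c) - 0x20) if "a" <= c <= "z" else c for c in s)


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    """Compare codepoint for codepoint, treating A-Z and a-z as matching."""
    return ascii_lower(a) == ascii_lower(b)
```

    <p class="example">These helpers deliberately avoid the language's
    built-in Unicode case mapping, which would also change characters
    outside the two ranges named above.</p>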

    <h3>Common microsyntaxes</h3>

    <p>There are various places in HTML that accept particular data

    types, such as dates or numbers. This section describes the

    conformance criteria for content in those formats, and how to

    parse that content.</p>

    <!-- XXX need to define how to handle U+000A LINE FEED and U+000D

    CARRIAGE RETURN in attributes (for HTML) -->

    <p class="big-issue">Need to go through the whole spec and make sure

    all the attribute values are clearly defined either in terms of

    microsyntaxes or in terms of other specs, or as "Text" or some

    such.</p>

    <h4>Common parser idioms</h4>

    <p>The <dfn title="space character">space characters</dfn>, for the

    purposes of this specification, are U+0020 SPACE, U+0009 CHARACTER

    TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), and

    U+000D CARRIAGE RETURN (CR).</p>

    <p>Some of the micro-parsers described below follow the pattern of

    having an <var title="">input</var> variable that holds the string

    being parsed, and having a <var title="">position</var> variable

    pointing at the next character to parse in <var

    title="">input</var>.</p>

    <p>For parsers based on this pattern, a step that requires the user

    agent to <dfn>collect a sequence of characters</dfn> means that the

    following algorithm must be run, with <var title="">characters</var>

    being the set of characters that can be collected:</p>

    <ol>

     <li><p>Let <var title="">input</var> and <var

     title="">position</var> be the same variables as those of the same

     name in the algorithm that invoked these steps.</p></li>

     <li><p>Let <var title="">result</var> be the empty string.</p></li>

     <li><p>While <var title="">position</var> doesn't point past the

     end of <var title="">input</var> and the character at <var

     title="">position</var> is one of the <var

     title="">characters</var>, append that character to the end of <var

     title="">result</var> and advance <var title="">position</var> to

     the next character in <var title="">input</var>.</p></li>

     <li><p>Return <var title="">result</var>.</p></li>

    </ol>

    <p>The step <dfn>skip whitespace</dfn> means that the user agent

    must <span>collect a sequence of characters</span> that are <span

    title="space character">space characters</span>. The step <dfn>skip

    Zs characters</dfn> means that the user agent must <span>collect a

    sequence of characters</span> that are in the Unicode character

    class Zs. In both cases, the collected characters are not used. <a

    href="#refsUNICODE">[UNICODE]</a></p>
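
    <p class="example">The parser idioms above can be sketched as
    follows. This is a non-normative Python sketch under some
    assumptions: the shared <var title="">position</var> variable is
    modelled by passing the position in and returning the updated
    position, the set of collectable characters is modelled as a
    predicate function, and all function names are hypothetical.</p>

```python
import unicodedata

# The space characters as listed above: SPACE, TAB, LF, FF, CR.
SPACE_CHARACTERS = frozenset(" \t\n\f\r")


def collect_sequence(input_string, position, is_collectable):
    """Collect characters while they satisfy is_collectable.

    Returns (result, new_position) because Python cannot share the
    caller's position variable by reference.
    """
    result = ""
    while position < len(input_string) and is_collectable(input_string[position]):
        result += input_string[position]
        position += 1
    return result, position


def skip_whitespace(input_string, position):
    """Collect space characters and discard them."""
    _, position = collect_sequence(
        input_string, position, lambda c: c in SPACE_CHARACTERS)
    return position


def skip_zs_characters(input_string, position):
    """Collect characters in Unicode class Zs and discard them."""
    _, position = collect_sequence(
        input_string, position, lambda c: unicodedata.category(c) == "Zs")
    return position
```

    <p class="example">Note that the two skipping steps are not
    interchangeable: U+0009 CHARACTER TABULATION is a space character
    but is not in class Zs, while U+00A0 NO-BREAK SPACE is in class Zs
    but is not a space character.</p>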

    <h4>Boolean attributes</h4>

    <p>A number of attributes in HTML5 are <dfn title="boolean

    attribute">boolean attributes</dfn>. The presence of a boolean

    attribute on an element represents the true value, and the absence

    of the attribute represents the false value.</p>

    <p>If the attribute is present, its value must either be the empty

    string or a value that is an <span>ASCII case-insensitive</span>

    match for the attribute's canonical name, with no leading or

    trailing whitespace.</p>
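
    <p class="example">A conformance checker could test these two rules
    roughly as sketched below. This is non-normative Python; the
    function names, and the use of a dictionary to stand in for an
    element's attributes, are assumptions made for illustration.</p>

```python
def ascii_lower(s):
    """Lowercase only the characters A-Z (U+0041..U+005A)."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def boolean_attribute_state(attributes, name):
    """Presence means true, absence means false; the value is irrelevant."""
    return name in attributes


def is_valid_boolean_attribute_value(name, value):
    """The value must be empty or an ASCII case-insensitive match for the name."""
    return value == "" or ascii_lower(value) == ascii_lower(name)
```

    <p class="example">Under this sketch, a value such as
    <code title="">false</code> is not a valid value, and the attribute
    still represents the true value because it is present.</p>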

    <h4>Numbers</h4>

    <h5>Unsigned integers</h5>

    <p>A string is a <dfn>valid non-negative integer</dfn> if it

    consists of one or more characters in the range U+0030 DIGIT ZERO

    (0) to U+0039 DIGIT NINE (9).</p>

    <p>The <dfn>rules for parsing non-negative integers</dfn> are as

    given in the following algorithm. When invoked, the steps must be

    followed in the order given, aborting at the first step that returns

    a value. This algorithm will either return zero, a positive integer,

    or an error. Leading spaces are ignored. Trailing spaces and indeed

    any trailing garbage characters are ignored.</p>

    <ol>

     <li><p>Let <var title="">input</var> be the string being

     parsed.</p></li>

     <li><p>Let <var title="">position</var> be a pointer into <var

     title="">input</var>, initially pointing at the start of the

     string.</p></li>

     <li><p>Let <var title="">value</var> have the value 0.</p></li>

     <li><p><span>Skip whitespace</span>.</p></li>

     <li><p>If <var title="">position</var> is past the end of <var

     title="">input</var>, return an error.</p></li>

     <li><p>If the next character is not one of U+0030 DIGIT ZERO (0)

     .. U+0039 DIGIT NINE (9), then return an error.</p></li>

     <!-- Ok. At this point we know we have a number. It might have

     trailing garbage which we'll ignore, but it's a number, and we

     won't return an error. -->

     <li>

      <p>If the next character is one of U+0030 DIGIT ZERO (0) .. U+0039

      DIGIT NINE (9):</p>

      <ol>

       <li>Multiply <var title="">value</var> by ten.</li>

       <li>Add the value of the current character (0..9) to <var

       title="">value</var>.</li>

       <li>Advance <var title="">position</var> to the next

       character.</li>

       <li>If <var title="">position</var> is not past the end of <var

       title="">input</var>, return to the top of step 7 in the overall

       algorithm (that's the step within which these substeps find

       themselves).</li>

      </ol>

     </li>

     <li><p>Return <var title="">value</var>.</p></li>

    </ol>
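
    <p class="example">The steps above can be sketched in Python as
    follows. This is a non-normative illustration: the function name is
    hypothetical, and the error result is represented by
    <code title="">None</code>, which is an assumption of this sketch
    rather than something this specification defines.</p>

```python
SPACE_CHARACTERS = " \t\n\f\r"
ASCII_DIGITS = "0123456789"


def parse_non_negative_integer(input_string):
    """Return zero or a positive integer, or None to signal an error."""
    position = 0
    value = 0

    # Skip whitespace.
    while position < len(input_string) and input_string[position] in SPACE_CHARACTERS:
        position += 1

    # Nothing left after the whitespace: error.
    if position >= len(input_string):
        return None

    # The first non-space character must be a digit (no sign is allowed).
    if input_string[position] not in ASCII_DIGITS:
        return None

    # Accumulate digits; anything after them is trailing garbage and is ignored.
    while position < len(input_string) and input_string[position] in ASCII_DIGITS:
        value = value * 10 + (ord(input_string[position]) - ord("0"))
        position += 1

    return value
```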

    <h5>Signed integers</h5>

    <p>A string is a <dfn>valid integer</dfn> if it consists of one or

    more characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT

    NINE (9), optionally prefixed with a U+002D HYPHEN-MINUS ("-")

    character.</p>

    <p>The <dfn>rules for parsing integers</dfn> are similar to the

    rules for non-negative integers, and are as given in the following

    algorithm. When invoked, the steps must be followed in the order

    given, aborting at the first step that returns a value. This

    algorithm will either return an integer or an error. Leading spaces

    are ignored. Trailing spaces and trailing garbage characters are

    ignored.</p>

    <ol>

     <li><p>Let <var title="">input</var> be the string being

     parsed.</p></li>

     <li><p>Let <var title="">position</var> be a pointer into <var

     title="">input</var>, initially pointing at the start of the

     string.</p></li>

     <li><p>Let <var title="">value</var> have the value 0.</p></li>

     <li><p>Let <var title="">sign</var> have the value

     "positive".</p></li>

     <li><p><span>Skip whitespace</span>.</p></li>

     <li><p>If <var title="">position</var> is past the end of <var

      title="">input</var>, return an error.</p></li>

     <li>

      <p>If the character indicated by <var title="">position</var> (the

      first character) is a U+002D HYPHEN-MINUS ("-") character:</p>

      <ol>

       <li>Let <var title="">sign</var> be "negative".</li>

       <li>Advance <var title="">position</var> to the next

       character.</li>

       <li>If <var title="">position</var> is past the end of <var

       title="">input</var>, return an error.</li>

      </ol>

     </li>

     <li><p>If the next character is not one of U+0030 DIGIT ZERO (0)

     .. U+0039 DIGIT NINE (9), then return an error.</p></li>

     <!-- Ok. At this point we know we have a number. It might have

     trailing garbage which we'll ignore, but it's a number, and we

     won't return an error. -->

     <li>

      <p>If the next character is one of U+0030 DIGIT ZERO (0) .. U+0039

      DIGIT NINE (9):</p>

      <ol>

       <li>Multiply <var title="">value</var> by ten.</li>

       <li>Add the value of the current character (0..9) to <var

       title="">value</var>.</li>

       <li>Advance <var title="">position</var> to the next

       character.</li>

       <li>If <var title="">position</var> is not past the end of <var

       title="">input</var>, return to the top of step 9 in the overall

       algorithm (that's the step within which these substeps find

       themselves).</li>

      </ol>

     </li>

     <li><p>If <var title="">sign</var> is "positive", return <var

     title="">value</var>, otherwise return 0-<var

     title="">value</var>.</p></li>

    </ol>
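
    <p class="example">The signed variant can be sketched in the same
    way. This is a non-normative Python illustration; the function name
    is hypothetical and <code title="">None</code> is used to represent
    the error result, which is an assumption of this sketch.</p>

```python
SPACE_CHARACTERS = " \t\n\f\r"
ASCII_DIGITS = "0123456789"


def parse_integer(input_string):
    """Return an integer, or None to signal an error."""
    position = 0
    value = 0
    sign = "positive"

    # Skip whitespace.
    while position < len(input_string) and input_string[position] in SPACE_CHARACTERS:
        position += 1
    if position >= len(input_string):
        return None

    # An optional single HYPHEN-MINUS; a lone "-" is an error.
    if input_string[position] == "-":
        sign = "negative"
        position += 1
        if position >= len(input_string):
            return None

    # A digit must follow immediately (so "- 5" and "--5" are errors).
    if input_string[position] not in ASCII_DIGITS:
        return None

    # Accumulate digits; trailing garbage is ignored.
    while position < len(input_string) and input_string[position] in ASCII_DIGITS:
        value = value * 10 + (ord(input_string[position]) - ord("0"))
        position += 1

    return value if sign == "positive" else 0 - value
```

    <p class="example">As in the algorithm, a leading U+002B PLUS SIGN
    is not accepted by this sketch.</p>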

    <h5>Real numbers</h5>

    <p>A string is a <dfn>valid floating point number</dfn> if it

    consists of one or more characters in the range U+0030 DIGIT ZERO

    (0) to U+0039 DIGIT NINE (9), optionally with a single U+002E FULL

    STOP (".") character somewhere (either before these digits, in

    between two digits, or after the digits), all optionally prefixed

    with a U+002D HYPHEN-MINUS ("-") character.</p>

    <p>The <dfn>rules for parsing floating point number values</dfn> are

    as given in the following algorithm. As with the previous

    algorithms, when this one is invoked, the steps must be followed in

    the order given, aborting at the first step that returns a

    value. This algorithm will either return a number or an

    error. Leading spaces are ignored. Trailing spaces and garbage

    characters are ignored.</p>

    <ol>

     <li><p>Let <var title="">input</var> be the string being

     parsed.</p></li>

     <li><p>Let <var title="">position</var> be a pointer into <var

     title="">input</var>, initially pointing at the start of the

     string.</p></li>

     <li><p>Let <var title="">value</var> have the value 0.</p></li>

     <li><p>Let <var title="">sign</var> have the value "positive".</p></li>

     <li><p><span>Skip whitespace</span>.</p></li>

     <li><p>If <var title="">position</var> is past the end of <var

      title="">input</var>, return an error.</p></li>

     <li>

      <p>If the character indicated by <var title="">position</var> (the

      first character) is a U+002D HYPHEN-MINUS ("-") character:</p>

      <ol>

       <li>Let <var title="">sign</var> be "negative".</li>

       <li>Advance <var title="">position</var> to the next

       character.</li>

       <li>If <var title="">position</var> is past the end of <var

       title="">input</var>, return an error.</li>

      </ol>

     </li>

     <li><p>If the next character is not one of U+0030 DIGIT ZERO (0)

     .. U+0039 DIGIT NINE (9) or U+002E FULL STOP ("."), then return an

     error.</p></li>

     <li><p>If the next character is U+002E FULL STOP ("."), but either

     that is the last character or the character after that one is not

     one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), then return

     an error.</p></li>

     <!-- Ok. At this point we know we have a number. It might have

     trailing garbage which we'll ignore, but it's a number, and we

     won't return an error. -->

     <li>

      <p>If the next character is one of U+0030 DIGIT ZERO (0) .. U+0039

      DIGIT NINE (9):</p>

      <ol>

       <li>Multiply <var title="">value</var> by ten.</li>

       <li>Add the value of the current character (0..9) to <var

       title="">value</var>.</li>

       <li>Advance <var title="">position</var> to the next

       character.</li>

       <li>If <var title="">position</var> is past the end of <var

       title="">input</var>, then if <var title="">sign</var> is

       "positive", return <var title="">value</var>, otherwise return

       0-<var title="">value</var>.</li>

       <li>Otherwise return to the top of step 10 in the overall

       algorithm (that's the step within which these substeps find

       themselves).</li>

      </ol>

     </li>

     <li><p>Otherwise, if the next character is not a U+002E FULL STOP

     ("."), then if <var title="">sign</var> is "positive", return <var

     title="">value</var>, otherwise return 0-<var

     title="">value</var>.</p></li>

     <li><p>The next character is a U+002E FULL STOP ("."). Advance <var

     title="">position</var> to the character after that.</p></li>

     <li><p>Let <var title="">divisor</var> be 1.</p></li>

     <li>

      <p>If the next character is one of U+0030 DIGIT ZERO (0) .. U+0039

      DIGIT NINE (9):</p>

      <ol>

       <li>Multiply <var title="">divisor</var> by ten.</li>

       <li>Add the value of the current character (0..9) divided by <var

       title="">divisor</var>, to <var title="">value</var>.</li>

       <li>Advance <var title="">position</var> to the next

       character.</li>

       <li>If <var title="">position</var> is past the end of <var

       title="">input</var>, then if <var title="">sign</var> is

       "positive", return <var title="">value</var>, otherwise return

       0-<var title="">value</var>.</li>

       <li>Otherwise return to the top of step 14 in the overall

       algorithm (that's the step within which these substeps find

       themselves).</li>

      </ol>

     </li>

     <li><p>Otherwise, if <var title="">sign</var> is "positive", return

     <var title="">value</var>, otherwise return 0-<var

     title="">value</var>.</p></li>

    </ol>
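
    <p class="example">The floating point steps above can be sketched
    as follows. This is a non-normative Python illustration with a
    hypothetical function name. Two choices here are assumptions of the
    sketch and not part of the algorithm: the error result is
    represented by <code title="">None</code>, and the running value is
    accumulated as an exact fraction and converted to a float only when
    it is returned, to avoid compounding rounding errors.</p>

```python
from fractions import Fraction

SPACE_CHARACTERS = " \t\n\f\r"
ASCII_DIGITS = "0123456789"


def parse_floating_point_number(input_string):
    """Return a float, or None to signal an error."""
    n = len(input_string)
    position = 0
    value = Fraction(0)
    sign = "positive"

    # Skip whitespace, then require at least one more character.
    while position < n and input_string[position] in SPACE_CHARACTERS:
        position += 1
    if position >= n:
        return None

    # Optional HYPHEN-MINUS.
    if input_string[position] == "-":
        sign = "negative"
        position += 1
        if position >= n:
            return None

    # Must start with a digit, or with "." followed by a digit.
    first = input_string[position]
    if first not in ASCII_DIGITS and first != ".":
        return None
    if first == "." and (position + 1 >= n
                         or input_string[position + 1] not in ASCII_DIGITS):
        return None

    # Integer part.
    while position < n and input_string[position] in ASCII_DIGITS:
        value = value * 10 + int(input_string[position])
        position += 1

    # Optional fractional part; each digit is divided by a growing divisor.
    if position < n and input_string[position] == ".":
        position += 1
        divisor = 1
        while position < n and input_string[position] in ASCII_DIGITS:
            divisor *= 10
            value += Fraction(int(input_string[position]), divisor)
            position += 1

    return float(value if sign == "positive" else 0 - value)
```

    <p class="example">Following the steps literally, a string such as
    <code title="">1.</code> parses as 1, because the check for a digit
    after the full stop applies only when the full stop is the first
    character of the number.</p>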

    <h5>Ratios</h5>

    <p class="note">The algorithms described in this section are used by

    the <code>progress</code> and <code>meter</code> elements.</p>

    <p>A <dfn>valid denominator punctuation character</dfn> is one of

    the characters from the table below. There is <dfn title="values

    associated with denominator punctuation characters">a value

    associated with each denominator punctuation character</dfn>, as

    shown in the table below.</p>

    <table>

     <thead>

      <tr>

       <th colspan="2">Denominator Punctuation Character</th>

       <th>Value</th>

      </tr>

     </thead>

     <tbody>

      <tr>

       <td>U+0025 PERCENT SIGN</td>

       <td>&#x0025;</td>

       <td>100</td>

      </tr>

      <tr>

       <td>U+066A ARABIC PERCENT SIGN</td>

       <td>&#x066A;</td>

       <td>100</td>

      </tr>

      <tr>

       <td>U+FE6A SMALL PERCENT SIGN</td>

       <td>&#xFE6A;</td>

       <td>100</td>

      </tr>

      <tr>

       <td>U+FF05 FULLWIDTH PERCENT SIGN</td>

       <td>&#xFF05;</td>

       <td>100</td>

      </tr>

      <tr>

       <td>U+2030 PER MILLE SIGN</td>

       <td>&#x2030;</td>

       <td>1000</td>

      </tr>

      <tr>

       <td>U+2031 PER TEN THOUSAND SIGN</td>

       <td>&#x2031;</td>

       <td>10000</td>

      </tr>

     </tbody>

    </table>

    <p>The <dfn>steps for finding one or two numbers of a ratio in a

    string</dfn> are as follows:</p>

    <ol>

     <li>If the string is empty, then return nothing and abort these

     steps.</li>

     <li><span>Find a number</span> in the string according to the

     algorithm below, starting at the start of the string.</li>

     <li>If the sub-algorithm in step 2 returned nothing or returned an

     error condition, return nothing and abort these steps.</li>

     <li>Set <var title="">number1</var> to the number returned by the

     sub-algorithm in step 2.</li>

     <li>Starting with the character immediately after the last one

     examined by the sub-algorithm in step 2, skip any characters in the

     string that are in the Unicode character class Zs (this might match

     zero characters). <a href="#refsUNICODE">[UNICODE]</a></li>

     <li>If there are still further characters in the string, and the

     next character in the string is a <span>valid denominator

     punctuation character</span>, set <var title="">denominator</var>

     to that character.</li>

     <li>If the string contains any other characters in the range U+0030

     DIGIT ZERO to U+0039 DIGIT NINE, but <var title="">denominator</var> was

     given a value in step 6, return nothing and abort these

     steps.</li>

     <li>Otherwise, if <var title="">denominator</var> was given a value

     in step 6, return <var title="">number1</var> and <var

     title="">denominator</var> and abort these steps.</li>

     <li><span>Find a number</span> in the string again, starting

     immediately after the last character that was examined by the

     sub-algorithm in step 2.</li>

     <li>If the sub-algorithm in step 9 returned nothing or an error

     condition, return nothing and abort these steps.</li>

     <li>Set <var title="">number2</var> to the number returned by the

     sub-algorithm in step 9.</li>

     <li>Starting with the character immediately after the last one

     examined by the sub-algorithm in step 9, skip any characters in the

     string that are in the Unicode character class Zs (this might match

     zero characters). <a href="#refsUNICODE">[UNICODE]</a></li>

     <li>If there are still further characters in the string, and the

     next character in the string is a <span>valid denominator

     punctuation character</span>, return nothing and abort these

     steps.</li>

     <li>If the string contains any other characters in the range U+0030

     DIGIT ZERO to U+0039 DIGIT NINE, return nothing and abort these

     steps.</li>

     <li>Otherwise, return <var title="">number1</var> and

     <var title="">number2</var>.</li>

    </ol>

    <!-- XXX again, this should say "positive number" -->

    <p>The algorithm to <dfn>find a number</dfn> is as follows. It is

    given a string and a starting position, and returns either nothing,

    a number, or an error condition.</p>

    <ol>

     <li>Starting at the given starting position, ignore all characters

     in the given string until the first character that is either a

     U+002E FULL STOP or one of the ten characters in the range U+0030

     DIGIT ZERO to U+0039 DIGIT NINE.</li>

     <li>If there are no such characters, return nothing and abort these

     steps.</li>

     <li>Starting with the character matched in step 1, collect all the

     consecutive characters that are either a U+002E FULL STOP or one of

     the ten characters in the range U+0030 DIGIT ZERO to U+0039 DIGIT

     NINE, and assign this string of one or more characters to

     <var title="">string</var>.</li>

     <li>If <var title="">string</var> consists of just a single U+002E

     FULL STOP character or if it contains more than one U+002E FULL

     STOP character then return an error condition and abort these

     steps.</li>

     <li>Parse <var title="">string</var> according to the <span>rules

     for parsing floating point number values</span>, to obtain <var

     title="">number</var>. This step cannot fail (<var

     title="">string</var> is guaranteed to be a <span>valid floating

     point number</span>).</li>

     <li>Return <var title="">number</var>.</li>

    </ol>
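
    <p class="example">The two ratio algorithms above can be sketched
    together as follows. This is a non-normative Python illustration
    and rests on several assumptions: all names are hypothetical;
    "nothing" is represented by <code title="">None</code>; the
    position "after the last character examined" is taken to be the
    position just past the collected run of digits and full stops;
    "any other characters" in the digit range is taken to mean digits
    after the number just found; and Python's built-in
    <code title="">float()</code> stands in for the rules for parsing
    floating point number values, which is safe here only because the
    collected string contains nothing but ASCII digits and at most one
    full stop.</p>

```python
import unicodedata

ASCII_DIGITS = "0123456789"
NUMBER_CHARS = ASCII_DIGITS + "."

# Values associated with each denominator punctuation character.
DENOMINATOR_VALUES = {
    "\u0025": 100, "\u066A": 100, "\uFE6A": 100, "\uFF05": 100,
    "\u2030": 1000, "\u2031": 10000,
}


def find_number(string, start):
    """Return (kind, number, end) with kind in 'nothing', 'error', 'number'."""
    position = start
    while position < len(string) and string[position] not in NUMBER_CHARS:
        position += 1
    if position >= len(string):
        return ("nothing", None, position)
    end = position
    while end < len(string) and string[end] in NUMBER_CHARS:
        end += 1
    run = string[position:end]
    if run == "." or run.count(".") > 1:
        return ("error", None, end)
    return ("number", float(run), end)


def skip_zs(string, position):
    while position < len(string) and unicodedata.category(string[position]) == "Zs":
        position += 1
    return position


def has_digit(text):
    return any(c in ASCII_DIGITS for c in text)


def find_ratio_numbers(string):
    """Return (number1, denominator_char), (number1, number2), or None."""
    if string == "":
        return None
    kind, number1, after1 = find_number(string, 0)
    if kind != "number":
        return None
    position = skip_zs(string, after1)
    if position < len(string) and string[position] in DENOMINATOR_VALUES:
        # A denominator was found: no further digits are allowed.
        if has_digit(string[after1:]):
            return None
        return (number1, string[position])
    kind, number2, after2 = find_number(string, after1)
    if kind != "number":
        return None
    position = skip_zs(string, after2)
    if position < len(string) and string[position] in DENOMINATOR_VALUES:
        return None
    if has_digit(string[after2:]):
        return None
    return (number1, number2)
```

    <p class="example">A caller that receives a denominator character
    would look up its associated value in the table, for example
    <code title="">DENOMINATOR_VALUES["%"]</code> in this sketch.</p>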

    <h5 id="percentages-and-dimensions">Percentages and dimensions</h5>

    <p class="big-issue"><dfn>valid positive non-zero integers</dfn>

    <dfn>rules for parsing dimension values</dfn> (only used by

    height/width on img, embed, object &mdash; lengths in css pixels or

    percentages)</p>

    <h5>Lists of integers</h5>

    <p>A <dfn>valid list of integers</dfn> is a number of <span

    title="valid integer">valid integers</span> separated by U+002C

    COMMA characters, with no other characters (e.g. no <span

    title="space character">space characters</span>). In addition, there

    might be restrictions on the number of integers that can be given,

    or on the range of values allowed.</p>

    <p>The <dfn>rules for parsing a list of integers</dfn> are as

    follows:</p>

    <ol>

     <li><p>Let <var title="">input</var> be the string being

     parsed.</p></li>

     <li><p>Let <var title="">position</var> be a pointer into <var

     title="">input</var>, initially pointing at the start of the

     string.</p></li>

     <li><p>Let <var title="">numbers</var> be an initially empty list

     of integers. This list will be the result of this

     algorithm.</p></li>

     <li><p>If there is a character in the string <var

     title="">input</var> at position <var title="">position</var>, and

     it is either a U+0020 SPACE, U+002C COMMA, or U+003B SEMICOLON

     character, then advance <var title="">position</var> to the next

     character in <var title="">input</var>, or to beyond the end of the

     string if there are no more characters.</p></li>

     <li><p>If <var title="">position</var> points to beyond the end of

     <var title="">input</var>, return <var title="">numbers</var> and

     abort.</p></li>

     <li><p>If the character in the string <var title="">input</var> at

     position <var title="">position</var> is a U+0020 SPACE, U+002C

     COMMA, or U+003B SEMICOLON character, then return to step 4.</p></li>

     <li><p>Let <var title="">negated</var> be false.</p></li>

     <li><p>Let <var title="">value</var> be 0.</p></li>

     <li><p>Let <var title="">started</var> be false. This variable is

     set to true when the parser sees a number or a "<code

     title="">-</code>" character.</p></li>

     <li><p>Let <var title="">got number</var> be false. This variable

     is set to true when the parser sees a number.</p></li>

     <li><p>Let <var title="">finished</var> be false. This variable is

     set to true to switch the parser into a mode where it ignores

     characters until the next separator.</p></li>

     <li><p>Let <var title="">bogus</var> be false.</p></li>

     <li><p><em>Parser:</em> If the character in the string <var

     title="">input</var> at position <var title="">position</var>

     is:</p>

      <dl class="switch">

       <dt>A U+002D HYPHEN-MINUS character</dt>

       <dd>

        <p>Follow these substeps:</p>

        <ol>

         <li>If <var title="">got number</var> is true, let <var

         title="">finished</var> be true.</li>

         <li>If <var title="">finished</var> is true, skip to the next

         step in the overall set of steps.</li>

         <li>If <var title="">started</var> is true, let <var

         title="">negated</var> be false.</li>

         <li>Otherwise, if <var title="">started</var> is false and if <var

         title="">bogus</var> is false, let <var title="">negated</var>

         be true.</li>

         <li>Let <var title="">started</var> be true.</li>

        </ol>

       </dd>

       <dt>A character in the range U+0030 DIGIT ZERO .. U+0039 DIGIT

       NINE</dt>

       <dd>

        <p>Follow these substeps:</p>

        <ol>

         <li>If <var title="">finished</var> is true, skip to the next

         step in the overall set of steps.</li>

         <li>Multiply <var title="">value</var> by ten.</li>

         <li>Add the value of the digit, interpreted in base ten, to

         <var title="">value</var>.</li>

         <li>Let <var title="">started</var> be true.</li>

         <li>Let <var title="">got number</var> be true.</li>

        </ol>

       </dd>

       <dt>A U+0020 SPACE character</dt>

       <dt>A U+002C COMMA character</dt>

       <dt>A U+003B SEMICOLON character</dt>

       <dd>

        <p>Follow these substeps:</p>

        <ol>

         <li>If <var title="">got number</var> is false, return the <var

         title="">numbers</var> list and abort. This happens if an entry

         in the list has no digits, as in "<code

         title="">1,2,x,4</code>".</li>

         <li>If <var title="">negated</var> is true, then negate <var

         title="">value</var>.</li>

         <li>Append <var title="">value</var> to the <var

         title="">numbers</var> list.</li>

         <li>Jump to step 4 in the overall set of steps.</li>

        </ol>

       </dd>

       <dt>A U+002E FULL STOP character</dt>

   <!--

   Test: http://www.hixie.ch/tests/adhoc/html/flow/image-maps/004-demo.html

   IE6 on Wine treats the following characters like this also: U+1-U+1f,

   U+21-U+2b, U+2d-U+2f, U+3a, U+3c-U+40, U+5b-U+60, U+7b-U+82,

   U+84-U+89, U+8b, U+8d, U+8f-U+99, U+9b, U+9d, U+a0-U+bf, U+d7, U+f7,

   U+1f6-U+1f9, U+218-U+24f, U+2a9-U+385, U+387, U+38b, U+38d, U+3a2,

   U+3cf, U+3d7-U+3d9, U+3db, U+3dd, U+3df, U+3e1, U+3f4-U+400, U+40d,

   U+450, U+45d, U+482-U+48f, U+4c5-U+4c6, U+4c9-U+4ca, U+4cd-U+4cf,

   U+4ec-U+4ed, U+4f6-U+4f7, U+4fa-U+530, U+557-U+560, U+588-U+5cf,

   U+5eb-U+5ef, U+5f3-U+620, U+63b-U+640, U+64b-U+670, U+6b8-U+6b9,

   U+6bf, U+6cf, U+6d4, U+6d6-U+904, U+93a-U+957, U+962-U+984,

   U+98d-U+98e, U+991-U+992, U+9a9, U+9b1, U+9b3-U+9b5, U+9ba-U+9db,

   U+9de, U+9e2-U+9ef, U+9f2-U+a04, U+a0b-U+a0e, U+a11-U+a12, U+a29,

   U+a31, U+a34, U+a37, U+a3a-U+a58, U+a5d, U+a5f-U+a84, U+a8c, U+a8e,

   U+a92, U+aa9, U+ab1, U+ab4, U+aba-U+adf, U+ae1-U+b04, U+b0d-U+b0e,

   U+b11-U+b12, U+b29, U+b31, U+b34-U+b35, U+b3a-U+b5b, U+b5e,

   U+b62-U+b84, U+b8b-U+b8d, U+b91, U+b96-U+b98, U+b9b, U+b9d,

   U+ba0-U+ba2, U+ba5-U+ba7, U+bab-U+bad, U+bb6, U+bba-U+c04, U+c0d,

   U+c11, U+c29, U+c34, U+c3a-U+c5f, U+c62-U+c84, U+c8d, U+c91, U+ca9,

   U+cb4, U+cba-U+cdd, U+cdf, U+ce2-U+d04, U+d0d, U+d11, U+d29,

   U+d3a-U+d5f, U+d62-U+e00, U+e2f, U+e31, U+e34-U+e3f, U+e46-U+e80,

   U+e83, U+e85-U+e86, U+e89, U+e8b-U+e8c, U+e8e-U+e93, U+e98, U+ea0,

   U+ea4, U+ea6, U+ea8-U+ea9, U+eac, U+eaf-U+edb, U+ede-U+109f,

   U+10c6-U+10cf, U+10f7-U+10ff, U+115a-U+115e, U+11a3-U+11a7,

   U+11fa-U+1dff, U+1e9b-U+1e9f, U+1efa-U+1eff, U+1f16-U+1f17,

   U+1f1e-U+1f1f, U+1f46-U+1f47, U+1f4e-U+1f4f, U+1f58, U+1f5a, U+1f5c,

   U+1f5e, U+1f7e-U+1f7f, U+1fb5, U+1fbd-U+1fc1, U+1fc5, U+1fcd-U+1fcf,

   U+1fd4-U+1fd5, U+1fdc-U+1fdf, U+1fed-U+1ff1, U+1ff5, U+1ffd-U+249b,

   U+24ea-U+3004, U+3006-U+3040, U+3095-U+309a, U+309f-U+30a0, U+30fb,

   U+30ff-U+3104, U+312d-U+3130, U+318f-U+4dff, U+9fa6-U+abff,

   U+d7a4-U+d7ff, U+e000-U+f8ff, U+fa2e-U+faff, U+fb07-U+fb12,

   U+fb18-U+fb1e, U+fb37, U+fb3d, U+fb3f, U+fb42, U+fb45, U+fbb2-U+fbd2,

   U+fbe9, U+fce1, U+fd3e-U+fd4f, U+fd90-U+fd91, U+fdc8-U+fdef,

   U+fdfc-U+fe7f, U+fefd-U+ff20, U+ff3b-U+ff40, U+ff5b-U+ff65, U+ffa0,

   U+ffbf-U+ffc1, U+ffc8-U+ffc9, U+ffd0-U+ffd1, U+ffd8-U+ffd9,

   U+ffdd-U+ffff

   IE7 on Win2003 treats the following characters like this also

   instead: U+1-U+1f, U+21-U+2b, U+2d-U+2f, U+3a, U+3c-U+40, U+5b-U+60,

   U+7b-U+82, U+84-U+89, U+8b, U+8d, U+8f-U+99, U+9b, U+9d, U+a0-U+a9,

   U+ab-U+b4, U+b6-U+b9, U+bb-U+bf, U+d7, U+f7, U+220-U+221,

   U+234-U+24f, U+2ae-U+2af, U+2b9-U+2ba, U+2c2-U+2df, U+2e5-U+2ed,

   U+2ef-U+344, U+346-U+379, U+37b-U+385, U+387, U+38b, U+38d, U+3a2,

   U+3cf, U+3d8-U+3d9, U+3f4-U+3ff, U+482-U+48b, U+4c5-U+4c6,

   U+4c9-U+4ca, U+4cd-U+4cf, U+4f6-U+4f7, U+4fa-U+530, U+557-U+558,

   U+55a-U+560, U+588-U+5cf, U+5eb-U+5ef, U+5f3-U+620, U+63b-U+640,

   U+656-U+66f, U+6d4, U+6dd-U+6e0, U+6e9-U+6ec, U+6ee-U+6f9,

   U+6fd-U+70f, U+72d-U+72f, U+740-U+77f, U+7b1-U+900, U+904,

   U+93a-U+93c, U+94d - U+94f, U+951-U+957, U+964-U+980, U+984,

   U+98d-U+98e, U+991-U+992, U+9a9, U+9b1, U+9b3-U+9b5, U+9ba-U+9bd,

   U+9c5-U+9c6, U+9c9-U+9ca, U+9cd-U+9d6, U+9d8-U+9db, U+9de,

   U+9e4-U+9ef, U+9f2-U+a01, U+a03-U+a04, U+a0b-U+a0e, U+a11-U+a12,

   U+a29, U+a31, U+a34, U+a37, U+a3a-U+a3d, U+a43-U+a46, U+a49-U+a4a,

   U+a4d-U+a58, U+a5d, U+a5f-U+a6f, U+a75-U+a80, U+a84, U+a8c, U+a8e,

   U+a92, U+aa9, U+ab1, U+ab4, U+aba-U+abc, U+ac6, U+aca, U+acd-U+acf,

   U+ad1-U+adf, U+ae1-U+b00, U+b04, U+b0d-U+b0e, U+b11-U+b12, U+b29,

   U+b31, U+b34-U+b35, U+b3a-U+b3c, U+b44-U+b46, U+b49 - U+b4a,

   U+b4d-U+b55, U+b58-U+b5b, U+b5e, U+b62-U+b81, U+b84, U+b8b-U+b8d,

   U+b91, U+b96-U+b98, U+b9b, U+b9d, U+ba0 - U+ba2, U+ba5-U+ba7,

   U+bab-U+bad, U+bb6, U+bba-U+bbd, U+bc3-U+bc5, U+bc9, U+bcd-U+bd6,

   U+bd8-U+c00, U+c04, U+c0d, U+c11, U+c29, U+c34, U+c3a-U+c3d, U+c45,

   U+c49, U+c4d-U+c54, U+c57-U+c5f, U+c62-U+c81, U+c84, U+c8d, U+c91,

   U+ca9, U+cb4, U+cba-U+cbd, U+cc5, U+cc9, U+ccd-U+cd4, U+cd7-U+cdd,

   U+cdf, U+ce2-U+d01, U+d04, U+d0d, U+d11, U+d29, U+d3a-U+d3d,

   U+d44-U+d45, U+d49, U+d4d-U+d56, U+d58-U+d5f, U+d62-U+d81, U+d84,

   U+d97-U+d99, U+db2, U+dbc, U+dbe - U+dbf, U+dc7-U+dce, U+dd5, U+dd7,

   U+de0-U+df1, U+df4-U+e00, U+e3b-U+e3f, U+e4f-U+e80, U+e83,

   U+e85-U+e86, U+e89, U+e8b-U+e8c, U+e8e-U+e93, U+e98, U+ea0, U+ea4,

   U+ea6, U+ea8-U+ea9, U+eac, U+eba, U+ebe-U+ebf, U+ec5-U+ecc,

   U+ece-U+edb, U+ede-U+eff, U+f01-U+f3f, U+f48, U+f6b-U+f70,

   U+f82-U+f87, U+f8c-U+f8f, U+f98, U+fbd-U+fff, U+1022, U+1028, U+102b,

   U+1033-U+1035, U+1037, U+1039-U+104f, U+105a-U+109f, U+10c6-U+10cf,

   U+10f7-U+10ff, U+115a - U+115e, U+11a3-U+11a7, U+11fa-U+11ff, U+1207,

   U+1247, U+1249, U+124e-U+124f, U+1257, U+1259, U+125e-U+125f, U+1287,

   U+1289, U+128e-U+128f, U+12af, U+12b1, U+12b6-U+12b7, U+12bf, U+12c1,

   U+12c6-U+12c7, U+12cf, U+12d7, U+12ef, U+130f, U+1311, U+1316-U+1317,

   U+131f, U+1347, U+135b-U+139f, U+13f5-U+1400, U+166d-U+166e,

   U+1677-U+1680, U+169b - U+169f, U+16eb-U+177f, U+17c9-U+181f, U+1843,

   U+1878-U+187f, U+18aa-U+1dff, U+1e9c-U+1e9f, U+1efa-U+1eff,

   U+1f16-U+1f17, U+1f1e-U+1f1f, U+1f46-U+1f47, U+1f4e-U+1f4f, U+1f58,

   U+1f5a, U+1f5c, U+1f5e, U+1f7e-U+1f7f, U+1fb5, U+1fbd, U+1fbf-U+1fc1,

   U+1fc5, U+1fcd-U+1fcf, U+1fd4-U+1fd5, U+1fdc-U+1fdf, U+1fed-U+1ff1,

   U+1ff5, U+1ffd-U+207e, U+2080-U+2101, U+2103-U+2106, U+2108-U+2109,

   U+2114, U+2116-U+2118, U+211e-U+2123, U+2125, U+2127, U+2129, U+212e,

   U+2132, U+213a-U+215f, U+2184-U+3005, U+3008-U+3020, U+302a-U+3037,

   U+303b-U+3104, U+312d-U+3130, U+318f - U+319f, U+31b8-U+33ff,

   U+4db6-U+4dff, U+9fa6-U+9fff, U+a48d-U+abff, U+d7a4-U+d7ff,

   U+e000-U+f8ff, U+fa2e-U+faff, U+fb07-U+fb12, U+fb18-U+fb1c, U+fb1e,

   U+fb29, U+fb37, U+fb3d, U+fb3f, U+fb42, U+fb45, U+fbb2-U+fbd2,

   U+fd3e-U+fd4f, U+fd90-U+fd91, U+fdc8-U+fdef, U+fdfc-U+fe6f, U+fe73,

   U+fe75, U+fefd-U+ff20, U+ff3b-U+ff40, U+ff5b-U+ff9f, U+ffbf-U+ffc1,

   U+ffc8-U+ffc9, U+ffd0-U+ffd1, U+ffd8-U+ffd9, U+ffdd-U+ffff

-->

       <dd>

        <p>Follow these substeps:</p>

        <ol>

         <li>If <var title="">got number</var> is true, let <var

         title="">finished</var> be true.</li>

         <li>If <var title="">finished</var> is true, skip to the next

         step in the overall set of steps.</li>

         <li>Let <var title="">negated</var> be false.</li>

        </ol>

       </dd>

       <dt>Any other character</dt>

       <dd>

        <p>Follow these substeps:</p>

        <ol>

         <li>If <var title="">finished</var> is true, skip to the next

         step in the overall set of steps.</li>

         <li>Let <var title="">negated</var> be false.</li>

         <li>Let <var title="">bogus</var> be true.</li>

         <li>If <var title="">started</var> is true, then return the

         <var title="">numbers</var> list, and abort. (The value in <var

         title="">value</var> is not appended to the list first; it is

         dropped.)</li>

        </ol>

       </dd>

      </dl>

     </li>

     <li><p>Advance <var title="">position</var> to the next character

     in <var title="">input</var>, or to beyond the end of the string if

     there are no more characters.</p></li>

     <li><p>If <var title="">position</var> points to a character (and

     not to beyond the end of <var title="">input</var>), jump to the

     big <em>Parser</em> step above.</p></li>

     <li><p>If <var title="">negated</var> is true, then negate <var

     title="">value</var>.</p></li>

     <li><p>If <var title="">got number</var> is true, then append <var

     title="">value</var> to the <var title="">numbers</var> list.</p></li>

     <li><p>Return the <var title="">numbers</var> list and

     abort.</p></li>

    </ol>
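    <div class="example">

     <p>The steps above can be sketched in Python roughly as follows.
     This is an illustrative reading of the algorithm, not a reference
     implementation; the function name
     <code>parse_list_of_integers</code> and the constants are
     hypothetical, and the jumps between numbered steps are expressed as
     nested loops.</p>

```python
SEPARATORS = " ,;"     # U+0020 SPACE, U+002C COMMA, U+003B SEMICOLON
DIGITS = "0123456789"  # U+0030 DIGIT ZERO .. U+0039 DIGIT NINE only


def parse_list_of_integers(text):
    numbers = []
    position = 0
    while True:
        # Steps 4-6: skip any run of separators; stop at the end of the input.
        while position < len(text) and text[position] in SEPARATORS:
            position += 1
        if position >= len(text):
            return numbers

        # Steps 7-12: reset the per-entry state.
        negated = started = got_number = finished = bogus = False
        value = 0

        # Steps 13-15: the "Parser" step, applied one character at a time.
        at_separator = False
        while position < len(text):
            ch = text[position]
            if ch in SEPARATORS:
                if not got_number:
                    return numbers  # the entry had no digits
                numbers.append(-value if negated else value)
                at_separator = True
                break  # jump back to step 4
            if ch in DIGITS:
                if not finished:
                    value = value * 10 + int(ch)
                    started = got_number = True
            elif ch == "-":
                if got_number:
                    finished = True
                if not finished:
                    if started:
                        negated = False
                    elif not bogus:
                        negated = True
                    started = True
            elif ch == ".":
                if got_number:
                    finished = True
                if not finished:
                    negated = False
            elif not finished:
                negated = False
                bogus = True
                if started:
                    return numbers  # the pending value is dropped
            position += 1

        if at_separator:
            continue
        # Steps 16-18: the input ended while an entry was being read.
        if got_number:
            numbers.append(-value if negated else value)
        return numbers
```

     <p>With this sketch, "<code title="">1,2,x,4</code>" yields the list
     1, 2, and "<code title="">1.5,2</code>" yields 1, 2 because
     everything after the full stop is ignored until the next
     separator.</p>

    </div>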

    <h4>Dates and times</h4>

    <p>In the algorithms below, the <dfn>number of days in month <var

    title="">month</var> of year <var title="">year</var></dfn> is:

    <em>31</em> if <var title="">month</var> is 1, 3, 5, 7, 8, 10, or

    12; <em>30</em> if <var title="">month</var> is 4, 6, 9, or 11;

    <em>29</em> if <var title="">month</var> is 2 and <var

    title="">year</var> is a number divisible by 400, or if <var

    title="">year</var> is a number divisible by 4 but not by 100; and

    <em>28</em> otherwise. This takes into account leap years in the

    Gregorian calendar. <a

    href="#refsGREGORIAN">[GREGORIAN]</a></p>
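    <div class="example">

     <p>The rule above can be written as a small function. This is a
     minimal sketch; the name <code>days_in_month</code> is
     hypothetical, and <var title="">month</var> is assumed to be an
     integer (values outside 1 to 12 fall through to 28, as the
     definition's "otherwise" clause implies).</p>

```python
def days_in_month(month, year):
    """Number of days in the given month of the given Gregorian year."""
    if month in (1, 3, 5, 7, 8, 10, 12):
        return 31
    if month in (4, 6, 9, 11):
        return 30
    # Leap years: divisible by 400, or divisible by 4 but not by 100.
    is_leap = year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)
    if month == 2 and is_leap:
        return 29
    return 28
```

    </div>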

    <h5>Specific moments in time</h5>

    <p>A string is a <dfn>valid datetime</dfn> if it has four digits

    (representing the year), a literal hyphen, two digits (representing

    the month), a literal hyphen, two digits (representing the day),

    optionally some spaces, either a literal T or a space, optionally

    some more spaces, two digits (for the hour), a colon, two digits

    (the minutes), optionally the seconds (which, if included, must

    consist of another colon, two digits (the integer part of the

    seconds), and optionally a decimal point followed by one or more

    digits (for the fractional part of the seconds)), optionally some

    spaces, and finally either a literal Z (indicating the time zone is

    UTC), or, a plus sign or a minus sign followed by two digits, a

    colon, and two digits (for the sign, the hours and minutes of the

    timezone offset respectively); with the month-day combination being

    a valid date in the given year according to the Gregorian calendar,

    the hour values (<var title="">h</var>) being in the range

    0&nbsp;&le;&nbsp;<var title="">h</var>&nbsp;&le;&nbsp;23, the minute

    values (<var title="">m</var>) in the range 0&nbsp;&le;&nbsp;<var

    title="">m</var>&nbsp;&le;&nbsp;59, and the second value (<var

    title="">s</var>) being in the range 0&nbsp;&le;&nbsp;<var

    title="">s</var>&nbsp;&lt;&nbsp;60. <a

    href="#refsGREGORIAN">[GREGORIAN]</a></p>

    <!--XXX [GREGORIAN] should point to

    <dd id="refsGREGORIAN">[GREGORIAN]</dd>

    <dd>(Non-normative) <cite>Inter Gravissimas</cite>, A. Lilius, C. Clavius. Gregory XIII Papal Bulls, February 1582.</dd>

-->

    <p>The digits must be characters in the range U+0030 DIGIT ZERO (0)

    to U+0039 DIGIT NINE (9), the hyphens must be U+002D HYPHEN-MINUS

    characters, the T must be a U+0054 LATIN CAPITAL LETTER T, the

    colons must be U+003A COLON characters, the decimal point must be a

    U+002E FULL STOP, the Z must be a U+005A LATIN CAPITAL LETTER Z, the

    plus sign must be a U+002B PLUS SIGN, and the minus sign must be a

    U+002D HYPHEN-MINUS (the same character as the hyphen).</p>

    <div class="example">

     <p>The following are some examples of dates written as <span

     title="valid datetime">valid datetimes</span>.</p>

     <dl>

      <dt>"<code>0037-12-13 00:00 Z</code>"</dt>

      <dd>Midnight UTC on the birthday of Nero (the Roman Emperor).</dd>

      <dt>"<code>1979-10-14T12:00:00.001-04:00</code>"</dt>

      <dd>One millisecond after noon on October 14th 1979, in the time

      zone in use on the east coast of North America during daylight

      saving time.</dd>

      <dt>"<code>8592-01-01 T 02:09 +02:09</code>"</dt>

      <dd>Midnight UTC on the 1st of January, 8592. The time zone

      associated with that time is two hours and nine minutes ahead of

      UTC.</dd>

     </dl>

     <p>Several things are notable about these dates:</p>

     <ul>

      <li>Years with fewer than four digits have to be

      zero-padded. The date "37-12-13" would not be a valid date.</li>

      <li>To unambiguously identify a moment in time prior to the

      introduction of the Gregorian calendar, the date has to be first

      converted to the Gregorian calendar from the calendar in use at

      the time (e.g. from the Julian calendar). The date of Nero's

      birth is the 15th of December 37, in the Julian Calendar, which

      is the 13th of December 37 in the Gregorian Calendar.</li> <!--

      XXX this might not be true. I can't find a reference that gives

      his birthday with an explicit statement about the calendar being

      used. However, it seems unlikely that it would be given in the

      Gregorian calendar, so I assume sites use the Julian one. -->

      <li>The time and timezone components are not optional.</li>

      <li>Dates before the year 0 or after the year 9999 can't be

      represented as a datetime in this version of HTML.</li>

      <li>Time zones differ based on daylight saving time.</li>

     </ul>

    </div>

    <p class="note">Conformance checkers can use the algorithm below to

    determine if a datetime is a valid datetime or not.</p>

    <p>To <dfn id="datetime-parser">parse a string as a datetime

    value</dfn>, a user agent must apply the following algorithm to the

    string. This will either return a time in UTC, with associated

    timezone information for round tripping or display purposes, or

    nothing, indicating the value is not a <span>valid

    datetime</span>. If at any point the algorithm says that it "fails",

    this means that it returns nothing.</p>

    <ol>

     <li><p>Let <var title="">input</var> be the string being

     parsed.</p></li>

     <li><p>Let <var title="">position</var> be a pointer into <var

     title="">input</var>, initially pointing at the start of the

     string.</p></li>

     <li><p><span>Collect a sequence of characters</span> in the range

     U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

     sequence is not exactly four characters long, then fail. Otherwise,

     interpret the resulting sequence as a base-ten integer. Let that

     number be the <var title="">year</var>.</p></li>

     <li><p>If <var title="">position</var> is beyond the end of <var

     title="">input</var> or if the character at <var

     title="">position</var> is not a U+002D HYPHEN-MINUS character,

     then fail. Otherwise, move <var title="">position</var> forwards

     one character.</p></li>

     <li><p><span>Collect a sequence of characters</span> in the range

     U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

     sequence is not exactly two characters long, then fail. Otherwise,

     interpret the resulting sequence as a base-ten integer. Let that

     number be the <var title="">month</var>.</p></li>

     <li>If <var title="">month</var> is not a number in the range

     1&nbsp;&le;&nbsp;<var title="">month</var>&nbsp;&le;&nbsp;12, then fail.</li>

     <li><p>Let <var title="">maxday</var> be the <span>number of days

     in month <var title="">month</var> of year <var

     title="">year</var></span>.</p></li>

     <li><p>If <var title="">position</var> is beyond the end of <var

     title="">input</var> or if the character at <var

     title="">position</var> is not a U+002D HYPHEN-MINUS character,

     then fail. Otherwise, move <var title="">position</var> forwards

     one character.</p></li>

     <li><p><span>Collect a sequence of characters</span> in the range

     U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

     sequence is not exactly two characters long, then fail. Otherwise,

     interpret the resulting sequence as a base-ten integer. Let that

     number be the <var title="">day</var>.</p></li>

     <li><p>If <var title="">day</var> is not a number in the range

     1&nbsp;&le;&nbsp;<var title="">day</var>&nbsp;&le;&nbsp;<var

     title="">maxday</var>, then fail.</p></li>

     <li><p><span>Collect a sequence of characters</span> that are

     either U+0054 LATIN CAPITAL LETTER T characters or <span

     title="space character">space characters</span>. If the collected

     sequence is zero characters long, or if it contains more than one

     U+0054 LATIN CAPITAL LETTER T character, then fail.</p></li>

     <li><p><span>Collect a sequence of characters</span> in the range

     U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

     sequence is not exactly two characters long, then fail. Otherwise,

     interpret the resulting sequence as a base-ten integer. Let that

     number be the <var title="">hour</var>.</p></li>

     <li>If <var title="">hour</var> is not a number in the range

     0&nbsp;&le;&nbsp;<var title="">hour</var>&nbsp;&le;&nbsp;23, then fail.</li>

     <li><p>If <var title="">position</var> is beyond the end of <var

     title="">input</var> or if the character at <var

     title="">position</var> is not a U+003A COLON character,

     then fail. Otherwise, move <var title="">position</var> forwards

     one character.</p></li>

     <li><p><span>Collect a sequence of characters</span> in the range

     U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

     sequence is not exactly two characters long, then fail. Otherwise,

     interpret the resulting sequence as a base-ten integer. Let that

     number be the <var title="">minute</var>.</p></li>

     <li>If <var title="">minute</var> is not a number in the range

     0&nbsp;&le;&nbsp;<var title="">minute</var>&nbsp;&le;&nbsp;59, then fail.</li>

     <li><p>Let <var title="">second</var> be a string with the value

     "0".</p></li>

     <li><p>If <var title="">position</var> is beyond the end of <var

     title="">input</var>, then fail.</p></li>

     <li><p>If the character at <var title="">position</var> is a U+003A

     COLON, then:</p>

      <ol>

       <li><p>Advance <var title="">position</var> to the next character

       in <var title="">input</var>.</p></li>

       <li><p>If <var title="">position</var> is beyond the end of <var

       title="">input</var>, or at the last character in <var

       title="">input</var>, or if the next <em>two</em> characters in

       <var title="">input</var> starting at <var

       title="">position</var> are not two characters both in the range

       U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), then

       fail.</p></li>

       <li><p><span>Collect a sequence of characters</span> that are

       either characters in the range U+0030 DIGIT ZERO (0) to U+0039

       DIGIT NINE (9) or U+002E FULL STOP characters. If the collected

       sequence has more than one U+002E FULL STOP character, or if the

       last character in the sequence is a U+002E FULL STOP character,

       then fail. Otherwise, let the collected string be <var

       title="">second</var> instead of its previous value.</p></li>

      </ol>

     </li>

     <li><p>Interpret <var title="">second</var> as a base-ten number

     (possibly with a fractional part). Let that number be <var

     title="">second</var> instead of the string version.</p></li>

     <li>If <var title="">second</var> is not a number in the range

     0&nbsp;&le;&nbsp;<var title="">second</var>&nbsp;&lt;&nbsp;60, then

     fail. (The values 60 and 61 are not allowed: leap seconds cannot be

     represented by datetime values.)</li>

     <li><p>If <var title="">position</var> is beyond the end of <var

     title="">input</var>, then fail.</p></li>

     <li><p><span>Skip whitespace</span>.</p></li>

     <li><p>If the character at <var title="">position</var> is a

     U+005A LATIN CAPITAL LETTER Z, then:</p>

      <ol>

       <li><p>Let <var title="">timezone<sub title="">hours</sub></var> be

       0.</p></li>

       <li><p>Let <var title="">timezone<sub title="">minutes</sub></var> be

       0.</p></li>

       <li><p>Advance <var title="">position</var> to the next character

       in <var title="">input</var>.</p></li>

      </ol>

     </li>

     <li><p>Otherwise, if the character at <var title="">position</var>

     is either a U+002B PLUS SIGN ("+") or a U+002D HYPHEN-MINUS ("-"),

     then:</p>

      <ol>

       <li><p>If the character at <var title="">position</var> is a

       U+002B PLUS SIGN ("+"), let <var title="">sign</var> be

       "positive". Otherwise, it's a U+002D HYPHEN-MINUS ("-"); let <var

       title="">sign</var> be "negative".</p></li>

       <li><p>Advance <var title="">position</var> to the next character

       in <var title="">input</var>.</p></li>

       <li><p><span>Collect a sequence of characters</span> in the range

       U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

       sequence is not exactly two characters long, then

       fail. Otherwise, interpret the resulting sequence as a base-ten

       integer. Let that number be the <var

       title="">timezone<sub title="">hours</sub></var>.</p></li>

       <li>If <var title="">timezone<sub title="">hours</sub></var> is not a

       number in the range 0&nbsp;&le;&nbsp;<var

       title="">timezone<sub title="">hours</sub></var>&nbsp;&le;&nbsp;23, then

       fail.</li>

       <li>If <var title="">sign</var> is "negative", then negate <var

       title="">timezone<sub title="">hours</sub></var>.</li>

       <li><p>If <var title="">position</var> is beyond the end of <var

       title="">input</var> or if the character at <var

       title="">position</var> is not a U+003A COLON character, then

       fail. Otherwise, move <var title="">position</var> forwards one

       character.</p></li>

       <li><p><span>Collect a sequence of characters</span> in the range

       U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

       sequence is not exactly two characters long, then

       fail. Otherwise, interpret the resulting sequence as a base-ten

       integer. Let that number be the <var

       title="">timezone<sub title="">minutes</sub></var>.</p></li>

       <li>If <var title="">timezone<sub title="">minutes</sub></var> is not a

       number in the range 0&nbsp;&le;&nbsp;<var

       title="">timezone<sub title="">minutes</sub></var>&nbsp;&le;&nbsp;59, then

       fail.</li>

       <li>If <var title="">sign</var> is "negative", then negate <var

       title="">timezone<sub title="">minutes</sub></var>.</li>

      </ol>

     </li>

     <li><p>If <var title="">position</var> is <em>not</em> beyond the

     end of <var title="">input</var>, then fail.</p></li>

     <li><p>Let <var title="">time</var> be the moment in time at year

     <var title="">year</var>, month <var title="">month</var>, day <var

     title="">day</var>, hours <var title="">hour</var>, minute <var

     title="">minute</var>, second <var title="">second</var>,

     subtracting <var title="">timezone<sub title="">hours</sub></var>

     hours and <var title="">timezone<sub title="">minutes</sub></var>

     minutes. That moment in time is a moment in the UTC

     timezone.</p></li>

     <li><p>Let <var title="">timezone</var> be <var

     title="">timezone<sub title="">hours</sub></var> hours and <var

     title="">timezone<sub title="">minutes</sub></var> minutes from

     UTC.</p></li>

     <li><p>Return <var title="">time</var> and <var

     title="">timezone</var>.</p></li>

    </ol>
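    <div class="example">

     <p>The algorithm above can be sketched in Python along these lines.
     This is an illustration under stated assumptions rather than a
     definitive implementation. The names <code>parse_datetime</code>,
     <code>collect</code>, and <code>SPACE_CHARACTERS</code> are
     hypothetical, and the set of space characters is assumed, since it
     is defined elsewhere in this specification. The sketch returns the
     parsed local fields together with the offset in minutes, leaving the
     final subtraction to UTC to the caller. It also fails when no time
     zone follows the optional whitespace, which is an assumption the
     steps do not spell out.</p>

```python
from decimal import Decimal

DIGITS = "0123456789"
SPACE_CHARACTERS = " \t\n\f\r"  # assumption: defined elsewhere in the spec


def days_in_month(month, year):
    if month in (1, 3, 5, 7, 8, 10, 12):
        return 31
    if month in (4, 6, 9, 11):
        return 30
    leap = year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)
    return 29 if month == 2 and leap else 28


def collect(text, position, allowed):
    """Collect a run of characters from `allowed`; return (run, position)."""
    start = position
    while position < len(text) and text[position] in allowed:
        position += 1
    return text[start:position], position


def parse_datetime(text):
    """Return (year, month, day, hour, minute, second, offset) or None."""
    pos = 0

    def number(width, low, high):
        # Collect exactly `width` digits and range-check the value.
        nonlocal pos
        run, pos = collect(text, pos, DIGITS)
        if len(run) != width or not low <= int(run) <= high:
            return None
        return int(run)

    def literal(ch):
        # Consume one expected character, or report failure.
        nonlocal pos
        if pos < len(text) and text[pos] == ch:
            pos += 1
            return True
        return False

    year = number(4, 0, 9999)
    if year is None or not literal("-"):
        return None
    month = number(2, 1, 12)
    if month is None or not literal("-"):
        return None
    day = number(2, 1, days_in_month(month, year))
    if day is None:
        return None
    run, pos = collect(text, pos, "T" + SPACE_CHARACTERS)
    if not run or run.count("T") > 1:
        return None
    hour = number(2, 0, 23)
    if hour is None or not literal(":"):
        return None
    minute = number(2, 0, 59)
    if minute is None or pos >= len(text):
        return None
    second = "0"
    if text[pos] == ":":
        pos += 1
        head = text[pos:pos + 2]
        if len(head) != 2 or head[0] not in DIGITS or head[1] not in DIGITS:
            return None
        second, pos = collect(text, pos, DIGITS + ".")
        if second.count(".") > 1 or second.endswith("."):
            return None
    second = Decimal(second)
    if second >= 60 or pos >= len(text):
        return None
    _, pos = collect(text, pos, SPACE_CHARACTERS)
    if pos >= len(text):
        return None  # assumption: a missing time zone is a failure
    if text[pos] == "Z":
        offset = 0
        pos += 1
    elif text[pos] in "+-":
        sign = -1 if text[pos] == "-" else 1
        pos += 1
        tz_hours = number(2, 0, 23)
        if tz_hours is None or not literal(":"):
            return None
        tz_minutes = number(2, 0, 59)
        if tz_minutes is None:
            return None
        offset = sign * (tz_hours * 60 + tz_minutes)
    else:
        return None
    if pos != len(text):
        return None
    return (year, month, day, hour, minute, second, offset)
```

     <p>For instance, this sketch accepts
     "<code title="">0037-12-13 00:00 Z</code>" and rejects
     "<code title="">37-12-13 00:00 Z</code>", whose year is not
     zero-padded to four digits.</p>

    </div>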

    <h5>Vaguer moments in time</h5>

    <p>This section defines <dfn title="date or time string">date or

    time strings</dfn>. There are two kinds, <dfn title="date or time

    string in content">date or time strings in content</dfn>, and <dfn

    title="date or time string in attributes">date or time strings in

    attributes</dfn>. The only difference is in the handling of

    whitespace characters.</p>

    <p>To parse a <span>date or time string</span>, user agents must use

    the following algorithm. A <span>date or time string</span> is a

    <em>valid</em> date or time string if the following algorithm, when

    run on the string, doesn't say the string is invalid.</p>

    <p>The algorithm may return nothing (in which case the string will

    be invalid), or it may return a date, a time, a date and a time, or

    a date and a time and a timezone. Even if the algorithm returns

    one or more values, the string can still be invalid.</p>

    <ol>

     <!-- INIT -->

     <li><p>Let <var title="">input</var> be the string being

     parsed.</p></li>

     <li><p>Let <var title="">position</var> be a pointer into <var

     title="">input</var>, initially pointing at the start of the

     string.</p></li>

     <li><p>Let <var title="">results</var> be the collection of results

     that are to be returned (one or more of a date, a time, and a

     timezone), initially empty. If the algorithm aborts at any point,

     then whatever is currently in <var title="">results</var> must be

     returned as the result of the algorithm.</p></li>

     <!-- LEADING WHITESPACE -->

     <li><p>For the "in content" variant: <span>skip Zs

     characters</span>; for the "in attributes" variant: <span>skip

     whitespace</span>.</p></li><!-- XXX skip whitespace in attribute?

     really? -->

     <!-- YEAR or HOUR -->

     <li><p><span>Collect a sequence of characters</span> in the range

     U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

     sequence is empty, then the string is invalid; abort these

     steps.</p></li>

     <li><p>Let the sequence of characters collected in the last step be

     <var title="">s</var>.</p></li>

     <li><p>If <var title="">position</var> is past the end of <var

     title="">input</var>, the string is invalid; abort these

     steps.</p></li>

     <li><p>If the character at <var title="">position</var> is

     <em>not</em> a U+003A COLON character, then:</p>

      <!-- DATE -->

      <ol>

       <li><p>If the character at <var title="">position</var> is not a

       U+002D HYPHEN-MINUS ("-") character either, then the string is

       invalid; abort these steps.</p></li>

       <!-- YEAR -->

       <li><p>If the sequence <var title="">s</var> is not exactly four

       digits long, then the string is invalid. (This does not stop the

       algorithm, however.)</p></li>

       <li><p>Interpret the sequence of characters collected in step 5 as

       a base-ten integer, and let that number be <var

       title="">year</var>.</p></li>

       <li><p>Advance <var title="">position</var> past the U+002D

       HYPHEN-MINUS ("-") character.</p></li>

       <!-- MONTH -->

       <li><p><span>Collect a sequence of characters</span> in the range

       U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

       sequence is empty, then the string is invalid; abort these

       steps.</p></li>

       <li><p>If the sequence collected in the last step is not exactly

       two digits long, then the string is invalid.</p></li>

       <li><p>Interpret the sequence of characters collected two steps ago

       as a base-ten integer, and let that number be <var

       title="">month</var>.</p></li>

       <li>If <var title="">month</var> is not a number in the range

       1&nbsp;&le;&nbsp;<var title="">month</var>&nbsp;&le;&nbsp;12, then

       the string is invalid; abort these steps.</li>

       <li><p>Let <var title="">maxday</var> be the <span>number of days

       in month <var title="">month</var> of year <var

       title="">year</var></span>.</p></li>

       <li><p>If <var title="">position</var> is past the end of <var

       title="">input</var>, or if the character at <var

       title="">position</var> is <em>not</em> a U+002D HYPHEN-MINUS ("-")

       character, then the string is invalid; abort these

       steps. Otherwise, advance <var title="">position</var> to the next

       character.</p></li>

       <!-- DAY -->

       <li><p><span>Collect a sequence of characters</span> in the range

       U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

       sequence is empty, then the string is invalid; abort these

       steps.</p></li>

       <li><p>If the sequence collected in the last step is not exactly

       two digits long, then the string is invalid.</p></li>

       <li><p>Interpret the sequence of characters collected two steps

       ago as a base-ten integer, and let that number be <var

       title="">day</var>.</p></li>

       <li><p>If <var title="">day</var> is not a number in the range

       1&nbsp;&le;&nbsp;<var title="">day</var>&nbsp;&le;&nbsp;<var

       title="">maxday</var>, then the string is invalid; abort these

       steps.</p></li>

       <li><p>Add the date represented by <var title="">year</var>, <var

       title="">month</var>, and <var title="">day</var> to the <var

       title="">results</var>.</p></li>

       <!-- XXX we should allow the algorithm to abort here without

       error, with just a date. -->

       <!-- WHITESPACE -->

       <li><p>For the "in content" variant: <span>skip Zs

       characters</span>; for the "in attributes" variant: <span>skip

       whitespace</span>.</p></li>

       <li><p>If the character at <var title="">position</var> is a U+0054

       LATIN CAPITAL LETTER T, then move <var title="">position</var>

       forwards one character.</p></li>

       <li><p>For the "in content" variant: <span>skip Zs

       characters</span>; for the "in attributes" variant: <span>skip

       whitespace</span>.</p></li>

       <!-- at this point, if <var title="">position</var> points to a

       number, we know that we passed at least one space or a T, because

       otherwise the number would have been slurped up in the last

       "collect" step. -->

       <!-- HOUR -->

       <li><p><span>Collect a sequence of characters</span> in the range

       U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

       sequence is empty, then the string is invalid; abort these

       steps.</p></li>

       <li><p>Let <var title="">s</var> be the sequence of characters

       collected in the last step.</p></li>

      </ol>

     </li>

     <!-- TIME -->

     <li><p>If <var title="">s</var> is not exactly two digits long,

     then the string is invalid; abort these steps.</p></li>

     <li><p>Interpret <var title="">s</var> as a base-ten integer, and

     let that number be <var title="">hour</var>.</p></li>

     <li><p>If <var title="">hour</var> is not a number in the range

     0&nbsp;&le;&nbsp;<var title="">hour</var>&nbsp;&le;&nbsp;23, then

     the string is invalid; abort these steps.</p></li>

     <li><p>If <var title="">position</var> is past the end of <var

     title="">input</var>, or if the character at <var

     title="">position</var> is <em>not</em> a U+003A COLON character,

     then the string is invalid; abort these steps. Otherwise, advance

     <var title="">position</var> to the next character.</p></li>

     <!-- MINUTE -->

     <li><p><span>Collect a sequence of characters</span> in the range

     U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

     sequence is empty, then the string is invalid; abort these

     steps.</p></li>

     <li><p>If the sequence collected in the last step is not exactly

     two digits long, then the string is invalid; abort these steps.</p></li>

     <li><p>Interpret the sequence of characters collected two steps

     ago as a base-ten integer, and let that number be <var

     title="">minute</var>.</p></li>

     <li><p>If <var title="">minute</var> is not a number in the range

     0&nbsp;&le;&nbsp;<var title="">minute</var>&nbsp;&le;&nbsp;59, then

     the string is invalid; abort these steps.</p></li>

     <!-- SECOND -->

     <li><p>Let <var title="">second</var> be 0. It might be changed to

     another value in the next step.</p></li>

     <li><p>If <var title="">position</var> is not past the end of <var

     title="">input</var> and the character at <var

     title="">position</var> is a U+003A COLON character, then:</p>

      <ol>

       <li><p>Advance <var title="">position</var> to the next character

       in <var title="">input</var>.</p></li>

       <li><p><span>Collect a sequence of characters</span> that are

       either characters in the range U+0030 DIGIT ZERO (0) to U+0039

       DIGIT NINE (9) or are U+002E FULL STOP. If the collected sequence

       is empty, or contains more than one U+002E FULL STOP character,

       then the string is invalid; abort these steps.</p></li>

       <li><p>If the first character in the sequence collected in the

       last step is not in the range U+0030 DIGIT ZERO (0) to U+0039

       DIGIT NINE (9), then the string is invalid; abort these steps.</p></li>

       <li><p>Interpret the sequence of characters collected two steps

       ago as a base-ten number (possibly with a fractional part), and

       let that number be <var title="">second</var>.</p></li>

       <li><p>If <var title="">second</var> is not a number in the range

       0&nbsp;&le;&nbsp;<var title="">second</var>&nbsp;&lt;&nbsp;60,

       then the string is invalid; abort these steps.</p></li>

      </ol>

     </li>

     <li><p>Add the time represented by <var title="">hour</var>, <var

     title="">minute</var>, and <var title="">second</var> to the <var

     title="">results</var>.</p></li>

     <!-- TIME ZONE -->

     <li><p>If <var title="">results</var> has both a date and a time,

     then:</p>

      <ol>

       <li><p>For the "in content" variant: <span>skip Zs

       characters</span>; for the "in attributes" variant: <span>skip

       whitespace</span>.</p></li>

       <li><p>If <var title="">position</var> is past the end of <var

       title="">input</var>, then skip to the next step in the overall

       set of steps.</p></li>

       <!-- UTC -->

       <li><p>Otherwise, if the character at <var

       title="">position</var> is a U+005A LATIN CAPITAL LETTER Z,

       then:</p>

        <ol>

         <li><p>Add the timezone corresponding to UTC (zero offset) to

         the <var title="">results</var>.</p></li>

         <li><p>Advance <var title="">position</var> to the next character

         in <var title="">input</var>.</p></li>

         <li><p>Skip to the next step in the overall set of

         steps.</p></li>

        </ol>

       </li>

       <!-- EXPLICIT TIMEZONE OFFSET -->

       <li><p>Otherwise, if the character at <var

       title="">position</var> is either a U+002B PLUS SIGN ("+") or a

       U+002D HYPHEN-MINUS ("-"), then:</p>

        <ol>

         <!-- SIGN -->

         <li><p>If the character at <var title="">position</var> is a

         U+002B PLUS SIGN ("+"), let <var title="">sign</var> be

         "positive". Otherwise, it's a U+002D HYPHEN-MINUS ("-"); let

         <var title="">sign</var> be "negative".</p></li>

         <!-- HOURS -->

         <li><p>Advance <var title="">position</var> to the next

         character in <var title="">input</var>.</p></li>

         <li><p><span>Collect a sequence of characters</span> in the

         range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the

         collected sequence is not exactly two characters long, then

         the string is invalid; abort these steps.</p></li>

         <li><p>Interpret the sequence collected in the last step as a

         base-ten number, and let that number be <var

         title="">timezone<sub title="">hours</sub></var>.</p></li>

         <li><p>If <var title="">timezone<sub title="">hours</sub></var> is not a

         number in the range 0&nbsp;&le;&nbsp;<var

         title="">timezone<sub title="">hours</sub></var>&nbsp;&le;&nbsp;23, then

         the string is invalid; abort these steps.</p></li>

         <li><p>If <var title="">sign</var> is "negative", then negate <var

         title="">timezone<sub title="">hours</sub></var>.</p></li>

         <li><p>If <var title="">position</var> is beyond the end of

         <var title="">input</var> or if the character at <var

         title="">position</var> is not a U+003A COLON character, then

         the string is invalid; abort these steps. Otherwise, move <var

         title="">position</var> forwards one character.</p></li>

         <!-- MINUTES -->

         <li><p><span>Collect a sequence of characters</span> in the range

         U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected

         sequence is not exactly two characters long, then the string is invalid; abort these steps.</p></li>

         <li><p>Interpret the sequence collected in the last step as a

         base-ten number, and let that number be <var

         title="">timezone<sub title="">minutes</sub></var>.</p></li>

         <li><p>If <var title="">timezone<sub title="">minutes</sub></var> is not a

         number in the range 0&nbsp;&le;&nbsp;<var

         title="">timezone<sub title="">minutes</sub></var>&nbsp;&le;&nbsp;59,

         then the string is invalid; abort these steps.</p></li>

         <li><p>Add the timezone corresponding to an offset of <var

         title="">timezone<sub title="">hours</sub></var> hours and <var

         title="">timezone<sub title="">minutes</sub></var> minutes to the <var

         title="">results</var>.</p></li>

         <li><p>Skip to the next step in the overall set of

         steps.</p></li>

        </ol>

       </li>

       <li><p>Otherwise, the string is invalid; abort these

       steps.</p></li>

      </ol>

     </li>

     <li><p>For the "in content" variant: <span>skip Zs

     characters</span>; for the "in attributes" variant: <span>skip

     whitespace</span>.</p></li>

     <li><p>If <var title="">position</var> is <em>not</em> past the end

     of <var title="">input</var>, then the string is invalid.</p></li>

     <li><p>Abort these steps (the string is parsed).</p></li>

    </ol>
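
    <p class="note">The time and time-zone steps above can be sketched as follows. This is a non-normative illustration in Python, not part of the algorithm. The names <code title="">collect</code>, <code title="">parse_time</code> and <code title="">parse_timezone_offset</code> are hypothetical. The sketch omits whitespace skipping and the date steps, assumes the parser steps past the colon before collecting the seconds, and assumes that a negative sign applies to the whole offset, which it reports as a single number of minutes.</p>

```python
DIGITS = "0123456789"


def collect(text, pos, allowed):
    """Collect the run of characters from `allowed` that starts at `pos`."""
    start = pos
    while pos < len(text) and text[pos] in allowed:
        pos += 1
    return text[start:pos], pos


def parse_time(text, pos=0):
    """Return ((hour, minute, second), new_pos), or None if invalid."""
    # Hour: exactly two digits, in the range 0 to 23.
    s, pos = collect(text, pos, DIGITS)
    if len(s) != 2 or int(s) > 23:
        return None
    hour = int(s)
    # A colon must separate the hour from the minute.
    if pos >= len(text) or text[pos] != ":":
        return None
    pos += 1
    # Minute: exactly two digits, in the range 0 to 59.
    s, pos = collect(text, pos, DIGITS)
    if len(s) != 2 or int(s) > 59:
        return None
    minute = int(s)
    # Second: optional, defaults to 0, may carry a fractional part.
    second = 0
    if pos < len(text) and text[pos] == ":":
        pos += 1  # assumption: step past the colon before collecting
        s, pos = collect(text, pos, DIGITS + ".")
        if not s or s.count(".") > 1 or s[0] not in DIGITS:
            return None
        second = float(s)
        if not 0 <= second < 60:
            return None
    return (hour, minute, second), pos


def parse_timezone_offset(text, pos):
    """Return (offset_in_minutes, new_pos), or None if invalid."""
    if pos >= len(text):
        return None  # the caller handles "no time zone given" separately
    if text[pos] == "Z":
        return 0, pos + 1
    if text[pos] not in "+-":
        return None
    sign = -1 if text[pos] == "-" else 1
    pos += 1
    s, pos = collect(text, pos, DIGITS)
    if len(s) != 2 or int(s) > 23:
        return None
    hours = int(s)
    if pos >= len(text) or text[pos] != ":":
        return None
    pos += 1
    s, pos = collect(text, pos, DIGITS)
    if len(s) != 2 or int(s) > 59:
        return None
    # Assumption: the sign covers both the hours and the minutes.
    return sign * (hours * 60 + int(s)), pos
```

    <p class="note">Under these assumptions, <code title="">parse_time("09:05:07.5")</code> yields <code title="">((9, 5, 7.5), 10)</code>, while <code title="">parse_time("24:00")</code> and <code title="">parse_timezone_offset("+0530", 0)</code> both report an invalid string.</p>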

    <h4>Time offsets</h4>

    <p class="big-issue"><dfn>valid time offset</dfn>, <dfn>rules for

    parsing time offsets</dfn>, <dfn>time offset serialization

    rules</dfn>; in the format "5d4h3m2s1ms" or "3m 9.2s" or "00:00:00.00"

    or similar.</p>

    <h4>Tokens</h4>

    <p>A <dfn>set of space-separated tokens</dfn> is a set of zero or

    more words separated by one or more <span title="space

    character">space characters</span>, where words consist of any

    string of one or more characters, none of which are <span

    title="space character">space characters</span>.</p>

    <p>A string containing a <span>set of space-separated tokens</span>

    may have leading or trailing <span title="space character">space

    characters</span>.</p>

    <p>An <dfn>unordered set of unique space-separated tokens</dfn> is a

    <span>set of space-separated tokens</span> where none of the words

    are duplicated.</p>

    <p>An <dfn>ordered set of unique space-separated tokens</dfn> is a

    <span>set of space-separated tokens</span> where none of the words

    are duplicated but where the order of the tokens is meaningful.</p>

    <p><span title="set of space-separated tokens">Sets of

    space-separated tokens</span> sometimes have a defined set of

    allowed values. When a set of allowed values is defined, the tokens

    must all be from that list of allowed values; other values are

    non-conforming. If no such set of allowed values is provided, then

    all values are conforming.</p>
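
    <p class="note">The conformance rules above can be illustrated with a small non-normative check in Python. The function name <code title="">is_conforming_token_set</code> and its parameters are hypothetical, and the sketch assumes that the string has already been split into tokens.</p>

```python
def is_conforming_token_set(tokens, allowed=None, unique=False):
    """Check an already-split list of tokens against the rules above.

    `allowed` is the defined set of allowed values, or None when no such
    set is provided. `unique` requests the "unique" variants, in which
    no token may appear twice.
    """
    if unique and len(set(tokens)) != len(tokens):
        return False
    if allowed is not None:
        # Every token must come from the set of allowed values.
        return all(token in allowed for token in tokens)
    # With no set of allowed values, all values are conforming.
    return True
```

    <p class="note">For instance, <code title="">is_conforming_token_set(["a", "b", "a"], unique=True)</code> is false because <code title="">a</code> is duplicated.</p>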

    <p>When a user agent has to <dfn>split a string on spaces</dfn>, it

    must use the following algorithm:</p>

    <ol>

     <li><p>Let <var title="">input</var> be the string being

     parsed.</p></li>

     <li><p>Let <var title="">position</var> be a pointer into <var

     title="">input</var>, initially pointing at the start of the

     string.</p></li>

     <li><p>Let <var title="">tokens</var> be a list of tokens,

     initially empty.</p></li>

     <li><p><span>Skip whitespace</span>.</p></li>

     <li><p>While <var title="">position</var> is not past the end of

     <var title="">input</var>:</p>

      <ol>

       <li><p><span>Collect a sequence of characters</span> that are not

       <span title="space character">space characters</span>.</p></li>

       <li><p>Add the string collected in the previous step to <var

       title="">tokens</var>.</p></li>

       <li><p><span>Skip whitespace</span>.</p></li>

      </ol>

     </li>

     <li><p>Return <var title="">tokens</var>.</p></li>

    </ol>
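
    <p class="note">The splitting steps above might be sketched as follows. This is a non-normative illustration in Python. The name <code title="">split_on_spaces</code> is hypothetical, and the contents of <code title="">SPACE_CHARACTERS</code> are an assumption, since this section does not list the space characters.</p>

```python
# Assumption: space, tab, line feed, form feed and carriage return.
# Substitute the actual definition of "space character" if it differs.
SPACE_CHARACTERS = " \t\n\f\r"


def split_on_spaces(text):
    """Split `text` into a list of tokens, following the steps above."""
    position = 0
    tokens = []
    # Skip whitespace before the first token.
    while position < len(text) and text[position] in SPACE_CHARACTERS:
        position += 1
    while position < len(text):
        # Collect a sequence of characters that are not space characters.
        start = position
        while position < len(text) and text[position] not in SPACE_CHARACTERS:
            position += 1
        tokens.append(text[start:position])
        # Skip whitespace after the token.
        while position < len(text) and text[position] in SPACE_CHARACTERS:
            position += 1
    return tokens
```

    <p class="note">The sketch deliberately avoids Python's argument-less <code title="">str.split()</code>, which also splits on characters such as U+00A0 NO-BREAK SPACE that are not in the assumed set.</p>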

    <p>When a user agent has to <dfn>remove a token from a string</dfn>,

    it must use the following algorithm:</p>

    <ol>

     <li><p>Let <var title="">input</var> be the string being

     modified.</p></li>

     <li><p>Let <var title="">token</var> be the token being removed. It

     will not contain any <span title="space character">space

     characters</span>.</p></li>

     <li><p>Let <var title="">output</var> be the output string,

     initially empty.</p></li>

     <li><p>Let <var title="">position</var> be a pointer into <var

     title="">input</var>, initially pointing at the start of the

     string.</p></li>

     <li><p>If <var title="">position</var> is beyond the end of <var

     title="">input</var>, set the string being modified to <var

     title="">output</var>, and abort these steps.</p></li>

     <li><p>If the character at <var title="">position</var> is a

     <span>space character</span>:</p>

      <ol>

       <li><p>Append the character at <var title="">position</var> to

       the end of <var title="">output</var>.</p></li>

       <li><p>Increment <var title="">position</var> so it points at the

       next character in <var title="">input</var>.</p></li>

       <li><p>Return to step 5 in the overall set of steps.</p></li>

      </ol>

     </li>

     <li><p>Otherwise, the character at <var title="">position</var> is

     the first character of a token. <span>Collect a sequence of

     characters</span> that are not <span title="space character">space

     characters</span>, and let that be <var title="">s</var>.</p></li>

     <li><p>If <var title="">s</var> is exactly equal to <var

     title="">token</var>, then:</p>

      <ol>

       <li><p><span>Skip whitespace</span> (in <var

       title="">input</var>).</p></li>

       <li><p>Remove any <span title="space character">space

       characters</span> currently at the end of <var

       title="">output</var>.</p></li>

       <li><p>If <var title="">position</var> is not past the end of

       <var title="">input</var>, and <var title="">output</var> is not

       the empty string, append a single U+0020 SPACE character at the

       end of <var title="">output</var>.</p></li>

      </ol>

     </li>

     <li><p>Otherwise, append <var title="">s</var> to the end of <var

     title="">output</var>.</p></li>

     <li><p>Return to step 5 in the overall set of steps.</p></li>

    </ol>

    <p class="note">This causes any occurrences of the token to be

    removed from the string, and any spaces that were surrounding the

    token to be collapsed to a single space, except at the start and end

    of the string, where such spaces are removed.</p>
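
    <p class="note">The removal steps above might be sketched as follows. This is a non-normative illustration in Python. The name <code title="">remove_token</code> is hypothetical, the contents of <code title="">SPACE_CHARACTERS</code> are an assumption, and the sketch returns the new string instead of modifying its input. The loop re-checks for the end of the input before reading each character.</p>

```python
# Assumption: space, tab, line feed, form feed and carriage return.
SPACE_CHARACTERS = " \t\n\f\r"


def remove_token(text, token):
    """Return `text` with every occurrence of `token` removed."""
    output = ""
    position = 0
    while position < len(text):
        if text[position] in SPACE_CHARACTERS:
            # Copy space characters through unchanged.
            output += text[position]
            position += 1
            continue
        # Collect the token that starts at `position`.
        start = position
        while position < len(text) and text[position] not in SPACE_CHARACTERS:
            position += 1
        s = text[start:position]
        if s == token:
            # Skip whitespace after the removed token.
            while position < len(text) and text[position] in SPACE_CHARACTERS:
                position += 1
            # Drop whitespace that preceded the removed token.
            output = output.rstrip(SPACE_CHARACTERS)
            # Rejoin the neighbours with one space, except at either end.
            if position < len(text) and output != "":
                output += " "
        else:
            output += s
    return output
```

    <p class="note">For instance, <code title="">remove_token("b a b c b", "b")</code> yields <code title="">"a c"</code>, and <code title="">remove_token("abc ab", "ab")</code> yields <code title="">"abc"</code> because only whole tokens are compared.</p>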

    <h4>Keywords and enumerated attributes</h4>

    <p>Some attributes are defined as taking one of a finite set of

    keywords. Such attributes are called <dfn title="enumerated

    attribute">enumerated attributes</dfn>. The keywords are each

    defined to map to a particular <em>state</em> (several keywords

    might map to the same state, in which case some of the keywords are

    synonyms of each other; additionally, some of the keywords can be

    said to be non-conforming, and are only in the specification for

    historical reasons). In addition, two default states can be

    given. The first is the <em>invalid value default</em>, the second

    is the <em>missing value default</em>.</p>

    <p>If an enumerated attribute is specified, the attribute's value

    must be an <span>ASCII case-insensitive</span> match for one of the

    given keywords that are not said to be non-conforming, with no

    leading or trailing whitespace.</p>

    <p>When the attribute is specified, if its value is an <span>ASCII

    case-insensitive</span> match for one of the given keywords, then

    that keyword's state is the state that the attribute represents. If

    the attribute value matches none of the given keywords, but the

    attribute has an <em>invalid value default</em>, then the attribute

    represents that state. Otherwise, if the attribute value matches

    none of the keywords but there is a <em>missing value default</em>

    state defined, then <em>that</em> is the state represented by the

    attribute. Otherwise, there is no default, and invalid values must

    be ignored.</p>

    <p>When the attribute is <em>not</em> specified, if there is a

    <em>missing value default</em> state defined, then that is the state

    represented by the (missing) attribute. Otherwise, the absence of

    the attribute means that there is no state represented.</p>
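
    <p class="note">The way an enumerated attribute's state is determined can be sketched as follows. This is a non-normative illustration in Python. The names <code title="">ascii_lower</code> and <code title="">resolve_state</code> are hypothetical, and <code title="">None</code> stands both for "attribute not specified" on input and for "no state represented" on output.</p>

```python
def ascii_lower(s):
    """Lowercase only the ASCII letters A to Z, leaving all else alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def resolve_state(value, keywords, invalid_default=None, missing_default=None):
    """Return the state an enumerated attribute represents.

    `value` is the attribute's value, or None when it is not specified.
    `keywords` maps each keyword to its state; several keywords may map
    to the same state.
    """
    if value is None:
        # Attribute not specified: use the missing value default, if any.
        return missing_default
    folded = ascii_lower(value)
    for keyword, state in keywords.items():
        if ascii_lower(keyword) == folded:
            return state
    # No keyword matched: prefer the invalid value default, then fall
    # back to the missing value default; otherwise the value is ignored.
    if invalid_default is not None:
        return invalid_default
    return missing_default
```

    <p class="note">The comparison uses a hand-written ASCII fold instead of <code title="">str.lower()</code> so that non-ASCII characters such as U+212A KELVIN SIGN do not match the keyword <code title="">k</code>.</p>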

    <p class="note">The empty string can be one of the keywords in some

    cases. For example, the <code

    title="attr-contenteditable">contenteditable</code> attribute has

    two states: <em>true</em>, matching the <code title="">true</code>

    keyword and the empty string; and <em>false</em>, matching <code

    title="">false</code> and all other keywords (it's the <em>invalid

    value default</em>). It could further be thought of as having a

    third state <em>inherit</em>, which would be the default when the

    attribute is not specified at all (the <em>missing value

    default</em>), but for various reasons that isn't the way this

    specification actually defines it.</p>

    <h4 id="syntax-references">References</h4>

    <p>A <dfn>valid hash-name reference</dfn> to an element of type <var

    title="">type</var> is a string consisting of a U+0023 NUMBER SIGN

    (<code title="">#</code>) character followed by a string which

    exactly matches the value of the <code title="">name</code>

    attribute of an element in the document with type <var

    title="">type</var>.</p>

    <p>The <dfn>rules for parsing a hash-name reference</dfn> to an

    element of type <var title="">type</var> are as follows:</p>

    <ol>

     <li><p>If the string being parsed does not contain a U+0023 NUMBER

     SIGN character, or if the first such character in the string is the

     last character in the string, then return null and abort these

     steps.</p></li>
