HTML Standard

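This standard defines a large part of the web platform in considerable detail. Its place in the web platform's stack of standards, relative to other standards, can be summed up as follows: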

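1.2 Is this HTML5?

In more detail, "HTML5" in the broad sense refers to a collection of modern web technologies. Many of these are developed at the WHATWG, and this document is one of them. The others are listed in the WHATWG Standards overview.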

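1.3 Background

HTML is the World Wide Web's core markup language. Originally, HTML was primarily designed as a language for semantically describing scientific documents. Its general design, however, has enabled it to be adapted, over the subsequent years, to describe a number of other types of documents and even applications.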

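1.4 Audience

This standard is intended for authors of documents and scripts that use the features defined in this standard, implementers of tools that operate on pages that use those features, and individuals wishing to establish the correctness of documents or implementations with respect to the requirements of this standard.

This document is probably not suited to readers who do not already have at least a passing familiarity with web technologies, as in places it sacrifices clarity for precision, and brevity for completeness. More approachable tutorials and authoring guides can provide a gentler introduction to the topic.

In particular, familiarity with the basics of DOM is necessary for a complete understanding of some of the more technical parts of this standard. An understanding of Web IDL, HTTP, XML, Unicode, character encodings, JavaScript, and CSS will also be helpful in places but is not essential.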

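1.5 Scope

This standard is limited to providing a semantic-level markup language and associated semantic-level scripting APIs for authoring accessible pages on the web, ranging from static documents to dynamic applications.

The scope of this standard does not include providing mechanisms for media-specific customization of presentation (although default rendering rules for web browsers are included at the end of this standard, and several mechanisms for hooking into CSS are provided as part of the language).

The scope of this standard is not to describe an entire operating system. In particular, hardware configuration software, image manipulation tools, and applications that users would be expected to use with high-end workstations on a daily basis are out of scope. In terms of applications, this standard is targeted specifically at applications that would be expected to be used on an occasional basis, or regularly but from disparate locations, with low CPU requirements. Examples of such applications include online purchasing systems, searching systems, games (especially multiplayer online games), public telephone books or address books, communications software (email clients, instant messaging clients, discussion software), document editing software, etc.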

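1.6 History

For its first five years (1990-1995), HTML went through a number of revisions and experienced a number of extensions, primarily hosted first at CERN, and then at the IETF.

With the creation of the W3C, the development of HTML changed venue again. A first abortive attempt at extending HTML in 1995, known as HTML 3.0, made way for a more pragmatic approach known as HTML 3.2, which was completed in 1997. Work on HTML4 quickly followed later that same year.

The following year, the W3C membership decided to stop evolving HTML and instead begin work on an XML-based equivalent, called XHTML. This effort started with a reformulation of HTML4 in XML, known as XHTML 1.0, which added no new features except the new serialization, and which was completed in 2000. After XHTML 1.0, the W3C's focus turned to making it easier for other working groups to extend XHTML, under the banner of XHTML Modularization. In parallel with this, the W3C also worked on a new language that was not compatible with the earlier HTML and XHTML languages, calling it XHTML2.

Around the time that HTML's evolution was stopped in 1998, parts of the API for HTML developed by browser vendors were specified and published under the names DOM Level 1 (in 1998), DOM Level 2 Core, and DOM Level 2 HTML (starting in 2000 and culminating in 2003). These efforts then petered out, with some DOM Level 3 specifications published in 2004 but the working group being closed before all the Level 3 drafts were completed.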

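In 2003, the publication of XForms, a technology positioned as the next generation of web forms, sparked a renewed interest in evolving HTML itself, rather than finding replacements for it. This interest was born from the realization that XML's deployment as a web technology was limited to entirely new technologies (like RSS and later Atom), rather than as a replacement for existing deployed technologies (like HTML).

A proof of concept showing that it was possible to extend HTML4's forms to provide many of the features that XForms 1.0 introduced, without requiring browsers to implement rendering engines that were incompatible with existing HTML web pages, was the first result of this renewed interest. At this early stage, while the draft was already publicly available and input was already being solicited from all sources, the specification was only under Opera Software's copyright.

The idea that HTML's evolution should be reopened was tested at a W3C workshop, where some of the principles that underlie the HTML5 work (described below), as well as the aforementioned early draft proposal covering just forms-related features, were presented to the W3C jointly by Mozilla and Opera. The proposal was rejected on the grounds that it conflicted with the previously chosen direction for the web's evolution; the W3C staff and membership voted to continue developing XML-based replacements instead.

Shortly thereafter, Apple, Mozilla, and Opera jointly announced their intent to continue working on the effort under the umbrella of a new venue called the WHATWG. A public mailing list was created, and the draft was moved to the WHATWG site. The copyright was subsequently amended to be jointly owned by all three vendors, and to allow reuse of the specification.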

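The WHATWG was based on several core principles, in particular that technologies need to be backwards compatible, that specifications and implementations need to match even if this means changing the specification rather than the implementations, and that specifications need to be detailed enough that implementations can achieve complete interoperability without reverse-engineering each other.

The latter requirement in particular required that the scope of the HTML5 specification include what had previously been specified in three separate documents: HTML4, XHTML1, and DOM2 HTML. It also meant including significantly more detail than had previously been considered the norm.

In 2006, the W3C indicated an interest in participating in the development of HTML5 after all, and in 2007 formed a working group chartered to work with the WHATWG on the development of the HTML5 specification. Apple, Mozilla, and Opera allowed the W3C to publish the specification under the W3C copyright, while keeping a version with the less restrictive license on the WHATWG site.

For a number of years, both groups then worked together. In 2011, however, the groups came to the conclusion that they had different goals: the W3C wanted to publish a "finished" version of "HTML5", while the WHATWG wanted to continue working on a Living Standard for HTML, continuously maintaining the specification rather than freezing it in a state with known problems, and adding new features as needed to evolve the platform.

Since then, the WHATWG has been working on this specification (amongst others), and the W3C has been copying fixes made by the WHATWG into their fork of the document (which also has other changes).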

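1.7 Design notes

HTML, its supporting DOM APIs, and many of its supporting technologies have been developed over a period of several decades by a wide array of people with different priorities who, in many cases, did not know of each other's existence.

Features have thus arisen from many sources, and have not always been designed in especially consistent ways. Furthermore, because of the unique characteristics of the web, implementation bugs have often become de facto, and now de jure, standards, as content is often unintentionally written in ways that rely on them before they can be fixed.

Despite all this, efforts have been made to adhere to certain design goals. These are described in the next few subsections.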

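1.7.1 Serializability of script execution

To avoid exposing web authors to the complexities of multithreading, the HTML and DOM APIs are designed such that no script can ever detect the simultaneous execution of other scripts. Even with workers, the intent is that the behavior of implementations can be thought of as completely serializing the execution of all scripts in all browsing contexts.

1.7.2 Compliance with other specifications

This standard interacts with and relies on a wide variety of other standards. In certain circumstances, unfortunately, conflicting needs have led to this standard violating the requirements of these other standards. Whenever this has occurred, each transgression has been noted as a "willful violation", and the reason for the violation has been noted.

1.7.3 Extensibility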

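1.8 HTML vs XML syntax

This standard defines an abstract language for describing documents and applications, and some APIs for interacting with in-memory representations of resources that use this language.

There are various concrete syntaxes that can be used to transmit resources that use this abstract language, two of which are defined in this standard.

The first such concrete syntax is the HTML syntax. This is the format suggested for most authors. It is compatible with most legacy web browsers. If a document is transmitted with the text/html MIME type, then it will be processed as an HTML document by web browsers. This standard defines the latest HTML syntax, known simply as "HTML".

The second concrete syntax is XML. When a document is transmitted with an XML MIME type, such as application/xhtml+xml, then it is treated as an XML document by web browsers, to be parsed by an XML processor. Authors are reminded that the processing for XML and HTML differs; in particular, even minor syntax errors will prevent a document labeled as XML from being rendered fully, whereas they would be ignored in the HTML syntax.

The XML syntax for HTML was formerly referred to as "XHTML", but this standard does not use that term (among other reasons, because no such term is used for the HTML syntaxes of MathML and SVG).

The DOM, the HTML syntax, and the XML syntax cannot all represent the same content. For example, namespaces cannot be represented using the HTML syntax, but they are supported in the DOM and in the XML syntax. Similarly, documents that use the noscript feature can be represented using the HTML syntax, but cannot be represented with the DOM or in the XML syntax. Comments that contain the string "-->" can only be represented in the DOM, not in the HTML and XML syntaxes.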

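1.9 Structure of this specification

1.9.1 How to read this specification

This standard should be read like all other standards. First, it should be read cover-to-cover, multiple times. Then, it should be read backwards at least once. Then it should be read by picking random sections from the contents list and following all the cross-references.

As described in the conformance requirements section below, this standard describes conformance criteria for a variety of conformance classes. In particular, there are conformance requirements that apply to producers, for example authors and the documents they create, and there are conformance requirements that apply to consumers, for example web browsers. They can be distinguished by what they are requiring: a requirement on a producer states what is allowed, while a requirement on a consumer states how software is to act.

1.9.2 Typographic conventions

The defining instance of an element, attribute, or API is marked up like this. References to that element, attribute, or API are marked up like this.

In some cases, requirements are given in the form of lists with conditions and corresponding requirements. In such cases, the requirements that apply to a condition are always the first set of requirements that follow the condition, even when there are multiple sets of conditions for those requirements. Such cases are presented as follows: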

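1.10 A quick introduction to HTML

A basic HTML document consists of a tree of elements and text. Each element is denoted in the source by a start tag, such as "<body>", and an end tag, such as "</body>". (Certain start tags and end tags can in certain cases be omitted.)

Tags have to be nested such that elements are all completely within each other, without overlapping: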
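The nesting rule can be illustrated with a small stack-based check. This is only a sketch and not the HTML parsing algorithm: the function name `isProperlyNested` is hypothetical, and the code assumes every element has an explicit start tag and end tag (no void elements, no omitted tags, and no ">" inside attribute values).

```javascript
// Hypothetical helper: returns true when every end tag closes the most
// recently opened element, i.e. when no two elements overlap.
function isProperlyNested(markup) {
  const openElements = [];
  // Matches start tags (<em ...>) and end tags (</em>); comments are skipped.
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  for (const [, slash, name] of markup.matchAll(tagPattern)) {
    const tagName = name.toLowerCase();
    if (slash === '') {
      openElements.push(tagName);           // start tag: remember it
    } else if (openElements.pop() !== tagName) {
      return false;                         // end tag closes the wrong element
    }
  }
  return openElements.length === 0;         // nothing may be left open
}

const nested = isProperlyNested('<p>This is <em>very <strong>correct</strong>.</em></p>');
const overlapping = isProperlyNested('<p>This is <em>very <strong>wrong</em>!</strong></p>');
```

Here `nested` is true and `overlapping` is false, because the em end tag appears while strong is still open.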

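This standard defines a set of elements that can be used in HTML, along with rules about the ways in which the elements can be nested.

Elements can have attributes, which control how the elements work. In the example below, there is a hyperlink, formed using the a element and its href attribute:

Attributes are placed inside the start tag, and consist of a name and a value, separated by an "=" character. The attribute value can remain unquoted if it doesn't contain ASCII whitespace or any of " ' ` = < or >. Otherwise, it has to be quoted using either single or double quotes. The value, along with the "=" character, can be omitted altogether if the value is the empty string.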
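The quoting rules above can be expressed as a short decision procedure. The following is a minimal sketch under stated assumptions: `serializeAttribute` is a hypothetical helper, it does not validate attribute names, and the only character reference it emits is `&quot;` for the case where a value contains both kinds of quote.

```javascript
// Hypothetical helper that picks the shortest attribute syntax allowed
// by the rules described above.
function serializeAttribute(name, value) {
  // Empty value: the "=" and the value may be omitted altogether.
  if (value === '') return name;
  // No ASCII whitespace and none of " ' ` = < > : the value may stay unquoted.
  if (!/[\t\n\f\r "'`=<>]/.test(value)) return `${name}=${value}`;
  // Otherwise quote it, preferring whichever quote the value does not contain.
  if (!value.includes('"')) return `${name}="${value}"`;
  if (!value.includes("'")) return `${name}='${value}'`;
  // Both quote characters present: escape the double quotes.
  return `${name}="${value.replace(/"/g, '&quot;')}"`;
}

const examples = [
  serializeAttribute('disabled', ''),         // disabled
  serializeAttribute('href', 'demo.html'),    // href=demo.html
  serializeAttribute('title', 'Simple demo'), // title="Simple demo"
  serializeAttribute('title', 'say "hi"'),    // title='say "hi"'
];
```

Always quoting is equally valid; the sketch only shows where the syntax permits the shorter forms.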

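HTML user agents (e.g., web browsers) then parse this markup, turning it into a DOM (Document Object Model) tree. A DOM tree is an in-memory representation of a document.

The document element of this tree is the html element, which is the element always found in that position in HTML documents. It contains two elements, head and body, as well as a Text node between them.

There are many more Text nodes in the DOM tree than one would initially expect, because the source contains a number of spaces (represented here by "␣") and line breaks ("⏎") that all end up as Text nodes in the DOM. However, for historical reasons, not all of the spaces and line breaks in the original markup appear in the DOM. In particular, all the whitespace before the head start tag ends up being dropped silently, and all the whitespace after the body end tag ends up placed at the end of the body.

The head element contains a title element, which itself contains a Text node with the text "Sample page". Similarly, the body element contains an h1 element, a p element, and a comment.

This DOM tree can be manipulated from scripts in the page. Scripts (typically in JavaScript) are small programs that can be embedded using the script element or using event handler content attributes. For example, here is a form with a script that sets the value of the form's output element to say "Hello World":

Each element in the DOM tree is represented by an object, and these objects have APIs so that they can be manipulated. For instance, a link (e.g. the a element in the tree above) can have its "href" attribute changed in several ways:

Since DOM trees are used as the way to represent HTML documents when they are processed and presented by implementations (especially interactive implementations like web browsers), this standard is mostly phrased in terms of DOM trees, instead of the markup described above.

HTML documents represent a media-independent description of interactive content. HTML documents might be rendered to a screen, or through a speech synthesizer, or on a braille display. To influence exactly how such rendering takes place, authors can use a styling language such as CSS.

To learn more about using HTML, authors are encouraged to consult tutorials and guides. Some of the examples included in this standard might also be of use, but the novice author is cautioned that this standard, by necessity, defines the language with a level of detail that might be difficult to understand at first.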

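1.10.1 Writing secure applications with HTML

When HTML is used to create interactive sites, care needs to be taken to avoid introducing vulnerabilities through which attackers can compromise the integrity of the site itself or of the site's users.

A comprehensive study of this matter is beyond the scope of this document, and authors are strongly encouraged to study the matter in more detail. However, this section attempts to provide a quick introduction to some common pitfalls in HTML application development.

The security model of the web is based on the concept of "origins", and correspondingly many of the potential attacks on the web involve cross-origin actions. [ORIGIN]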

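1.10.2 Common pitfalls to avoid when using the scripting APIs

Scripts in HTML have "run-to-completion" semantics, meaning that the browser will generally run the script uninterrupted before doing anything else, such as firing further events or continuing to parse the document.

On the other hand, parsing of HTML files happens incrementally, meaning that the parser can pause at any point to let scripts run. This is generally a good thing, but it does mean that authors need to be careful to avoid hooking event handlers after the events could have possibly fired.

There are two techniques for doing this reliably: use event handler content attributes, or create the element and add the event handlers in the same script. The latter is safe because, as mentioned earlier, scripts are run to completion before further events can fire.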
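The ordering hazard can be reproduced outside a browser with a plain `EventTarget`. This is an analogy rather than real page code: the two targets below are hypothetical stand-ins for an element such as an img, and the sketch assumes a runtime that exposes the `EventTarget` and `Event` globals (browsers, or a recent Node.js).

```javascript
// Pitfall: the event fires before the handler is attached.
const lateTarget = new EventTarget();   // stand-in for an element that already exists
const lateLog = [];
lateTarget.dispatchEvent(new Event('load'));                      // event fires first
lateTarget.addEventListener('load', () => lateLog.push('load'));  // too late, never called

// Safe pattern: create the target and attach the handler in the same script.
// Because scripts run to completion, nothing can fire in between.
const safeTarget = new EventTarget();
const safeLog = [];
safeTarget.addEventListener('load', () => safeLog.push('load'));
safeTarget.dispatchEvent(new Event('load'));                      // handler observes the event
```

After this runs, `lateLog` is empty while `safeLog` contains one entry. In a page, the same pattern corresponds to creating an element in script and setting its handler before the action that can trigger the event.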

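1.10.3 How to catch mistakes when writing HTML: validators and conformance checkers

1.11 Conformance requirements for authors

Unlike previous versions of the HTML specification, this specification defines in some detail the required processing for invalid documents as well as valid documents.

However, even though the processing of invalid content is in most cases well-defined, conformance requirements for documents are still important: in practice, interoperability (the situation in which all implementations process particular content in a reliable and identical or equivalent way) is not the only goal of document conformance requirements. This section details some of the more common reasons for still distinguishing between a conforming document and one with errors.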

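1.11.1 Presentational markup

The majority of presentational features from previous versions of HTML are no longer allowed. Presentational markup in general has been found to have a number of problems:

For those reasons, presentational markup has been removed from HTML in this version. This change should not come as a surprise; HTML4 deprecated presentational markup many years ago and provided a mode (HTML4 Transitional) to help authors move away from presentational markup; later, XHTML 1.1 went further and obsoleted those features altogether.

The only remaining presentational markup features in HTML are the style attribute and the style element. Use of the style attribute is somewhat discouraged in production environments, but it can be useful for rapid prototyping (where its rules can be directly moved into a separate style sheet later) and for providing specific styles in unusual cases where a separate style sheet would be inconvenient. Similarly, the style element can be useful in syndication or for page-specific styles, but in general an external style sheet is likely to be more convenient when the styles apply to multiple pages.

It is also worth noting that some elements that were previously presentational have been redefined in this standard to be media-independent: b, i, hr, s, small, and u.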

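1.11.2 Syntax errors

Some authors find it helpful to be in the practice of always quoting all attributes and always including all optional tags, preferring the consistency derived from such custom over the minor benefits of terseness afforded by making use of the flexibility of the HTML syntax. To aid such authors, conformance checkers can provide modes of operation wherein such conventions are enforced.

1.11.3 Restrictions on content models and on attribute values

Beyond the syntax of the language, this standard also places restrictions on how elements and attributes can be specified. These restrictions are present for similar reasons: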

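1.12 Suggested reading

2 Common infrastructure

2.1 Terminology

This standard refers to both HTML and XML attributes and IDL attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and IDL attributes for those defined on IDL interfaces. Similarly, the term "properties" is used for both JavaScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.

Generally, when this standard states that a feature applies to the HTML syntax or the XML syntax, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XML)".

This standard uses the term document to refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications. The term is used to refer both to Document objects and their descendant DOM trees, and to serialized byte streams using the HTML syntax or the XML syntax, depending on context.

In the context of the DOM structures, the terms HTML document and XML document are used as defined in DOM, and refer specifically to two different modes that Document objects can find themselves in. [DOM] (Such uses are always hyperlinked to their definition.)

For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.

Saying that an element is visible does not only mean that it can be seen on a display; for example, a screen reader is likewise expected to read that element to the user.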

2.1.1 Parallelism

To run steps in parallel means those steps are to be run, one after another, at the same time as other logic in the standard (e.g., at the same time as the event loop). This standard does not define the precise mechanism by which this is achieved, be it time-sharing cooperative multitasking, fibers, threads, processes, using different hyperthreads, cores, CPUs, machines, etc. By contrast, an operation that is to run immediately must interrupt the currently running task, run itself, and then resume the previously running task.

To avoid race conditions between different in parallel algorithms that operate on the same data, a parallel queue can be used.

A parallel queue represents a queue of algorithm steps that must be run in series.

Steps running in parallel can themselves run other steps in parallel. E.g., inside a parallel queue it can be useful to run a series of steps in parallel with the queue.
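The serialization guarantee of a parallel queue can be sketched with a promise chain. JavaScript has no direct equivalent of the specification's "in parallel", so this only models the property that enqueued steps never overlap and run in the order they were enqueued; the class name `ParallelQueue` and the method `enqueueSteps` are illustrative assumptions, not an API defined by this standard.

```javascript
// Illustrative model of a parallel queue: each set of steps starts only
// after the previously enqueued steps have finished.
class ParallelQueue {
  constructor() {
    this.tail = Promise.resolve();   // completion of the last enqueued steps
  }

  // Enqueue steps (a function, possibly async). Returns a promise that
  // settles with the outcome of those steps.
  enqueueSteps(steps) {
    const result = this.tail.then(() => steps());
    // A failure in one set of steps must not stall the rest of the queue.
    this.tail = result.catch(() => {});
    return result;
  }
}

const queue = new ParallelQueue();
const log = [];
queue.enqueueSteps(() => log.push('first'));
queue.enqueueSteps(() => log.push('second'));
// Nothing has run yet at this point: the steps run off the current task.
const ranSynchronously = log.length > 0;
```

Two algorithms that touch the same data can share one such queue, which avoids races between them without any explicit locking.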

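2.1.2 Resources

This standard uses the term supported when referring to whether a user agent has an implementation capable of decoding the semantics of an external resource. A format or type is said to be supported if the implementation can process an external resource of that format or type without critical aspects of the resource being ignored. Whether a specific resource is supported can depend on what features of the resource's format are in use.

For example, a PNG image would be considered to be in a supported format if its pixel data could be decoded and rendered, even if, unbeknownst to the implementation, the image also contained animation data.

An MPEG-4 video file would not be considered to be in a supported format if the compression format used was not supported, even if the implementation could determine the dimensions of the movie from the file's metadata.

What some specifications, in particular the HTTP specifications, refer to as a representation is referred to in this standard as a resource. [HTTP]

A resource's critical subresources are those that the resource needs to have available to be correctly processed. Which resources are considered critical or not is defined by the specification that defines the resource's format.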

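2.1.3 XML compatibility

Unless otherwise stated, all elements defined or mentioned in this standard are in the HTML namespace ("http://www.w3.org/1999/xhtml"), and all attributes defined or mentioned in this standard have no namespace.

The term element type is used to refer to the set of elements that have a given local name and namespace. For example, button elements are elements with the element type button, meaning they have the local name "button" and (implicitly, as defined above) the HTML namespace.

Attribute names are said to be XML-compatible if they match the Name production defined in XML and they contain no U+003A COLON characters (:). [XML]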
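The XML-compatible check can be written as a regular expression built from the NameStartChar and NameChar ranges of the XML 1.0 Name production. This is a sketch: the helper name `isXmlCompatibleName` is hypothetical, and the character ranges are transcribed from XML 1.0 (Fifth Edition) and are worth verifying against that specification before relying on them.

```javascript
// NameStartChar ranges from the XML 1.0 Name production (note that ":" is included).
const NAME_START_CHAR =
  ':A-Z_a-z\\u{C0}-\\u{D6}\\u{D8}-\\u{F6}\\u{F8}-\\u{2FF}\\u{370}-\\u{37D}' +
  '\\u{37F}-\\u{1FFF}\\u{200C}-\\u{200D}\\u{2070}-\\u{218F}\\u{2C00}-\\u{2FEF}' +
  '\\u{3001}-\\u{D7FF}\\u{F900}-\\u{FDCF}\\u{FDF0}-\\u{FFFD}\\u{10000}-\\u{EFFFF}';
// NameChar adds "-", ".", digits, and a few combining ranges.
const NAME_CHAR = NAME_START_CHAR + '\\-.0-9\\u{B7}\\u{300}-\\u{36F}\\u{203F}-\\u{2040}';
const XML_NAME = new RegExp(`^[${NAME_START_CHAR}][${NAME_CHAR}]*$`, 'u');

// Hypothetical helper: an XML Name that additionally contains no U+003A COLON.
function isXmlCompatibleName(name) {
  return XML_NAME.test(name) && !name.includes(':');
}

const dataAttribute = isXmlCompatibleName('data-foo');    // true
const prefixedName = isXmlCompatibleName('xlink:href');   // false: contains a colon
const leadingDigit = isXmlCompatibleName('1st');          // false: not a valid Name
```

The colon is excluded separately because the Name production itself permits it.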

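2.1.4 DOM trees

When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.

A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it.

The term empty, when used for an attribute value, Text node, or string, means that the length of the text is zero (i.e., not even containing controls or U+0020 SPACE).

A node A is inserted into a node B when the insertion steps are invoked with A as the argument and A's new parent is B. Similarly, a node A is removed from a node B when the removing steps are invoked with A as the removedNode argument and B as the oldParent argument.

A node becomes connected when the insertion steps are invoked with it as the argument and it is now connected. Analogously, a node becomes disconnected when the removing steps are invoked with it as the argument and it is now no longer connected.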

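2.1.5 Scripting

The construction "a Foo object", where Foo is actually an interface, is sometimes used instead of the more accurate "an object implementing the interface Foo".

An IDL attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.

If a DOM object is said to be live, then the attributes and methods on that object must operate on the actual underlying data, not a snapshot of the data.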

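2.1.6 Plugins

The term plugin refers to an implementation-defined set of content handlers used by the user agent that can take part in the user agent's rendering of a Document object, but that neither act as child browsing contexts of the Document nor introduce any Node objects to the Document's DOM.

Typically such content handlers are provided by third parties, though a user agent can also designate built-in content handlers as plugins.

One example of a plugin would be a PDF viewer that is instantiated in a browsing context when the user navigates to a PDF file. This would count as a plugin regardless of whether the party that implemented the PDF viewer component was the same as that which implemented the user agent itself. However, a PDF viewer application that launches separately from the user agent (as opposed to using the same interface) is not a plugin by this definition.

This standard does not define a mechanism for interacting with plugins, as it is expected to be user-agent- and platform-specific. Some UAs might opt to support a plugin mechanism such as the Netscape Plugin API; others might use remote content converters or have built-in support for certain types. Indeed, this standard doesn't require user agents to support plugins at all. [NPAPI]

For example, a secured plugin would prevent its contents from creating popup windows when the plugin is instantiated inside a sandboxed iframe.

Browsers should take extreme care when interacting with external content intended for plugins. When third-party software is run with the same privileges as the user agent itself, vulnerabilities in the third-party software become as dangerous as those in the user agent.

Since different users having different sets of plugins provides a tracking vector that increases the chances of users being uniquely identified, user agents are encouraged to support the exact same set of plugins for each user.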

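2.1.7 Character encodings

A character encoding, or just encoding where that is not ambiguous, is a defined way to convert between byte streams and Unicode strings, as defined in Encoding. An encoding has an encoding name and one or more encoding labels, referred to as the encoding's name and labels in the Encoding standard. [ENCODING]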

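2.1.8 Conformance classes

This standard describes the conformance criteria for user agents (relevant to implementers) and documents (relevant to authors and authoring tool implementers).

Conforming documents are those that comply with all the conformance criteria for documents. For readability, some of these conformance requirements are phrased as conformance requirements on authors; such requirements are implicitly requirements on documents: by definition, all documents are assumed to have had an author. (In some cases, that author may itself be a user agent; such user agents are subject to additional rules, as explained below.)

For example, if a requirement states that "authors must not use the foobar element", it would imply that documents are not allowed to contain elements named foobar.

There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this standard applies to implementations regardless of the conformity of the input documents.

User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g., to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.

For compatibility with existing content and prior specifications, this standard describes two authoring formats: one based on XML, and one using a custom format inspired by SGML (referred to as the HTML syntax). Implementations must support at least one of these two formats, although supporting both is encouraged.

Some conformance requirements are phrased as requirements on elements, attributes, methods, or objects. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the second category are requirements on user agents. Similarly, some conformance requirements are phrased as requirements on authors; such requirements are to be interpreted as conformance requirements on the documents that authors produce. (In other words, this standard does not distinguish between conformance criteria on authors and conformance criteria on documents.)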

2.1.9 Dependencies

This specification relies on several other underlying specifications.

Infra

The following terms are defined in Infra: [INFRA]

The general iteration terms while, continue, and break.
implementation-defined
tracking vector
code point and its synonym character
surrogate
scalar value
tuple
noncharacter
string, code unit, length, and code point length
The string equality operations is and identical to
scalar value string
ASCII whitespace
control
ASCII digit
ASCII upper hex digit
ASCII lower hex digit
ASCII hex digit
ASCII upper alpha
ASCII lower alpha
ASCII alpha
ASCII alphanumeric
isomorphic decode
ASCII lowercase
ASCII uppercase
ASCII case-insensitive
strip newlines
normalize newlines
strip leading and trailing ASCII whitespace
strip and collapse ASCII whitespace
split a string on ASCII whitespace
split a string on commas
collect a sequence of code points and its associated position variable
skip ASCII whitespace
The ordered map data structure and the associated definitions for value, entry, exists, getting the value of an entry, setting the value of an entry, removing an entry, clear, getting the keys, size, and iterate
The list data structure and the associated definitions for append, extend, replace, remove, empty, contains, size, is empty, iterate, and clone
The stack data structure and the associated definitions for push and pop
The queue data structure and the associated definitions for enqueue and dequeue
The ordered set data structure and the associated definition for append and union
The struct specification type and the associated definition for item
The forgiving-base64 encode and forgiving-base64 decode algorithms
HTML namespace
MathML namespace
SVG namespace
XLink namespace
XML namespace
XMLNS namespace

Unicode and Encoding

The Unicode character set is used to represent textual data, and Encoding defines requirements around character encodings. [UNICODE]

This specification introduces terminology based on the terms defined in those specifications, as described earlier.

The following terms are used as defined in Encoding: [ENCODING]

Getting an encoding
Get an output encoding
The generic decode algorithm which takes a byte stream and an encoding and returns a character stream
The UTF-8 decode algorithm which takes a byte stream and returns a character stream, additionally stripping one leading UTF-8 Byte Order Mark (BOM), if any
The UTF-8 decode without BOM algorithm which is identical to UTF-8 decode except that it does not strip one leading UTF-8 Byte Order Mark (BOM)
The encode algorithm which takes a character stream and an encoding and returns a byte stream
The UTF-8 encode algorithm which takes a character stream and returns a byte stream
The BOM sniff algorithm which takes a byte stream and returns an encoding or null.
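The difference between decoding with and without BOM stripping can be observed from script through the `TextDecoder` interface defined in Encoding. The sketch below assumes a runtime that exposes the `TextDecoder` and `TextEncoder` globals; it approximates the listed algorithms through that API rather than invoking them directly.

```javascript
// U+FEFF encoded as UTF-8 (EF BB BF), followed by "hi".
const bytes = new Uint8Array([0xEF, 0xBB, 0xBF, 0x68, 0x69]);

// Default behavior: one leading BOM is stripped from the output.
const bomStripped = new TextDecoder('utf-8').decode(bytes);

// ignoreBOM: true keeps the BOM as a U+FEFF code point in the output.
const bomKept = new TextDecoder('utf-8', { ignoreBOM: true }).decode(bytes);

// Encoding never adds a BOM: "hi" becomes exactly two bytes.
const encoded = new TextEncoder().encode('hi');
```

Here `bomStripped` has length 2 and `bomKept` has length 3, with U+FEFF as its first code unit.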

XML and related specifications

Implementations that support the XML syntax for HTML must support some version of XML, as well as its corresponding namespaces specification, because that syntax uses an XML serialization with namespaces. [XML] [XMLNS]

Data mining tools and other user agents that perform operations on content without running scripts, evaluating CSS or XPath expressions, or otherwise exposing the resulting DOM to arbitrary content, may "support namespaces" by just asserting that their DOM node analogues are in certain namespaces, without actually exposing the namespace strings.

In the HTML syntax, namespace prefixes and namespace declarations do not have the same effect as in XML. For instance, the colon has no special meaning in HTML element names.

The attribute with the name space in the XML namespace is defined by Extensible Markup Language (XML). [XML]

The Name production is defined in XML. [XML]

This specification also references the <?xml-stylesheet?> processing instruction, defined in Associating Style Sheets with XML documents. [XMLSSPI]

This specification also non-normatively mentions the XSLTProcessor interface and its transformToFragment() and transformToDocument() methods. [XSLTP]

URLs

The following terms are defined in URL: [URL]

host
public suffix
domain
IPv4 address
IPv6 address
URL
Origin of URLs
Absolute URL
Relative URL
registrable domain
The URL parser and basic URL parser as well as these parser states:
- scheme start state
- host state
- hostname state
- port state
- path start state
- query state
- fragment state
URL record, as well as its individual components:
- scheme
- username
- password
- host
- port
- path
- query
- fragment
- cannot-be-a-base-URL flag
- object
valid URL string
The cannot have a username/password/port concept
The URL serializer
The host parser
The host serializer
Host equals
URL equals
serialize an integer
Default encode set
component percent-encode set
UTF-8 percent-encode
percent-decode
set the username
set the password
The application/x-www-form-urlencoded format
The application/x-www-form-urlencoded serializer
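Several of the URL record components listed above are observable from script through the `URL` class defined in URL. The sketch below assumes a runtime with the `URL` global; the example addresses are illustrative only.

```javascript
// Parse an absolute URL and read back its components.
const parsed = new URL('https://user:secret@example.com:8080/docs/page?q=html#intro');
const components = {
  scheme: parsed.protocol,    // 'https:' (the API includes the trailing colon)
  username: parsed.username,  // 'user'
  password: parsed.password,  // 'secret'
  host: parsed.hostname,      // 'example.com'
  port: parsed.port,          // '8080'
  path: parsed.pathname,      // '/docs/page'
  query: parsed.search,       // '?q=html'
  fragment: parsed.hash,      // '#intro'
};

// A port equal to the scheme's default is dropped during parsing.
const defaultPort = new URL('https://example.com:443/').port;   // ''

// A relative URL string is resolved against a base URL.
const resolved = new URL('../a', 'https://example.com/x/y/z').href;
```

Note that the API getters add delimiters (":", "?", "#") that are not part of the corresponding URL record components.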

A number of schemes and protocols are referenced by this specification also:

The about: scheme [ABOUT]
The blob: scheme [FILEAPI]
The data: scheme [RFC2397]
The http: scheme [HTTP]
The https: scheme [HTTP]
The mailto: scheme [MAILTO]
The sms: scheme [SMS]
The urn: scheme [URN]

Media fragment syntax is defined in Media Fragments URI. [MEDIAFRAG]

HTTP and related specifications

The following terms are defined in the HTTP specifications: [HTTP]

`Accept` header
`Accept-Language` header
`Cache-Control` header
`Content-Disposition` header
`Content-Language` header
`Last-Modified` header
`Referer` header

The following terms are defined in HTTP State Management Mechanism: [COOKIES]

cookie-string
receives a set-cookie-string
`Cookie` header

The following term is defined in Web Linking: [WEBLINK]

`Link` header

The following terms are defined in Structured Field Values for HTTP: [STRUCTURED-FIELDS]

The following terms are defined in MIME Sniffing: [MIMESNIFF]

Fetch

The following terms are defined in Fetch: [FETCH]

ABNF
about:blank
An HTTP(S) scheme
A local scheme
A network scheme
A fetch scheme
CORS protocol
default `User-Agent` value
extract a MIME type
fetch
HTTP-redirect fetch
ok status
navigation request
network error
`Origin` header
`Cross-Origin-Resource-Policy` header
process response
getting a structured field value
set
get, decode, and split
terminate
cross-origin resource policy check
the RequestCredentials enumeration
the RequestDestination enumeration
the fetch() method
serialize a response URL for reporting
response and its associated:
- type
- url
- url list
- status
- header list
- body
- internal response
- CSP list
- location URL
request and its associated:
- url
- method
- header list
- body
- client
- URL list
- current URL
- reserved client
- replaces client id
- initiator
- destination
- potential destination
- translating a potential destination
- script-like destinations
- priority
- origin
- referrer
- synchronous flag
- mode
- credentials mode
- use-URL-credentials flag
- unsafe-request flag
- cache mode
- redirect mode
- referrer policy
- cryptographic nonce metadata
- integrity metadata
- parser metadata
- reload-navigation flag
- history-navigation flag

The following terms are defined in Referrer Policy: [REFERRERPOLICY]

referrer policy
The `Referrer-Policy` HTTP header
The parse a referrer policy from a `Referrer-Policy` header algorithm
The "no-referrer", "no-referrer-when-downgrade", "origin-when-cross-origin", and "unsafe-url" referrer policies

The following terms are defined in Mixed Content: [MIX]

a priori authenticated URL

Paint Timing

The following terms are defined in Paint Timing: [PAINTTIMING]

mark paint timing

Long Tasks

The following terms are defined in Long Tasks: [LONGTASKS]

report long tasks

Web IDL

The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in Web IDL. [WEBIDL]

The following terms are defined in Web IDL:

The Web IDL also defines the following types that are used in Web IDL fragments in this specification:

The term throw in this specification is used as defined in Web IDL. The DOMException type and the following exception names are defined by Web IDL and used by this specification:

When this specification requires a user agent to create a Date object representing a particular time (which could be the special value Not-a-Number), the milliseconds component of that time, if any, must be truncated to an integer, and the time value of the newly created Date object must represent the resulting truncated time.

For instance, given the time 23045 millionths of a second after 00:00 UTC on January 1st 2000, i.e. the time 2000-01-01T00:00:00.023045Z, then the Date object created representing that time would represent the same time as that created representing the time 2000-01-01T00:00:00.023Z, 45 millionths of a second earlier. If the given time is NaN, then the result is a Date object that represents a time value NaN (indicating that the object does not represent a specific instant of time).
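The truncation rule can be checked with a few lines of script. This is a minimal sketch: `createTruncatedDate` is a hypothetical helper that takes a time value in milliseconds since the epoch, possibly with a fractional part or NaN.

```javascript
// Hypothetical helper: drop any sub-millisecond part before creating the Date.
// Math.trunc(NaN) is NaN, so the NaN case yields an invalid Date.
function createTruncatedDate(timeInMilliseconds) {
  return new Date(Math.trunc(timeInMilliseconds));
}

const midnight = Date.UTC(2000, 0, 1);                       // 2000-01-01T00:00:00.000Z
const truncated = createTruncatedDate(midnight + 23.045);    // 23045 millionths of a second later
const isoString = truncated.toISOString();                   // '2000-01-01T00:00:00.023Z'
const notANumber = createTruncatedDate(NaN).getTime();       // NaN
```

The resulting object is indistinguishable from one created for 2000-01-01T00:00:00.023Z.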

JavaScript

Some parts of the language described by this specification only support JavaScript as the underlying scripting language. [JAVASCRIPT]

The term "JavaScript" is used to refer to ECMA-262, rather than the official term ECMAScript, since the term JavaScript is more widely known. Similarly, the MIME type used to refer to JavaScript in this specification is text/javascript, since that is the most commonly used type, despite it being an officially obsoleted type according to RFC 4329. [RFC4329]

The following terms are defined in the JavaScript specification and used in this specification:

active function object
agent and agent cluster
automatic semicolon insertion
candidate execution
The current Realm Record
early error
forward progress
invariants of the essential internal methods
JavaScript execution context
JavaScript execution context stack
JavaScript realm
EnvironmentRecord
NewTarget
running JavaScript execution context
surrounding agent
abstract closure
immutable prototype exotic object
Well-Known Symbols, including @@hasInstance, @@isConcatSpreadable, @@toPrimitive, and @@toStringTag
Well-Known Intrinsic Objects, including %Array.prototype%, %Error.prototype%, %EvalError.prototype%, %Function.prototype%, %JSON.parse%, %Object.prototype%, %Object.prototype.valueOf%, %RangeError.prototype%, %ReferenceError.prototype%, %SyntaxError.prototype%, %TypeError.prototype%, and %URIError.prototype%
The FunctionBody production
The Module production
The Pattern production
The Script production
The Type notation
The Completion Record specification type
The List and Record specification types
The Property Descriptor specification type
The Script Record specification type
The Cyclic Module Record specification type
The Source Text Module Record specification type and its Evaluate and Link methods
The ArrayCreate abstract operation
The Call abstract operation
The Construct abstract operation
The CopyDataBlockBytes abstract operation
The CreateByteDataBlock abstract operation
The CreateDataProperty abstract operation
The DetachArrayBuffer abstract operation
The EnumerableOwnPropertyNames abstract operation
The FinishDynamicImport abstract operation
The OrdinaryFunctionCreate abstract operation
The Get abstract operation
The GetActiveScriptOrModule abstract operation
The GetFunctionRealm abstract operation
The HasOwnProperty abstract operation
The HostEnqueuePromiseJob abstract operation
The HostEnsureCanCompileStrings abstract operation
The HostImportModuleDynamically abstract operation
The HostPromiseRejectionTracker abstract operation
The HostResolveImportedModule abstract operation
The InitializeHostDefinedRealm abstract operation
The IsAccessorDescriptor abstract operation
The IsCallable abstract operation
The IsConstructor abstract operation
The IsDataDescriptor abstract operation
The IsDetachedBuffer abstract operation
The IsSharedArrayBuffer abstract operation
The NewObjectEnvironment abstract operation
The NormalCompletion abstract operation
The OrdinaryGetPrototypeOf abstract operation
The OrdinarySetPrototypeOf abstract operation
The OrdinaryIsExtensible abstract operation
The OrdinaryPreventExtensions abstract operation
The OrdinaryGetOwnProperty abstract operation
The OrdinaryDefineOwnProperty abstract operation
The OrdinaryGet abstract operation
The OrdinarySet abstract operation
The OrdinaryDelete abstract operation
The OrdinaryOwnPropertyKeys abstract operation
The ObjectCreate abstract operation
The ParseModule abstract operation
The ParseScript abstract operation
The NewPromiseReactionJob abstract operation
The NewPromiseResolveThenableJob abstract operation
The RegExpBuiltinExec abstract operation
The RegExpCreate abstract operation
The RunJobs abstract operation
The SameValue abstract operation
The ScriptEvaluation abstract operation
The SetImmutablePrototype abstract operation
The ToBoolean abstract operation
The ToString abstract operation
The ToUint32 abstract operation
The TypedArrayCreate abstract operation
The Abstract Equality Comparison algorithm
The Strict Equality Comparison algorithm
The Atomics object
The Date class
The RegExp class
The SharedArrayBuffer class
The TypeError class
The RangeError class
The eval() function
The [[IsHTMLDDA]] internal slot
import()
import.meta
The HostGetImportMetaProperties abstract operation
The typeof operator
The delete operator
The TypedArray Constructors table

User agents that support JavaScript must also implement the ECMAScript Internationalization API. [JSINTL]

WebAssembly

The following term is defined in WebAssembly JavaScript Interface: [WASMJS]

WebAssembly.Module

DOM

The Document Object Model (DOM) is a representation — a model — of a document and its content. The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM. [DOM]

Implementations must support DOM and the events defined in UI Events, because this specification is defined in terms of the DOM, and some of the features are defined as extensions to the DOM interfaces. [DOM] [UIEVENTS]

In particular, the following features are defined in DOM: [DOM]

Attr interface
Comment interface
DOMImplementation interface
Document interface
DocumentOrShadowRoot interface
DocumentFragment interface
DocumentType interface
ChildNode interface
Element interface
attachShadow() method.
An element's shadow root
The retargeting algorithm
Node interface
NodeList interface
ProcessingInstruction interface
ShadowRoot interface
Text interface
node document concept
document type concept
host concept
The shadow root concept, and its delegates focus and available to element internals.
The shadow host concept
HTMLCollection interface, its length attribute, and its item() and namedItem() methods
The terms collection and represented by the collection
DOMTokenList interface, and its value attribute
createDocument() method
createHTMLDocument() method
createElement() method
createElementNS() method
getElementById() method
getElementsByClassName() method
appendChild() method
cloneNode() method
importNode() method
preventDefault() method
id attribute
setAttribute() method
textContent attribute
The tree, shadow tree, and node tree concepts
The tree order and shadow-including tree order concepts
The child concept
The root and shadow-including root concepts
The inclusive ancestor, shadow-including descendant, shadow-including inclusive descendant, and shadow-including inclusive ancestor concepts
The first child and next sibling concepts
The document element concept
The in a document tree, in a document (legacy), and connected concepts
The slot concept, and its name and assigned nodes
The assigned slot concept.
The find flattened slottables algorithm
The assign a slot algorithm
The pre-insert, insert, append, replace, replace all, string replace all, remove, and adopt algorithms for nodes
The insertion steps, removing steps, adopting steps, and children changed steps hooks for elements
The change, append, remove, replace, and set value algorithms for attributes
The attribute change steps hook for attributes
The attribute list concept
The data of a text node
The child text content of a node
The descendant text content of a node
Event interface
Event and derived interfaces constructor behavior
EventTarget interface
The activation behavior hook
The legacy-pre-activation behavior hook
The legacy-canceled-activation behavior hook
The create an event algorithm
The fire an event algorithm
The canceled flag
The dispatch algorithm
EventInit dictionary type
type attribute
target attribute
currentTarget attribute
bubbles attribute
cancelable attribute
composed attribute
composed flag
isTrusted attribute
initEvent() method
add an event listener
addEventListener() method
The remove an event listener and remove all event listeners algorithms
EventListener callback interface
The type of an event
An event listener and its type and callback
The encoding (herein the character encoding), mode, and content type of a Document
The distinction between XML documents and HTML documents
The terms quirks mode, limited-quirks mode, and no-quirks mode
The algorithm to clone a Node, and the concept of cloning steps used by that algorithm
The concept of base URL change steps and the definition of what happens when an element is affected by a base URL change
The concept of an element's unique identifier (ID)
The concept of an element's classes
The term supported tokens
The concept of a DOM range, and the terms start, end, and boundary point as applied to ranges.
The create an element algorithm
The element interface concept
The concepts of custom element state, and of defined and custom elements
An element's namespace, namespace prefix, local name, custom element definition, and is value
MutationObserver interface and mutation observers in general

The following features are defined in UI Events: [UIEVENTS]

The MouseEvent interface
The MouseEvent interface's relatedTarget attribute
MouseEventInit dictionary type
The FocusEvent interface
The FocusEvent interface's relatedTarget attribute
The UIEvent interface
The UIEvent interface's view attribute
auxclick event
click event
dblclick event
mousedown event
mouseenter event
mouseleave event
mousemove event
mouseout event
mouseover event
mouseup event
wheel event
keydown event
keypress event
keyup event

The following features are defined in Touch Events: [TOUCH]

Touch interface
Touch point concept
touchend event

The following features are defined in Pointer Events: [POINTEREVENTS]

pointerup event

This specification sometimes uses the term name to refer to the event's type; as in, "an event named click" or "if the event name is keypress". The terms "name" and "type" for events are synonymous.

The following features are defined in DOM Parsing and Serialization: [DOMPARSING]

The following features are defined in Selection API: [SELECTION]

User agents are encouraged to implement the features described in execCommand. [EXECCOMMAND]

The following parts of Fullscreen API are referenced from this specification, in part to define the rendering of dialog elements, and also to define how the Fullscreen API interacts with HTML: [FULLSCREEN]

top layer (an ordered set) and its add operation
requestFullscreen()
run the fullscreen steps

High Resolution Time provides the current high resolution time and the DOMHighResTimeStamp typedef. [HRT]

File API

This specification uses the following features defined in File API: [FILEAPI]

The Blob interface and its type attribute
The File interface and its name and lastModified attributes
The FileList interface
The concept of a Blob's snapshot state
The concept of read errors
Blob URL Store

Indexed Database API

This specification uses cleanup Indexed Database transactions defined by Indexed Database API. [INDEXEDDB]

Media Source Extensions

The following terms are defined in Media Source Extensions: [MEDIASOURCE]

MediaSource interface
detaching from a media element

Media Capture and Streams

The following terms are defined in Media Capture and Streams: [MEDIASTREAM]

MediaStream interface

Reporting

The following terms are defined in Reporting: [REPORTING]

XMLHttpRequest

The following features and terms are defined in XMLHttpRequest: [XHR]

The XMLHttpRequest interface, and its responseXML attribute
The ProgressEvent interface, and its lengthComputable, loaded, and

HTML

Living Standard — Last Updated 21 May 2022


1 Introduction

1.1 Where does this specification fit?