Meta: Header handling #36801

@sebsebmc

Servo uses a combination of http::headers and hyper::headers, which is sometimes confusing, causes compatibility issues, or doesn't meet the needs of modern browser interop. As additional web features are developed, Servo may need more agility in defining, parsing, and encoding headers. Support for additional headers is currently spread across multiple locations.

hyper::headers sees very few updates, and Servo increasingly runs into headers that are unsupported, missing features, or not fully spec compliant.

http::headers strongly enforces that HeaderValue only contains visible ASCII, which may be too restrictive (see the spec notes below).

Current areas for improvement:

  • Quality values: these account for the majority of "disabled" headers in the hyper headers crate. Servo has shared/net/quality.rs, which includes some types for dealing with quality values, but they do not appear to be used anywhere. Servo currently uses static quality string values everywhere.
  • net/fetch/headers.rs currently defines the various sec-fetch-* headers but does not support parsing their values. The main benefit of parsing them would be type safety, but I don't see these values being checked anywhere currently.
  • script/dom/headers.rs implements the DOM Header type but includes some parsing and encoding code that could be shared with other parts of Servo.
  • script/dom/xmlhttprequest.rs implements some parsing to test for header field values and also implements header value whitespace removal.
  • shared/net/fetch/headers.rs implements various header parsing bits
  • script/dom/str.rs contains tests for valid header token values
  • Content-Disposition is incapable of parsing or encoding anything beyond the content disposition type. This makes dealing with multipart/form-data particularly painful.
  • Cross-Origin-Resource-Policy could gain type safety from implementing the Header type for it.
  • There are a few TODO comments in the code base for headers that currently do not have their own types.
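
To make the quality-value item concrete, here is a minimal sketch of a typed quality value using only the standard library. The names `Quality` and `parse_quality_item` are hypothetical and are not taken from shared/net/quality.rs. The grammar follows the `qvalue` rule in RFC 9110. As a simplification, the item parser only recognizes `q` when it is the last parameter and treats everything else as part of the value.

```rust
use std::fmt;

/// Hypothetical type: a quality value stored as thousandths (0..=1000),
/// so "0.8" is stored as 800.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Quality(u16);

impl Quality {
    /// Parses the RFC 9110 `qvalue` grammar: "0" or "1", optionally followed
    /// by "." and up to three digits (only zeros are allowed after "1").
    pub fn parse(s: &str) -> Option<Quality> {
        let (int, frac) = match s.split_once('.') {
            Some((int, frac)) => (int, frac),
            None => (s, ""),
        };
        if frac.len() > 3 || !frac.bytes().all(|b| b.is_ascii_digit()) {
            return None;
        }
        // Right-pad the fraction to three digits: "8" -> 800, "25" -> 250.
        let mut thousandths: u16 = 0;
        for i in 0..3 {
            let digit = frac.as_bytes().get(i).map_or(0, |b| (b - b'0') as u16);
            thousandths = thousandths * 10 + digit;
        }
        match (int, thousandths) {
            ("0", t) => Some(Quality(t)),
            ("1", 0) => Some(Quality(1000)),
            _ => None,
        }
    }
}

impl fmt::Display for Quality {
    /// Encodes with the shortest valid form: "1", "0", or "0.xyz" without
    /// trailing zeros.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self.0 {
            1000 => f.write_str("1"),
            0 => f.write_str("0"),
            t => {
                let digits = format!("{:03}", t);
                write!(f, "0.{}", digits.trim_end_matches('0'))
            }
        }
    }
}

/// Optional whitespace as used around list members and parameters.
fn is_ows(c: char) -> bool {
    c == ' ' || c == '\t'
}

/// Parses one list member such as "text/html;q=0.8" into (value, quality).
/// A member without a trailing `q` parameter gets the default quality of 1.
pub fn parse_quality_item(item: &str) -> Option<(&str, Quality)> {
    let item = item.trim_matches(is_ows);
    if item.is_empty() {
        return None;
    }
    if let Some((value, param)) = item.rsplit_once(';') {
        let param = param.trim_matches(is_ows);
        let q = param.strip_prefix("q=").or_else(|| param.strip_prefix("Q="));
        if let Some(q) = q {
            return Some((value.trim_end_matches(is_ows), Quality::parse(q)?));
        }
    }
    Some((item, Quality(1000)))
}
```

With a type like this, call sites could build an Accept-style value from `(value, Quality)` pairs instead of hard-coding strings such as "q=0.8", and malformed values such as "1.5" would be rejected at parse time.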

Issues with various libraries, specs, and the real world

  • Access-Control-Request-Headers is unfortunately an outlier in how it joins values, so we currently have to join these values manually.
  • The specs seem to disagree on whether non-ASCII or even invalid UTF-8 is allowed in headers. Some WPT tests exist that expect non-ASCII values to be retained/reflected.
  • The specs indicate that there should be no whitespace ("not even 'bad' whitespace") around = in field parameters, but testing showed that Chrome and Firefox both accept whitespace on at least one side of = and do not handle it the same way. (Chrome allows whitespace on either side; Firefox seems to allow it only after the =.)
  • The Fetch spec defines a "header value" extremely broadly due to compatibility concerns. (Only disallows CR, LF, and NUL)
  • The Fetch spec allows header value lists to start with a comma. hyper parsing does not seem to allow this, and WPT tests for Access-Control-Allow-Headers (ACAH) and Access-Control-Allow-Methods (ACAM) fail as a result.
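
The last three items can be sketched with standard-library Rust. This is only an illustration under stated assumptions: the function and type names (`is_fetch_header_value`, `split_list`, `EqualsWhitespace`, `parse_param`) are hypothetical, the list splitter ignores quoted strings, and the two lenient modes only mirror the Chrome and Firefox behavior as reported above rather than any verified implementation detail.

```rust
/// HTTP tab or space, as the Fetch spec uses the term.
fn is_http_tab_or_space(b: u8) -> bool {
    b == b'\t' || b == b' '
}

/// Returns true if `value` is a Fetch "header value": no leading or trailing
/// HTTP tab or space, and no NUL, CR, or LF anywhere. Non-ASCII bytes pass.
pub fn is_fetch_header_value(value: &[u8]) -> bool {
    let starts_ok = value.first().map_or(true, |&b| !is_http_tab_or_space(b));
    let ends_ok = value.last().map_or(true, |&b| !is_http_tab_or_space(b));
    let no_forbidden = !value.iter().any(|&b| matches!(b, 0x00 | b'\r' | b'\n'));
    starts_ok && ends_ok && no_forbidden
}

/// Splits a comma-separated field value and skips empty members, so a value
/// that starts with a comma still parses. Quoted strings are NOT handled.
pub fn split_list(value: &str) -> Vec<&str> {
    value
        .split(',')
        .map(|member| member.trim_matches(|c| c == ' ' || c == '\t'))
        .filter(|member| !member.is_empty())
        .collect()
}

/// Hypothetical policy knob for whitespace around "=" in a parameter.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EqualsWhitespace {
    /// No whitespace on either side, as the specs indicate.
    Strict,
    /// Whitespace only after "=" (the Firefox behavior reported above).
    AfterOnly,
    /// Whitespace on either side (the Chrome behavior reported above).
    EitherSide,
}

/// Splits "name=value" at the first "=", applying the chosen whitespace policy.
pub fn parse_param(param: &str, mode: EqualsWhitespace) -> Option<(&str, &str)> {
    let is_ws = |c: char| c == ' ' || c == '\t';
    let (name, value) = param.split_once('=')?;
    let name_trimmed = name.trim_end_matches(is_ws);
    let value_trimmed = value.trim_start_matches(is_ws);
    let ws_before = name_trimmed.len() != name.len();
    let ws_after = value_trimmed.len() != value.len();
    let allowed = match mode {
        EqualsWhitespace::Strict => !ws_before && !ws_after,
        EqualsWhitespace::AfterOnly => !ws_before,
        EqualsWhitespace::EitherSide => true,
    };
    if allowed && !name_trimmed.is_empty() {
        Some((name_trimmed, value_trimmed))
    } else {
        None
    }
}
```

Making the whitespace policy an explicit parameter keeps the open question about spec strictness versus browser behavior visible at each call site instead of burying it in the parser.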

Proposed work:

  • Centralize helpers for defining, parsing, and encoding headers
  • Migrate currently untyped header usage to typed headers where possible
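
As a rough sketch of what a centralized helper could look like, the block below defines a hypothetical `TypedHeader` trait and implements it for Cross-Origin-Resource-Policy. The trait, the `get_typed` lookup function, and the `(String, Vec<u8>)` header list are assumptions made for illustration; they are a standard-library stand-in, not the existing Header trait or Servo's header map. The three accepted values are the ones the Fetch spec defines for this header, matched case-sensitively.

```rust
/// Hypothetical trait: one place that ties a header name to its parser and
/// encoder. Values are bytes because header values need not be valid UTF-8.
pub trait TypedHeader: Sized {
    /// Lowercase field name.
    const NAME: &'static str;
    /// Returns None when the value is not recognized.
    fn parse(value: &[u8]) -> Option<Self>;
    /// Produces the bytes to put on the wire.
    fn encode(&self) -> Vec<u8>;
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CrossOriginResourcePolicy {
    SameOrigin,
    SameSite,
    CrossOrigin,
}

impl TypedHeader for CrossOriginResourcePolicy {
    const NAME: &'static str = "cross-origin-resource-policy";

    fn parse(value: &[u8]) -> Option<Self> {
        // Exact, case-sensitive match on the whole value.
        match value {
            b"same-origin" => Some(Self::SameOrigin),
            b"same-site" => Some(Self::SameSite),
            b"cross-origin" => Some(Self::CrossOrigin),
            _ => None,
        }
    }

    fn encode(&self) -> Vec<u8> {
        let text: &[u8] = match self {
            Self::SameOrigin => b"same-origin",
            Self::SameSite => b"same-site",
            Self::CrossOrigin => b"cross-origin",
        };
        text.to_vec()
    }
}

/// Looks up the first header named `H::NAME` (case-insensitively) in a plain
/// list of (name, value) pairs and parses it into the typed form.
pub fn get_typed<H: TypedHeader>(headers: &[(String, Vec<u8>)]) -> Option<H> {
    headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case(H::NAME))
        .and_then(|(_, value)| H::parse(value))
}
```

A caller would then write something like `get_typed::<CrossOriginResourcePolicy>(&headers)` and match on the enum, rather than comparing raw strings at each use.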

Open questions:

  • How strictly should Servo follow the specs (and which ones), and how closely should it follow the behavior of other browsers?

Relevant specs:

  • RFC9110 which obsoletes RFC7230
  • RFC2616
  • Structured field proposals for defining new headers: RFC9651 which obsoletes RFC8941
  • There are also many RFCs and W3C specs for specific headers.


Labels: A-network, B-meta (This issue tracks the status of multiple, related pieces of work)
