Consider atomizing header names in HttpHeaders and elsewhere #63047

@geoffkizer

This is inspired by the atomization of XName and XNamespace in LINQ to XML; see https://docs.microsoft.com/en-us/dotnet/standard/linq/atomized-xname-xnamespace-objects

Currently we represent header names internally using the HeaderDescriptor struct, which holds either a KnownHeader object or a raw string for headers that are not known a priori. We use HeaderDescriptor as the key to the header table in HttpHeaders. Whenever a plain string header name is presented to HttpHeaders, we look it up in the known headers table. If we find it, we construct a HeaderDescriptor with that KnownHeader instance; if not, we construct a HeaderDescriptor with the raw string. Headers of the second kind are called "custom headers" in the code.
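To make the two cases concrete, here is a minimal sketch of the model described above. The types below are simplified, hypothetical stand-ins written for illustration; they are not the actual System.Net.Http internals, and the dictionary-based known headers table and the `HeaderLookup.FromName` helper are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the internal KnownHeader class.
public sealed class KnownHeader
{
    public string Name { get; }
    public KnownHeader(string name) => Name = name;
}

// Hypothetical stand-in for HeaderDescriptor: wraps either a KnownHeader or a raw string.
public readonly struct HeaderDescriptor : IEquatable<HeaderDescriptor>
{
    private readonly object _value; // KnownHeader, or string for a custom header

    public HeaderDescriptor(KnownHeader known) => _value = known;
    public HeaderDescriptor(string customName) => _value = customName;

    public bool IsKnown => _value is KnownHeader;
    public string Name => _value is KnownHeader k ? k.Name : (string)_value;

    // Known headers compare by reference; custom headers need a string comparison.
    public bool Equals(HeaderDescriptor other) =>
        _value is KnownHeader
            ? ReferenceEquals(_value, other._value)
            : other._value is string s &&
              string.Equals((string)_value, s, StringComparison.OrdinalIgnoreCase);

    public override bool Equals(object? obj) => obj is HeaderDescriptor d && Equals(d);
    public override int GetHashCode() => StringComparer.OrdinalIgnoreCase.GetHashCode(Name);
}

public static class HeaderLookup
{
    // Assumed shape of the known headers table; only two entries for brevity.
    private static readonly Dictionary<string, KnownHeader> s_known =
        new Dictionary<string, KnownHeader>(StringComparer.OrdinalIgnoreCase)
        {
            ["Accept"] = new KnownHeader("Accept"),
            ["Content-Encoding"] = new KnownHeader("Content-Encoding"),
        };

    // Known name: wrap the shared KnownHeader. Otherwise: wrap the raw string.
    public static HeaderDescriptor FromName(string name) =>
        s_known.TryGetValue(name, out var known)
            ? new HeaderDescriptor(known)
            : new HeaderDescriptor(name);
}
```

In this sketch, `HeaderLookup.FromName("content-encoding")` yields a descriptor backed by the shared KnownHeader, while `HeaderLookup.FromName("X-Trace-Id")` yields one backed by the caller's string, so every comparison against it falls into the string-equality branch.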

KnownHeader provides the following benefits over a custom header:
(1) We can optimize reading headers from the wire by mapping them directly to a KnownHeader and avoiding string allocation.
(2) We can optimize writing headers to the wire by pre-generating the appropriate wire representation.
(3) Equality of KnownHeaders is simply object reference equality, and so lookups in the header table are cheap, compared to custom headers which must use string equality.
(4) For strongly typed headers, we associate relevant metadata for parsing, etc. (HttpHeaderParser, HttpHeaderType).
(5) We can optimize known values for headers as well, e.g. "deflate" and "gzip" for Content-Encoding.

Note that not all KnownHeaders provide custom behavior for (4) and (5); of the 94 defined KnownHeaders, ~26 do not do anything here. These still benefit from (1)-(3) above.

We could achieve benefits (1)-(3) above by atomizing custom header names. When we encounter a header name that's not in our KnownHeaders table, we would fall back to a secondary lookup in a custom headers table that looks up or creates a single, shared KnownHeader for this custom header. (For clarity we'd probably want to rename KnownHeader to something like HeaderDefinition, or even just HeaderDescriptor, since the existing HeaderDescriptor struct would no longer be needed in this model.) Entries in the table would be held through weak references so that they can be collected when no longer in use.
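A rough sketch of what such a custom headers table might look like follows. Everything here is an assumption made for illustration: the `HeaderAtomTable` class, its `GetOrAdd` method, the `WireBytes` property, and the choice of `ConcurrentDictionary` are hypothetical, and `HeaderDefinition` is just the rename floated above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;

// Hypothetical per-name definition (the renamed KnownHeader).
public sealed class HeaderDefinition
{
    public string Name { get; }

    // Pre-generated "Name: " bytes, illustrating benefit (2).
    public byte[] WireBytes { get; }

    public HeaderDefinition(string name)
    {
        Name = name;
        WireBytes = Encoding.ASCII.GetBytes(name + ": ");
    }
}

// Hypothetical atom table: at most one live HeaderDefinition per custom header name.
public static class HeaderAtomTable
{
    private static readonly ConcurrentDictionary<string, WeakReference<HeaderDefinition>> s_atoms =
        new ConcurrentDictionary<string, WeakReference<HeaderDefinition>>(StringComparer.OrdinalIgnoreCase);

    public static HeaderDefinition GetOrAdd(string name)
    {
        while (true)
        {
            if (s_atoms.TryGetValue(name, out var weak))
            {
                // Fast path: the shared instance is still alive.
                if (weak.TryGetTarget(out var existing))
                    return existing;

                // The old instance was collected; swap in a fresh one.
                var replacement = new HeaderDefinition(name);
                if (s_atoms.TryUpdate(name, new WeakReference<HeaderDefinition>(replacement), weak))
                    return replacement;
            }
            else
            {
                var created = new HeaderDefinition(name);
                if (s_atoms.TryAdd(name, new WeakReference<HeaderDefinition>(created)))
                    return created;
            }
            // Another thread won the race; loop and pick up its instance.
        }
    }
}
```

With this shape, two calls such as `HeaderAtomTable.GetOrAdd("X-Trace-Id")` and `HeaderAtomTable.GetOrAdd("x-trace-id")` return the same object for as long as a caller keeps it alive, so header table lookups can compare by reference. The sketch never removes dead dictionary entries; a real implementation would also need some way to prune them.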

This would also allow us to simplify some of the logic in HeaderDescriptor -- this essentially goes away since we now always have the equivalent of a KnownHeader for every header name in use. And we could remove the ~26 entries in the known headers table that don't have additional metadata; these effectively just become custom headers, but still get the same benefits of (1)-(3) via atomization.

The tradeoff here is that the first-time cost of processing a custom header value is somewhat higher, since we need to create the KnownHeader for it at that point. But that feels like a reasonable tradeoff; most apps probably encounter each custom header name more than once.

This would all be internal and would require no new public API, but we could also consider making the new HeaderDescriptor public in the future, to allow users to further optimize custom header handling.

Labels: area-System.Net.Http, enhancement, tenet-performance