Support non-allocating view on string properties values of JSON in System.Text.Json.Utf8JsonReader #54410

@N0D4N

Background and Motivation

Currently there is no way to get a view of a JSON string property value without allocating a string, except when the string property is in fact a number, a DateTime, or anything else that System.Buffers.Text.Utf8Parser supports.
However, many converters, including some inside System.Text.Json, need the string representation of a string property only to parse it and never use it afterwards. Examples of such converters are:

Outside the library, a non-allocating view of string property values could be used to create a custom StringConverter backed by, for example, a custom StringPool that operates on a small set of strings not known at compile time.
My proposal is to add methods to Utf8JsonReader that accept a buffer of chars into which the value of the string property is written.
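
For context, the existing allocation-free path mentioned above (values that Utf8Parser can parse) might look roughly like the following. This is a minimal sketch, not code from System.Text.Json itself: the JSON payload and the property name `count` are made up for illustration, and it assumes the value contains no escape sequences.

```csharp
using System;
using System.Buffers.Text;
using System.Text;
using System.Text.Json;

// Hypothetical payload: a number stored as a JSON string.
byte[] json = Encoding.UTF8.GetBytes("{\"count\":\"42\"}");
var reader = new Utf8JsonReader(json);
int result = 0;

while (reader.Read())
{
    // Only string values are of interest; property names have their own token type.
    if (reader.TokenType == JsonTokenType.String && !reader.ValueIsEscaped)
    {
        // ValueSpan is the raw UTF-8 payload without the surrounding quotes,
        // so it can be parsed directly without materializing a string.
        ReadOnlySpan<byte> raw = reader.ValueSpan;
        if (Utf8Parser.TryParse(raw, out int value, out int consumed) && consumed == raw.Length)
        {
            result = value;
        }
    }
}

Console.WriteLine(result);
```

Anything that Utf8Parser cannot handle, such as an arbitrary text value, currently has no equivalent route and goes through GetString().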

Proposed API

namespace System.Text.Json
{
    public ref partial struct Utf8JsonReader
    {
        /* Existing APIs */
        public ReadOnlySpan<byte> ValueSpan { get; }
        public ReadOnlySequence<byte> ValueSequence { get; }

        public bool ValueIsEscaped { get; } // Whether the JSON string contains escaped characters
        public bool HasValueSequence { get; } // The string can either be stored in a span or a ReadOnlySequence
        
        public string? GetString(); // How we currently decode JSON strings

        /* Proposed new APIs */
        public void GetString(scoped Span<byte> utf8Destination, out int bytesWritten);
        public void GetString(scoped Span<char> destination, out int charsWritten);
    }

    public partial class JsonEncodedText
    {
        public static void Unescape(ReadOnlySpan<byte> utf8Value, Span<byte> utf8Destination, out int bytesWritten);
    }
}

Usage Examples

Get an allocation-free view of the unescaped UTF-8 string

Span<byte> buffer = stackalloc byte[SomeUpperBound];
reader.GetString(buffer, out int bytesWritten); // handles both ValueSpan and ValueSequence representations, 
                                                // throws if source buffer length exceeds that of the target buffer.
ReadOnlySpan<byte> unescapedUtf8Value = buffer.Slice(0, bytesWritten);

Handling of ValueSpan representations only:

Debug.Assert(!reader.HasValueSequence);

ReadOnlySpan<byte> unescapedBuffer = stackalloc byte[0];
if (reader.ValueIsEscaped)
{
      Span<byte> buffer = stackalloc byte[SomeUpperBound];
      JsonEncodedText.Unescape(reader.ValueSpan, buffer, out int bytesWritten);
      unescapedBuffer = buffer.Slice(0, bytesWritten);
}
else
{
      // avoid copying to an intermediate buffer if unescaping is not needed
      unescapedBuffer = reader.ValueSpan;
}

Copying to char buffers

char[] buffer = ArrayPool<char>.Shared.Rent(maxLength);

// buffer length needs to be at least as long as reader.ValueSpan/ValueSequence to succeed
reader.GetString(buffer, out int charsWritten);

// consume buffer.AsSpan(0, charsWritten), then return the buffer to the pool
ArrayPool<char>.Shared.Return(buffer);

Alternative Designs

Can't think of any.

Risks

The name GetChars could be confusing for some users; perhaps there is a better-fitting name for such a method?

Notes

What should happen when the provided buffer is not long enough? Should an exception be thrown, or should the method write as much as fits into the buffer and return once its capacity is exhausted?
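
Both behaviours already exist elsewhere in the BCL, which may help frame the question. The sketch below is only an illustration using existing transcoding APIs (Utf8.ToUtf16 and Encoding.GetChars), not the proposed reader methods; the input text and the buffer size are arbitrary.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Unicode;

byte[] utf8 = Encoding.UTF8.GetBytes("hello world");
Span<char> small = stackalloc char[4]; // deliberately too small for the input

// Behaviour 1: write as much as fits and report the shortfall through a status code.
OperationStatus status = Utf8.ToUtf16(utf8, small, out int bytesRead, out int charsWritten);
Console.WriteLine($"{status}: read {bytesRead} bytes, wrote {charsWritten} chars");

// Behaviour 2: throw when the destination cannot hold the whole result.
bool threw = false;
try
{
    Encoding.UTF8.GetChars(utf8, small);
}
catch (ArgumentException)
{
    threw = true;
}
Console.WriteLine($"Encoding.GetChars threw: {threw}");
```

The status-code style lets a caller grow the buffer and retry, while the throwing style keeps the signature simple for callers that can compute an upper bound in advance.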

Metadata

Labels

Cost:M (Work that requires one engineer up to 2 weeks)
Priority:2 (Work that is important, but not critical for the release)
Team:Libraries
User Story (A single user-facing feature. Can be grouped under an epic.)
api-approved (API was approved in API review, it can be implemented)
area-System.Text.Json