Skip to content

Conversation

@teo-tsirpanis
Copy link
Contributor

Instead of buffering the compressed bytes to a MemoryStream, we directly write them to the BlobBuilder. This is achieved by defining a write-only stream type that wraps a BlobBuilder.

Instead of buffering the compressed bytes to a `MemoryStream`, we directly write them to the `BlobBuilder`. This is achieved by defining a write-only stream type that wraps a `BlobBuilder`.
Copilot AI review requested due to automatic review settings March 23, 2025 02:55
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 23, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR streamlines the compression of embedded PDB data by writing compressed bytes directly to a BlobBuilder instead of buffering them in a MemoryStream.

  • Removed the MemoryStream buffering and its associated WriteBytes call.
  • Introduced a custom write-only stream (BlobBuilderStream) to directly pipe data into the BlobBuilder.
Comments suppressed due to low confidence (1)

src/libraries/System.Reflection.Metadata/src/System/Reflection/PortableExecutable/DebugDirectory/DebugDirectoryBuilder.EmbeddedPortablePdb.cs:50

  • Consider specifying 'leaveOpen: true' when constructing the DeflateStream to ensure that disposing it does not inadvertently affect the BlobBuilderStream if the builder is used after compression. Alternatively, explicitly override Dispose in BlobBuilderStream to prevent any undesired disposal behavior.
using (var deflate = new DeflateStream(new BlobBuilderStream(builder), CompressionLevel.Optimal))

@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-reflection-metadata
See info in area-owners.md if you want to be subscribed.

@jkotas jkotas added the tenet-performance Performance related issue label Mar 23, 2025
@jkotas
Copy link
Member

jkotas commented Mar 23, 2025

Could you please run a small micro-benchmark that validates the change has the expected effect?

@teo-tsirpanis
Copy link
Contributor Author

I got these results when compressing System.Private.CoreLib.pdb (2.12 MB) with and without an intermediate MemoryStream:

BenchmarkDotNet v0.14.0, Windows 10 (10.0.19045.5608/22H2/2022Update)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK 9.0.201
  [Host]     : .NET 9.0.3 (9.0.325.11113), X64 RyuJIT AVX2
  DefaultJob : .NET 9.0.3 (9.0.325.11113), X64 RyuJIT AVX2
Method Mean Error StdDev Gen0 Gen1 Gen2 Allocated
Buffered 54.90 ms 1.037 ms 1.065 ms 444.4444 444.4444 444.4444 2946.61 KB
Unbuffered 55.36 ms 1.301 ms 3.835 ms 222.2222 - - 919.46 KB
Benchmark program
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.IO.Compression;
using System.Reflection.Metadata;

BenchmarkRunner.Run<PdbWriterBenchmark>();

[MemoryDiagnoser]
public class PdbWriterBenchmark
{
    public const int BlockSize = 16 << 10;

    private readonly BlobBuilder _data = new();

    [GlobalSetup]
    public void GlobalSetup()
    {
        using var f = File.OpenRead(@"C:\Users\teo\code\ephemeral\PdbWriterBenchmark\System.Private.CoreLib.pdb");
        _data.TryWriteBytes(f, (int)f.Length);
    }

    [Benchmark]
    public int Buffered()
    {
        var compressed = new BlobBuilder();
        var buffer = new MemoryStream();
        using (var compressor = new DeflateStream(buffer, CompressionLevel.Optimal, leaveOpen: true))
        {
            _data.WriteContentTo(compressor);
        }
        compressed.WriteBytes(buffer.GetBuffer(), 0, (int)buffer.Length);
        return compressed.Count;
    }

    [Benchmark]
    public int Unbuffered()
    {
        var compressed = new BlobBuilder();
        using (var compressor = new DeflateStream(new BlobBuilderStream(compressed), CompressionLevel.Optimal, leaveOpen: true))
        {
            _data.WriteContentTo(compressor);
        }
        return compressed.Count;
    }
}

internal sealed class BlobBuilderStream(BlobBuilder builder) : Stream
{
    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position { get => throw new NotSupportedException(); set => throw new NotSupportedException(); }
    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => builder.WriteBytes(buffer, offset, count);
    public override unsafe void Write(ReadOnlySpan<byte> buffer)
    {
        fixed (byte* ptr = buffer)
        {
            builder.WriteBytes(ptr, buffer.Length);
        }
    }
}

It looks like we save LOH allocations. Once #100418 lands, user code can customize chunking and use pooled memory, achieving greater savings.

@jkotas
Copy link
Member

jkotas commented Mar 23, 2025

/ba-g wasm timeouts

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jkotas jkotas merged commit db03493 into dotnet:main Mar 23, 2025
81 of 84 checks passed
@teo-tsirpanis teo-tsirpanis deleted the pdb-compress-opt branch March 23, 2025 21:01
@github-actions github-actions bot locked and limited conversation to collaborators Apr 23, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Reflection.Metadata community-contribution Indicates that the PR has been added by a community member tenet-performance Performance related issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants