Skip to content

Comments

Migrate to zlib-rs#9184

Merged
charliermarsh merged 1 commit intomainfrom
charlie/zlib-rs
Nov 18, 2024
Merged

Migrate to zlib-rs#9184
charliermarsh merged 1 commit intomainfrom
charlie/zlib-rs

Conversation

@charliermarsh
Copy link
Member

Summary

I've tried this a few times; just curious if it passes tests.

@charliermarsh
Copy link
Member Author

I'm somewhat inclined to make this change? It doesn't look like zlib-rs is widely used yet, but rustls does use it for certificate decompression. And in local benchmarking, I can't find a clear difference between zlib-rs and zlib-ng... Like, sometimes one is faster, sometimes the other is faster (though they're generally both faster than the default flate2 backend). It's really hard to benchmark this, since it only applies to remote wheels that we unzip as we stream, so network IO is also mixed in.

@charliermarsh
Copy link
Member Author

Ok, running against a local HTTP server.

First run:

Benchmark 1: ../../target/release/main (install-cold)
  Time (mean ± σ):      1.880 s ±  0.075 s    [User: 1.004 s, System: 0.803 s]
  Range (min … max):    1.806 s …  2.015 s    10 runs

Benchmark 2: ../../target/release/zlib-rs (install-cold)
  Time (mean ± σ):      1.816 s ±  0.176 s    [User: 0.885 s, System: 0.842 s]
  Range (min … max):    1.727 s …  2.310 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs.

Benchmark 3: ../../target/release/default (install-cold)
  Time (mean ± σ):      2.035 s ±  0.171 s    [User: 1.100 s, System: 0.862 s]
  Range (min … max):    1.949 s …  2.517 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs.

Summary
  ../../target/release/zlib-rs (install-cold) ran
    1.04 ± 0.11 times faster than ../../target/release/main (install-cold)
    1.12 ± 0.14 times faster than ../../target/release/default (install-cold)

Second run:

Benchmark 1: ../../target/release/main (install-cold)
  Time (mean ± σ):      1.869 s ±  0.058 s    [User: 0.996 s, System: 0.796 s]
  Range (min … max):    1.816 s …  1.982 s    10 runs

Benchmark 2: ../../target/release/zlib-rs (install-cold)
  Time (mean ± σ):      1.887 s ±  0.427 s    [User: 0.889 s, System: 0.919 s]
  Range (min … max):    1.718 s …  3.097 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs.

Benchmark 3: ../../target/release/default (install-cold)
  Time (mean ± σ):      2.238 s ±  0.541 s    [User: 1.114 s, System: 1.036 s]
  Range (min … max):    1.945 s …  3.277 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs.

Summary
  ../../target/release/main (install-cold) ran
    1.01 ± 0.23 times faster than ../../target/release/zlib-rs (install-cold)
    1.20 ± 0.29 times faster than ../../target/release/default (install-cold)

main is libz-ng, zlib-rs is this branch, and default is the default flate2 backend.

So, anyway, this seems like a good change? And we get to remove a dependency that has caused a lot of trouble.

@charliermarsh charliermarsh added the internal A refactor or improvement that is not user-facing label Nov 18, 2024
@charliermarsh charliermarsh marked this pull request as ready for review November 18, 2024 04:22
@charliermarsh
Copy link
Member Author

I think the biggest counterargument is just that it isn't widely adopted right now.

@konstin
Copy link
Member

konstin commented Nov 18, 2024

I've confirmed that this removes the cmake build dep on windows

@charliermarsh
Copy link
Member Author

That's a big win.

@zanieb
Copy link
Member

zanieb commented Nov 18, 2024

I'm for it, easy to revert if we need to.

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@charliermarsh charliermarsh merged commit 0aaa6ba into main Nov 18, 2024
@charliermarsh charliermarsh deleted the charlie/zlib-rs branch November 18, 2024 15:45
charliermarsh added a commit that referenced this pull request Mar 3, 2025
## Summary

I believe `zlib-rs` is now a better choice on ARM and x86, so I'm just
going to assume it's a better choice everywhere. It's much easier to
build (removes our CMake dependency), and in my benchmarking, it's
substantially faster on ARM and faster or ~exactly even on my x86
Windows machine.

We migrated to `zlib-rs` once before (#9184); however, I later reverted
it as I learned that they were only doing compile-time feature
detection, and so `zlib-rs` was meaningfully slower on x86. They now
perform runtime feature detection:
https://trifectatech.org/blog/zlib-rs-is-faster-than-c/.

To benchmark, I wrote a script to create a local Simple API-compliant
registry (see the commit history) for a single package. Then I ran the
`install-cold` benchmark against that registry to install NumPy.

On ARM:

```
❯ uv run resolver --uv-pip-path ../../zlib-ng --uv-pip-path ../../zlib-rs \
        --benchmark install-cold \
        req.txt --warmup 10 --min-runs 30
Benchmark 1: ../../zlib-ng (install-cold)
  Time (mean ± σ):     165.7 ms ±  34.7 ms    [User: 64.4 ms, System: 93.2 ms]
  Range (min … max):   141.8 ms … 293.2 ms    30 runs

Benchmark 2: ../../zlib-rs (install-cold)
  Time (mean ± σ):     150.9 ms ±  16.2 ms    [User: 57.4 ms, System: 86.4 ms]
  Range (min … max):   135.3 ms … 202.4 ms    30 runs

Summary
  ../../zlib-rs (install-cold) ran
    1.10 ± 0.26 times faster than ../../zlib-ng (install-cold)
```

I benchmarked this about 100 times on my Windows machine and found it
difficult to conclude anything beyond "They're nearly the same". Here's
an example:

```
PS C:\Users\crmar\workspace\puffin> hyperfine --prepare "uv venv" "zlib-rs.exe pip sync ./scripts/benchmark/req.txt" "zlib-ng.exe pip sync ./scripts/benchmark/req.txt" "zlib-rs.exe pip sync ./scripts/benchmark/req.txt" "zlib-ng.exe pip sync ./scripts/benchmark/req.txt" --runs 10 --warmup 5
Benchmark 1: zlib-rs.exe pip sync ./scripts/benchmark/req.txt
  Time (mean ± σ):     240.6 ms ±  10.8 ms    [User: 6.1 ms, System: 92.2 ms]
  Range (min … max):   229.4 ms … 267.9 ms    10 runs

Benchmark 2: zlib-ng.exe pip sync ./scripts/benchmark/req.txt
  Time (mean ± σ):     241.3 ms ±   6.2 ms    [User: 7.7 ms, System: 90.6 ms]
  Range (min … max):   233.9 ms … 252.1 ms    10 runs

Benchmark 3: zlib-rs.exe pip sync ./scripts/benchmark/req.txt
  Time (mean ± σ):     242.8 ms ±   7.7 ms    [User: 6.2 ms, System: 23.4 ms]
  Range (min … max):   236.1 ms … 262.8 ms    10 runs

Benchmark 4: zlib-ng.exe pip sync ./scripts/benchmark/req.txt
  Time (mean ± σ):     245.9 ms ±   5.7 ms    [User: 1.5 ms, System: 59.4 ms]
  Range (min … max):   240.9 ms … 257.3 ms    10 runs

Summary
  zlib-rs.exe pip sync ./scripts/benchmark/req.txt ran
    1.00 ± 0.05 times faster than zlib-ng.exe pip sync ./scripts/benchmark/req.txt
    1.01 ± 0.06 times faster than zlib-rs.exe pip sync ./scripts/benchmark/req.txt
    1.02 ± 0.05 times faster than zlib-ng.exe pip sync ./scripts/benchmark/req.txt
```

Closes #11885.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal A refactor or improvement that is not user-facing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants