fix: remove spread opts and toString of integrity#71
Conversation
$ npm view ssri engines
{ node: '^14.17.0 || ^16.13.0 || >=18.0.0' }we can use opts?.algorithms et al |
| return | ||
| } | ||
| if (strict && !SPEC_ALGORITHMS.some(a => a === match[1])) { | ||
| if (strict && SPEC_ALGORITHMS[match[1]] !== true) { |
There was a problem hiding this comment.
We can keep SPEC_ALGORITHMS as an array and use .includes
There was a problem hiding this comment.
Using includes is little bit slower than just the property index, do you have some reason to keep it as an array?
There was a problem hiding this comment.
It's a balance against performance and developer experience. I think "a little bit" slower is ok here given that the vast majority of an npm install is disk and io bound.
There was a problem hiding this comment.
I did a mistake, using includes is faster if we don't know the value:
const Benchmark = require('benchmark');
const suite = new Benchmark.Suite();
const SPEC_ALGORITHMS = {
sha256: true,
sha384: true,
sha512: true,
};
const SPEC_ALGORITHMS_ARRAY = Object.keys(SPEC_ALGORITHMS);
const randomAndUnkown = [...SPEC_ALGORITHMS_ARRAY, 'test'];
suite
.add('includes', function () {
const random = randomAndUnkown[Math.floor(Math.random() * randomAndUnkown.length)];
const r = SPEC_ALGORITHMS_ARRAY.includes(random);
})
.add('index access', function () {
const random = randomAndUnkown[Math.floor(Math.random() * randomAndUnkown.length)];
const r = SPEC_ALGORITHMS[random] === true;
})
.on('cycle', function (event) {
console.log(String(event.target));
})
.run({ async: false });Perf:
includes x 83,638,547 ops/sec ±1.96% (92 runs sampled)
index access x 28,349,129 ops/sec ±2.07% (90 runs sampled)
My assumptions not always are good, I forgot that random access for an object is slower.
I will turn back to includes.
|
I really appreciate the time you put into this, can we maybe break this up so that the changes aren't so huge in one PR? Specifically I'm worried about the default opts handling. It'd be nice to isolate those changes against the other tweaks. |
|
@wraithgar What about 3 PRs:
It will be better? |
|
Yes I think 3 PRs would be ok. |
|
@wraithgar First one created at #72, when it was merged, I create the next one. |
I was looking the CPU profiler of pnpm and I saw this call:
https://github.com/pnpm/pnpm/blob/ef6c22e129dc3d76998cee33647b70a66d1f36bf/store/cafs/src/getFilePathInCafs.ts#L29-L30
I thought about what I could do to optimize and then I found good performance improvements.
Removing spread of opts
The first thing I notice was the spread of
defaultOptsin every method, sometimes being called twice without needing.So remove all the calls, before this change:
ssri.parse(base64, { single: true }) x 2,119,460 ops/sec ±1.93% (90 runs sampled) ssri.parse(base64, { single: true, strict: true }) x 1,376,919 ops/sec ±0.93% (86 runs sampled) ssri.parse(parsed, { single: true }) x 685,384 ops/sec ±0.91% (95 runs sampled) ssri.parse(parsed, { single: true, strict: true }) x 448,575 ops/sec ±0.87% (95 runs sampled)With the deletion of
opts:ssri.parse(base64, { single: true }) x 4,928,681 ops/sec ±2.46% (85 runs sampled) ssri.parse(base64, { single: true, strict: true }) x 2,339,789 ops/sec ±0.83% (96 runs sampled) ssri.parse(parsed, { single: true }) x 1,531,463 ops/sec ±1.10% (88 runs sampled) ssri.parse(parsed, { single: true, strict: true }) x 805,785 ops/sec ±1.24% (87 runs sampled)benchmark.js
Faster toString of Integrity
I look at a bunch of maps and filters and I just rewrite everything to perform the same operation without a bunch of loops.
With this optimization, we gain a little bit more performance:
ssri.parse(base64, { single: true }) x 5,046,410 ops/sec ±0.98% (93 runs sampled) ssri.parse(base64, { single: true, strict: true }) x 2,306,927 ops/sec ±1.26% (94 runs sampled) ssri.parse(parsed, { single: true }) x 2,597,882 ops/sec ±1.19% (92 runs sampled) ssri.parse(parsed, { single: true, strict: true }) x 1,005,282 ops/sec ±0.79% (96 runs sampled)But with this change, I introduce a little breaking change, before, when calling
toStringof Integrity, the order of the hashes was defined by the order of the hashes during the insert/parsing.Now, to optimize and avoid calling
Object.keysonstrictmode, I just call the properties directly, so the order will always be deterministic as:sha512,sha384, andsha256. If I change the order of these calls, the tests break.If you think this is a problem, I can call
Object.keyseven in strict mode (-40k/ops).Faster integrity check when is stream
I also take a look at streams mode because PNPM also verify the integrity of the files using streams.
The initial version was already fast compare to the main:
I also saw that
checkStreamdoesn't support the optionsingleand almost all verifications that are done by PNPM only verify a single hash, so I see an opportunity to push the performance a little bit further.But I did an experiment, If we ignore all the checkStream codes and jump to the final verification, we can achieve this performance:
I put the code in the file above, the assumption is: if we verify only one hash, we can skip a lot of verifications.
So I think I could be good to
ssrito export single hash verifications, what do you think?benchmark-stream.js
In general, with these optimizations, we had a bump of more than 2x in performance.