Conversation
my suggestion was to memoize the result of parsing the format string, although not 100% clear if better. Like parsing |
|
If you want to try this out, I made a little benchmark here: https://github.com/tgamblin/spec-format-perf |
|
Ok that makes more sense. I think actually memoizing the format result during setup is probably going to be a bigger win. |
This removes a bunch of old methods and makes the color formatting logic a lot clearer.
- [x] Remove `colorize_spec()`, which uses older, now incorrect colorization logic. It
is older than the current `Spec.format()` implementation and it doesn't work with
`@=`. The only usages in `cmd/info.py` can be replaced wtih `Spec.cformat()`.
- [x] Remove `_SEPARATORS` AND `COLOR_FORMATS`, which are no longer needed.
- [x] Simplify `Spec.format()` code to use more readable color constants directly.
- [x] Remove unused color constants.
- [x] Remove unused parameter for `write_attribute()`
- [x] Remove unused `Spec.colorized()` method
Further performance improvement by avoiding object construction `match_to_ansi` was being constructed every time. We can avoid an unnecessary object creation by not using a class here.
Optimize format speed by avoiding object construction.
We can get the regex engine to do more work and to avoid string comparisons.
Avoid a constructor call if it's not needed.
|
As of now I'm getting:
So this is ~2.3x faster at least on an m1 mac. |
`transform=` adds a whole lot of extra complexity to `Spec.format()`, and the thing it was originally intended for (the `decorator=` arg to `spack.cmd.display_specs()`) is now handled entirely in `display_specs()`. The only place it is still used is in the common modules code, and we can handle what it's used for there with a simple `str.upper()`. If people want to format different components of specs differently, they can call `format()` multiple times.
- [x] raise an exception for an unterminated escape character ('\')
- [x] format_path preserves leading '/' as '\' when on windows
- [x] trivial cases for VariantMap and VersionList - [x] don't bother making a string in `_prev_version` (which is used only in comparisons)
|
Ok now it's 2.59x faster. I think I have wrung all I can out of this. If we can avoid somehow constructing a I think this is ready to go |
spec format speedspec format speed
|
As of now, this is ~2.67x faster than develop and cuts a second off the hdf5 concretization time for me on an m1. |
When looking at where we spend our time in solver setup, I noticed a fair bit of time is spent
in `Spec.format()`, and `Spec.format()` is a pretty old, slow, convoluted method.
This PR does a number of things:
- [x] Consolidate most of what was being done manually with a character loop and several
regexes into a single regex.
- [x] Precompile regexes where we keep them
- [x] Remove the `transform=` argument to `Spec.format()` which was only used in one
place in the code (modules) to uppercase env var names, but added a lot of complexity
- [x] Avoid escaping and colorizing specs unless necessary
- [x] Refactor a lot of the colorization logic to avoid unnecessary object construction
- [x] Add type hints and remove some spots in the code where we were using nonexistent
arguments to `format()`.
- [x] Add trivial cases to `__str__` in `VariantMap` and `VersionList` to avoid sorting
- [x] Avoid calling `isinstance()` in the main loop of `Spec.format()`
- [x] Don't bother constructing a `string` representation for the result of `_prev_version`
as it is only used for comparisons.
In my timings (on all the specs formatted in a solve of `hdf5`), this is over 2.67x faster than the
original `format()`, and it seems to reduce setup time by around a second (for `hdf5`).
When looking at where we spend our time in solver setup, I noticed a fair bit of time is spent
in `Spec.format()`, and `Spec.format()` is a pretty old, slow, convoluted method.
This PR does a number of things:
- [x] Consolidate most of what was being done manually with a character loop and several
regexes into a single regex.
- [x] Precompile regexes where we keep them
- [x] Remove the `transform=` argument to `Spec.format()` which was only used in one
place in the code (modules) to uppercase env var names, but added a lot of complexity
- [x] Avoid escaping and colorizing specs unless necessary
- [x] Refactor a lot of the colorization logic to avoid unnecessary object construction
- [x] Add type hints and remove some spots in the code where we were using nonexistent
arguments to `format()`.
- [x] Add trivial cases to `__str__` in `VariantMap` and `VersionList` to avoid sorting
- [x] Avoid calling `isinstance()` in the main loop of `Spec.format()`
- [x] Don't bother constructing a `string` representation for the result of `_prev_version`
as it is only used for comparisons.
In my timings (on all the specs formatted in a solve of `hdf5`), this is over 2.67x faster than the
original `format()`, and it seems to reduce setup time by around a second (for `hdf5`).
When looking at where we spend our time in solver setup, I noticed a fair bit of time is spent
in `Spec.format()`, and `Spec.format()` is a pretty old, slow, convoluted method.
This PR does a number of things:
- [x] Consolidate most of what was being done manually with a character loop and several
regexes into a single regex.
- [x] Precompile regexes where we keep them
- [x] Remove the `transform=` argument to `Spec.format()` which was only used in one
place in the code (modules) to uppercase env var names, but added a lot of complexity
- [x] Avoid escaping and colorizing specs unless necessary
- [x] Refactor a lot of the colorization logic to avoid unnecessary object construction
- [x] Add type hints and remove some spots in the code where we were using nonexistent
arguments to `format()`.
- [x] Add trivial cases to `__str__` in `VariantMap` and `VersionList` to avoid sorting
- [x] Avoid calling `isinstance()` in the main loop of `Spec.format()`
- [x] Don't bother constructing a `string` representation for the result of `_prev_version`
as it is only used for comparisons.
In my timings (on all the specs formatted in a solve of `hdf5`), this is over 2.67x faster than the
original `format()`, and it seems to reduce setup time by around a second (for `hdf5`).
When looking at where we spend our time in solver setup, I noticed a fair bit of time is spent in
Spec.format(), andSpec.format()is a pretty old, slow, convoluted method.This PR does a number of things:
transform=argument toSpec.format()which was only used in one place in the code (modules) to uppercase env var names, but added a lot of complexityformat().__str__inVariantMapandVersionListto avoid sortingisinstance()in the main loop ofSpec.format()stringrepresentation for the result of_prev_versionas it is only used for comparisons.In my timings (on all the specs formatted in a solve of
hdf5), this is over 2.67x faster than the originalformat(), and it seems to reduce setup time by around a second (forhdf5).