```python
(
    r"\@([\w.\-]*\s*)*(\s*\=\s*\w[\w.\-]*)?",
    lambda scanner, val: self.token(VER, val),
    "BEGIN_PHASE",
```
This bit is really the entire point of this change. Essentially, instead of manually typing out lambdas in our actual lexer specification, we have a more declarative interface.
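To illustrate the idea, here is a minimal sketch (with made-up token names and a generic driver loop, not spack's actual `Lexer` code) of how a declarative table of `(pattern, token_type)` pairs can replace the hand-written lambdas:

```python
import re

# Hypothetical token type names, loosely modeled on the spec lexer;
# none of these identifiers come from spack itself.
VER, ID = "VER", "ID"

# Declarative spec: (pattern, token_type) pairs instead of
# hand-written `lambda scanner, val: ...` callbacks.
# A token_type of None means "match and discard".
TOKEN_SPEC = [
    (r"@[\w.\-]+", VER),
    (r"[\w.\-]+", ID),
    (r"\s+", None),
]

def lex(text, spec=TOKEN_SPEC):
    # One generic loop interprets the whole declarative table.
    # (Patterns must not contain their own groups, since we rely on
    # m.lastindex to recover which alternative matched.)
    pattern = re.compile("|".join("(%s)" % p for p, _ in spec))
    pos, tokens = 0, []
    while pos < len(text):
        m = pattern.match(text, pos)
        if m is None:
            raise ValueError("no token matches at position %d" % pos)
        token_type = spec[m.lastindex - 1][1]
        if token_type is not None:
            tokens.append((token_type, m.group()))
        pos = m.end()
    return tokens
```

The lexer specification becomes pure data, so subclasses only declare tables rather than writing callback code.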
```python
_windows_filename_prefix = r"([A-Za-z]:)*?"
# Spec strings require posix-style paths on Windows
# because the result is later passed to shlex.
_windows_filename_re = _windows_filename_prefix + _filename_re_base
```
I haven't changed any of the actual strings here whatsoever, I've just moved them to the top level.
```python
"BEGIN_PHASE",
[
    # '@': begin a Version, VersionRange, or VersionList:
    (r"@([\w.\-]*\s*)*(\s*\=\s*\w[\w.\-]*)?", VER),
```
The only actual change I made to any of these strings is to remove the backslash from r"\@" here since the backslash is simply ignored for any non-metacharacters.
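A quick demonstration of that point, using only the standard `re` module:

```python
import re

# A backslash before a non-metacharacter such as '@' is ignored by
# the regex engine, so r"\@" and r"@" compile to equivalent patterns.
assert re.fullmatch(r"\@", "@") is not None
assert re.fullmatch(r"@", "@") is not None
assert re.fullmatch(r"\@1\.2", "@1.2") is not None
```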
```python
    # Gobble up all remaining whitespace between tokens.
    (r"\s+", None),
],
{"EQUALS_PHASE": [EQ]},
```
This was the other important reason for this change. imo it was really unclear how the list of tuples we have now ends up triggering changes in the lexer mode, because we were relying on the implementation detail of only allowing exactly two modes.
While we do not add any new modes in this specific change, this makes it a little easier for us to extend the spec syntax later e.g. by having more complex compiler specs. I recall @alalazo was working on exactly that (how to add a new element to spec syntax for compiler specs) a while ago and I think the refactoring in this PR will make that effort easier in the future.
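To make the mode-transition idea concrete, here is a hedged sketch of a lexer whose spec is a dict keyed by mode name; the mode names, rule shapes, and driver loop are all hypothetical illustrations, not spack's actual interface:

```python
import re

VER, EQ, ID = "VER", "EQ", "ID"

# Hypothetical multi-mode spec: each mode maps to a list of
# (pattern, token_type, next_mode) rules. next_mode=None means
# "stay in the current mode"; token_type=None discards the match.
MODES = {
    "BEGIN_PHASE": [
        (r"@[\w.\-]+", VER, None),
        (r"=", EQ, "EQUALS_PHASE"),
        (r"[\w.\-]+", ID, None),
        (r"\s+", None, None),
    ],
    "EQUALS_PHASE": [
        (r"[\w.\-]+", ID, "BEGIN_PHASE"),
    ],
}

def lex_with_modes(text, modes=MODES, start="BEGIN_PHASE"):
    # Mode changes are named explicitly in the table, instead of being
    # implied by the position of a rule among exactly two mode lists.
    mode, pos, tokens = start, 0, []
    while pos < len(text):
        for pattern, token_type, next_mode in modes[mode]:
            m = re.compile(pattern).match(text, pos)
            if m:
                if token_type is not None:
                    tokens.append((token_type, m.group()))
                if next_mode is not None:
                    mode = next_mode
                pos = m.end()
                break
        else:
            raise ValueError("no rule in mode %r matches at %d" % (mode, pos))
    return tokens
```

Adding a third mode for, say, compiler specs would then be a matter of adding one more dict entry rather than restructuring the lexer.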
I was referring to this element of the virtual provider selection changes at #15569 (comment)!
Every test is passing except
I really like the big picture here! In the details, I think there's a better way to set up the constructor for the Lexer object. I don't have time to get into the weeds of this before I leave this evening, but I'll try to take a look at the details soon.
@becker33: Yes! Would really love input on the Lexer ctor, since the purpose of this change is to make that better. No rush to get back.
Noting a discussion with @trws and @psakievich over slack just now on how it's kinda difficult to handle flags reliably, stemming from #32257. In working on this change yesterday I was struck with how when you pass a string to the

I absolutely think that the lexer should be doing the splitting here itself, and I believe that doing that might make it easier to rely on specs being able to safely serialize and deserialize without breaking. I probably will not try to do it in this change, but in a separate one that modifies the control flow between the lexer and the parser.
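For context on the splitting question, here is a small example with the standard `shlex` module (which the spec-string comments above mention): quoted flag values survive splitting as single words, which is part of why where the splitting happens matters for round-tripping specs.

```python
import shlex

# A spec-like string with a quoted flag value. shlex keeps the quoted
# compiler flags together as one word, unlike a naive str.split().
spec = 'zlib cflags="-O2 -g" target=x86_64'
words = shlex.split(spec)
# words is ['zlib', 'cflags=-O2 -g', 'target=x86_64']
```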
Super helpful rundown from @becker33 on the lex/parse architectural constraints:

If there's any useful refactoring I can figure out on that front (in a separate PR), I'll keep this goal in mind.
@cosmicexplorer: I agree with @becker33 this is a good direction. One question: how's the performance? Spec parsing is a bottleneck for importing everything (basically we parse a lot of specs in

You can probably test it like this (this imports all the packages):

```console
$ spack -p python -c 'import spack.repo; list(spack.repo.path.all_package_classes())'
```

I get
@tgamblin: I believe the overall runtime regresses ~0.56%, and the performance of

Methodology and perf regression specifics below. Important points in bold.

**Overall Runtime**

Wall-clock time seems to stay mostly constant at around 23.5 seconds, regressing ~0.1318 seconds, or 0.56%. This PR's mean time was 23.4924, while develop's mean over the runs below was 23.3606:

```console
# cosmicexplorer@terrestrial-gamma-ray-flash: ~/tools/s1 23:55:33
; for _ in $(seq 5); do time (spack python -c 'import spack.repo; list(spack.repo.path.all_package_classes())' >/dev/null); done
( spack python -c > /dev/null; )  22.75s user 0.38s system 99% cpu 23.133 total
( spack python -c > /dev/null; )  23.18s user 0.36s system 99% cpu 23.536 total
( spack python -c > /dev/null; )  23.19s user 0.38s system 99% cpu 23.572 total
( spack python -c > /dev/null; )  23.19s user 0.37s system 99% cpu 23.556 total
( spack python -c > /dev/null; )  23.30s user 0.37s system 99% cpu 23.665 total
# cosmicexplorer@terrestrial-gamma-ray-flash: ~/tools/s1 23:57:50
; git checkout develop
Switched to branch 'develop'
Your branch is up to date with 'upstream/develop'.
# cosmicexplorer@terrestrial-gamma-ray-flash: ~/tools/s1 23:57:55
; for _ in $(seq 5); do time (spack python -c 'import spack.repo; list(spack.repo.path.all_package_classes())' >/dev/null); done
( spack python -c > /dev/null; )  23.14s user 0.35s system 100% cpu 23.487 total
( spack python -c > /dev/null; )  23.14s user 0.35s system 99% cpu 23.489 total
( spack python -c > /dev/null; )  22.63s user 0.36s system 99% cpu 22.985 total
( spack python -c > /dev/null; )  22.91s user 0.37s system 99% cpu 23.283 total
( spack python -c > /dev/null; )  23.19s user 0.37s system 99% cpu 23.559 total
```

The above shell commands and output were generated after maybe 5-10 "warm-up" runs of the same command. Without the warm-up runs, the runtime was around 33 seconds, which is what we see in the profiled runs below. I felt comfortable using the profile output regardless, since I assumed that the warm-up would only affect e.g. kernel filesystem caches, and not the performance of parsing itself.

**Method-level Profiling**

In this section we try to identify where that additional ~0.13 seconds comes from, and can tentatively attribute it to

If this change is correct, none of the regex matching or

This hypothesis is tentatively proven correct below.

**Profile Excerpt**

I ran the same spack command as above with

If I'm interpreting the

**Top 40 Profile Output**

This is the precise command line, wall-clock time, and

On

On
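For readers who want to reproduce this style of analysis, here is a minimal, self-contained sketch of method-level profiling using only the standard library (`cProfile`/`pstats`); the workload function is a stand-in for the spec parser, not spack's actual code, and the exact machinery behind spack's `-p` flag is not shown:

```python
import cProfile
import io
import pstats
import re

def parse_many(n=2000):
    # Stand-in workload: repeatedly run a regex-based tokenizer, the
    # kind of hot path the profile above is meant to expose.
    pat = re.compile(r"@[\w.\-]+|[\w.\-]+|\s+")
    for _ in range(n):
        for m in pat.finditer("zlib@1.2.11 cflags=-O2"):
            m.group()

profiler = cProfile.Profile()
profiler.enable()
parse_many()
profiler.disable()

# Print the 40 most expensive entries by cumulative time, analogous
# to a "top 40" excerpt from a profiled run.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(40)
report = out.getvalue()
print(report)
```

Comparing two such reports (one per branch) is how a regression can be tentatively attributed to specific methods rather than to overall noise.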
I think we can pause reviewing this one for now, since the biggest benefit of this PR would be to enable #33016. I still have a bit more to do on that one, but when I can demonstrate that #33016 conclusively improves spec parsing, I think it will be easier to sign off on this component of that larger change.
Superseded by #34151 |
**Problem**

The `SpecLexer` works fine, but is a little hard to extend, which we may wish to do when expanding the spec syntax as in #24025.

**Solution**

- Extend the `Lexer` interface to allow navigation between multiple lexer modes by name.
- Avoid writing `lambda scanner, val: ...` in subclasses of `Lexer`, in favor of a more declarative interface.

**Result**

The spec lexing code should be a little easier to follow. The path to extend the spec syntax as proposed in #20256 (comment) should also be more clear now.