performance: add a verification file to the database#14693
performance: add a verification file to the database#14693tgamblin merged 6 commits intospack:developfrom
Conversation
|
@aweits: cool! Do you have numbers on where/how much this improves things? |
|
pseudo-objectively: 'time spack install gmake' without: with: But the subjective experience of watching the display after it's in place should make you very happy :-) |
|
This probably highlights it better: |
|
@tgamblin, this should fix the installation performance issue that @adamjstewart mentioned in #14190. This really cuts down on the time that spack spends finding packages that are already installed. |
|
@tgamblin I tested this PR myself. I created a test package that looks like: class Example(Package):
url = "http://zlib.net/fossils/zlib-1.2.11.tar.gz"
version('1.2.11', sha256='c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1')
depends_on('python')
def install(self, spec, prefix):
touch(prefix.foo)On develop, on my circa-2013 MBP: $ time spack install example
==> Warning: [email protected] cannot build optimized binaries for "ivybridge". Using best target possible: "x86_64"
==> libiconv is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/libiconv-1.16-gj3jnj46xug7rpjxwaepe6uwx35jdphz
==> diffutils is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/diffutils-3.7-4irrvtxj6myrrxufikynf3myj2u3plor
==> bzip2 is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/bzip2-1.0.8-boqgsigbjrp3xfvcfphkqhsexqrw2iwd
==> expat is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/expat-2.2.9-ctgvqggynhtr2zdidy2f64gk5vgt6jpg
==> pkgconf is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/pkgconf-1.6.3-t64p5ucui6qgo6tepzu44q3hrgpvabrx
==> ncurses is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/ncurses-6.1-jdnritbfxcmk4kfhfsw7d3xuaeluqqvw
==> readline is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/readline-8.0-n4beggn2l64dzrf6elkjcixrklp46ghh
==> gdbm is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/gdbm-1.18.1-jxyriuyf6a477imqrwkmwbanpy6ngoex
==> xz is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/xz-5.2.4-nahnpskrw2jjhtsviwietm4qcs5sa7oe
==> zlib is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/zlib-1.2.11-cvrek7vtvob6v2dfnvdowu2rkqtqtmmi
==> libxml2 is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/libxml2-2.9.9-mjxx7gnpw6wvpbzll2dl3n5dgsbbxd6y
==> tar is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/tar-1.32-lm2b3otglbcasdxrqqpen6lu6x3tiyr4
==> gettext is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/gettext-0.20.1-qbrme2sb4pm3mglxuacgwdxfba464ych
==> libffi is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/libffi-3.2.1-2mukefjdfv5eb2oyrsawyoqdkh2zndsd
==> perl is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/perl-5.30.1-u3gpsj7ovsgyd43ii4iawwkszuxgyjna
==> openssl is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/openssl-1.1.1d-axb6agqybj5ykz6tsroiirkfk2327l2h
==> sqlite is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/sqlite-3.30.1-gnvq3bbsxwzlkbi2cgsgtfhypc7afpo2
==> python is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/python-3.7.6-3gaqcongtvfw2l553jlr5lz3pug2z27w
==> Installing example
==> Searching for binary cache of example
==> No binary for example found: installing from source
==> Warning: microarchitecture specific optimizations are not supported yet on mixed compiler toolchains [check [email protected] for further details]
==> Using cached archive: /Users/Adam/spack/var/spack/cache/_source-cache/archive/c3/c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1.tar.gz
==> Staging archive: /var/folders/21/hwq39zyj4g36x6zjfyl5l8080000gn/T/Adam/spack-stage/spack-stage-example-1.2.11-zz32i2nau2jmxnlmkp7jhavzdnjqnvur/zlib-1.2.11.tar.gz
==> Created stage in /var/folders/21/hwq39zyj4g36x6zjfyl5l8080000gn/T/Adam/spack-stage/spack-stage-example-1.2.11-zz32i2nau2jmxnlmkp7jhavzdnjqnvur
==> No patches needed for example
==> Building example [Package]
==> Executing phase: 'install'
==> Successfully installed example
Fetch: 0.01s. Build: 0.76s. Total: 0.77s.
[+] /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/example-1.2.11-zz32i2nau2jmxnlmkp7jhavzdnjqnvur
real 1m36.808s
user 1m33.575s
sys 0m1.570sOn this branch: $ time spack install example
==> Warning: [email protected] cannot build optimized binaries for "ivybridge". Using best target possible: "x86_64"
==> libiconv is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/libiconv-1.16-gj3jnj46xug7rpjxwaepe6uwx35jdphz
==> diffutils is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/diffutils-3.7-4irrvtxj6myrrxufikynf3myj2u3plor
==> bzip2 is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/bzip2-1.0.8-boqgsigbjrp3xfvcfphkqhsexqrw2iwd
==> expat is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/expat-2.2.9-ctgvqggynhtr2zdidy2f64gk5vgt6jpg
==> pkgconf is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/pkgconf-1.6.3-t64p5ucui6qgo6tepzu44q3hrgpvabrx
==> ncurses is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/ncurses-6.1-jdnritbfxcmk4kfhfsw7d3xuaeluqqvw
==> readline is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/readline-8.0-n4beggn2l64dzrf6elkjcixrklp46ghh
==> gdbm is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/gdbm-1.18.1-jxyriuyf6a477imqrwkmwbanpy6ngoex
==> xz is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/xz-5.2.4-nahnpskrw2jjhtsviwietm4qcs5sa7oe
==> zlib is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/zlib-1.2.11-cvrek7vtvob6v2dfnvdowu2rkqtqtmmi
==> libxml2 is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/libxml2-2.9.9-mjxx7gnpw6wvpbzll2dl3n5dgsbbxd6y
==> tar is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/tar-1.32-lm2b3otglbcasdxrqqpen6lu6x3tiyr4
==> gettext is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/gettext-0.20.1-qbrme2sb4pm3mglxuacgwdxfba464ych
==> libffi is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/libffi-3.2.1-2mukefjdfv5eb2oyrsawyoqdkh2zndsd
==> perl is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/perl-5.30.1-u3gpsj7ovsgyd43ii4iawwkszuxgyjna
==> openssl is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/openssl-1.1.1d-axb6agqybj5ykz6tsroiirkfk2327l2h
==> sqlite is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/sqlite-3.30.1-gnvq3bbsxwzlkbi2cgsgtfhypc7afpo2
==> python is already installed in /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/python-3.7.6-3gaqcongtvfw2l553jlr5lz3pug2z27w
==> Installing example
==> Searching for binary cache of example
==> No binary for example found: installing from source
==> Warning: microarchitecture specific optimizations are not supported yet on mixed compiler toolchains [check [email protected] for further details]
==> Using cached archive: /Users/Adam/spack/var/spack/cache/_source-cache/archive/c3/c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1.tar.gz
==> Staging archive: /var/folders/21/hwq39zyj4g36x6zjfyl5l8080000gn/T/Adam/spack-stage/spack-stage-example-1.2.11-zz32i2nau2jmxnlmkp7jhavzdnjqnvur/zlib-1.2.11.tar.gz
==> Created stage in /var/folders/21/hwq39zyj4g36x6zjfyl5l8080000gn/T/Adam/spack-stage/spack-stage-example-1.2.11-zz32i2nau2jmxnlmkp7jhavzdnjqnvur
==> No patches needed for example
==> Building example [Package]
==> Executing phase: 'install'
==> Successfully installed example
Fetch: 0.01s. Build: 0.81s. Total: 0.82s.
[+] /Users/Adam/spack/opt/spack/darwin-catalina-x86_64/clang-11.0.0-apple/example-1.2.11-zz32i2nau2jmxnlmkp7jhavzdnjqnvur
real 0m5.247s
user 0m3.528s
sys 0m0.847sYes, that's an 18x speedup. The effect is much more pronounced for packages with a lot of dependencies.
Can confirm! Thanks @aweits!!! |
There was a problem hiding this comment.
The only downside of this PR is that it requires Python built with UUID support. In fact, Spack's Python package defaults to ~uuid. Strangely enough, I'm using a Spack-build python~uuid, but UUID support actually works fine on macOS. At the very least, we should change the default to True. We should also see if this will break Spack for python~uuid users on Linux before merging.
|
Okay, tested this on Linux. As long as UUID is installed on the OS (which is the case on my cluster), then |
|
Possibly important information for validating this performance improvement: my Spack database currently has 834 installed packages, most of which are duplicates due to unintended hash changes. Can't wait for the new concretizer and #14695 |
|
@adamjstewart The "uuid package installed when ~uuid" issue is because every system seems to have a "python-compatible" libuuid that gets automatically picked up by the python configure, even when In fact I find that including libuuid in my views typically breaks downstream installations that link against system libraries as well, e.g.: See also #12166 |
|
This PR will help with module regeneration as well, as brought up in #15030. $ time spack module tcl refresh --delete-tree --yes-to-all
==> Regenerating tcl module files
==> Warning: Quotes in command arguments can confuse scripts like configure.
The following arguments may cause problems when executed:
source /opt/packages/gpjohnsn/spack/opt/spack/linux-centos7-broadwell/intel-19.0.5.281/intel-mkl-2019.5.281-vd3jkq7wb2bzhwj7lmrzv64salew2pru/compilers_and_libraries_2019.5.281/linux/mkl/bin/mklvars.sh intel64 &> /dev/null && python3 -c "import os, json; print(json.dumps(dict(os.environ)))"
Quotes aren't needed because spack doesn't use a shell.
Consider removing them
real 0m47.978s
user 0m47.515s
sys 0m0.476sWith this PR: $ time spack module tcl refresh --delete-tree --yes-to-all
==> Regenerating tcl module files
==> Warning: Quotes in command arguments can confuse scripts like configure.
The following arguments may cause problems when executed:
source /opt/packages/gpjohnsn/spack/opt/spack/linux-centos7-broadwell/intel-19.0.5.281/intel-mkl-2019.5.281-vd3jkq7wb2bzhwj7lmrzv64salew2pru/compilers_and_libraries_2019.5.281/linux/mkl/bin/mklvars.sh intel64 &> /dev/null && python3 -c "import os, json; print(json.dumps(dict(os.environ)))"
Quotes aren't needed because spack doesn't use a shell.
Consider removing them
real 0m5.409s
user 0m5.107s
sys 0m0.305sThat is ~9x faster. |
|
On the full stack from #15030, with 1825 packages, the speed up was even greater. real 2m34.153s
user 2m0.396s
sys 0m3.725sI do not have the exact time from the original but it was about 9.5 hours, so |
becker33
left a comment
There was a problem hiding this comment.
I think we should probably put a try/except gate around the import of uuid. That way we don't break anything for users who don't have uuid available. Then we can just write the empty string as the verifier if we don't have uuid available (and lose the performance benefits in that case, but without causing errors).
|
Ping @aweits. Can you rebase and address the above change request? |
|
@adamjstewart, @becker33, will do - things have been challenging lately. |
Add a file containing a unique uuid that is regenerated at database write time. Use this uuid to suppress re-parsing the database contents if we know a previous uuid and the uuid has not changed.
verifier when it resets.
the tests reset the last seen verifier in between tests that use the database fixture.
|
Thanks @aweits! This is in and we'll backport it to 0.14 in v0.14.2. |
| with open(self._verifier_path, 'w') as f: | ||
| new_verifier = str(uuid.uuid4()) | ||
| f.write(new_verifier) | ||
| self.last_seen_verifier = new_verifier |
There was a problem hiding this comment.
@adamjstewart @tgamblin @aweits Apparently when we reach this line we don't update self._data and we have therefore an inconsistent view of the DB later on (since some specs might be referencing roots that have been removed). Removing this line seems to fix #15773. I'll check if we can come up with a more performant update of self._data than re-reading from file.
fixes spack#15773 The performance improvements done in spack#14693 where leaving the DB in an inconsistent state when specs were removed from it. This PR updates the DB internal state whenever it is written to file, but does that using a dictionary that still resides in memory to maintain the performance gain of spack#14693
fixes spack#15773 The performance improvements done in spack#14693 where leaving the DB in an inconsistent state when specs were removed from it. This PR updates the DB internal state whenever it is written to file, but does that using a dictionary that still resides in memory to maintain the performance gain of spack#14693
The performance improvements done in #14693 where leaving the DB in an inconsistent state when specs were removed from it. This PR updates the DB internal state whenever the DB is written to a file. Note that we still cannot properly enumerate installed dependents, so there is a TODO in this code. Fixing that will require the dependents dictionaries in specs to be re-keyed (either by hash, or not keyed at all -- a list would do). See #11983 for details.
Reading the database repeatedly can be quite slow. We need a way to speed
up Spack when it reads the DB multiple times, but the DB has not been
modified between reads (which is nearly all the time).
- [x] Add a file containing a unique uuid that is regenerated at database
write time. Use this uuid to suppress re-parsing the database
contents if we know a previous uuid and the uuid has not changed.
- [x] Fix mutable_database fixture so that it resets the last seen
verifier when it resets.
- [x] Enable not rereading the database immediately after a write. Make
the tests reset the last seen verifier in between tests that use the
database fixture.
- [x] make presence of uuid module optional
The performance improvements done in #14693 where leaving the DB in an inconsistent state when specs were removed from it. This PR updates the DB internal state whenever the DB is written to a file. Note that we still cannot properly enumerate installed dependents, so there is a TODO in this code. Fixing that will require the dependents dictionaries in specs to be re-keyed (either by hash, or not keyed at all -- a list would do). See #11983 for details.
fixes #15030
fixes #15040
Add a file containing a unique uuid that is regenerated at database
write time. Use this uuid to suppress re-parsing the database
contents if we know a previous uuid and the uuid has not changed.