Skip to content

database: maintain in memory consistency when removing specs#15777

Merged
tgamblin merged 3 commits intospack:developfrom
alalazo:fix/db_in_memory_update_on_write
Apr 12, 2020
Merged

database: maintain in memory consistency when removing specs#15777
tgamblin merged 3 commits intospack:developfrom
alalazo:fix/db_in_memory_update_on_write

Conversation

@alalazo
Copy link
Copy Markdown
Member

@alalazo alalazo commented Mar 31, 2020

fixes #15773

The performance improvements done in #14693 where leaving the DB in an inconsistent state when specs were removed from it. This PR updates the DB internal state whenever it is written to file, but does that using a dictionary that still resides in memory to maintain the performance gain of #14693.

@alalazo alalazo added pkg-database bugfix Something wasn't working, here's a fix labels Mar 31, 2020
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Mar 31, 2020

@aweits

@aweits
Copy link
Copy Markdown
Contributor

aweits commented Apr 1, 2020

I'm still poking at it, but isn't something like this a better approach? This currently passes all tests along with the one you added here...

diff --git a/lib/spack/spack/database.py b/lib/spack/spack/database.py
index ca9ed67fb5b5..45d148a3d17a 100644
--- a/lib/spack/spack/database.py
+++ b/lib/spack/spack/database.py
@@ -1128,6 +1128,8 @@ def _remove(self, spec):
 
         del self._data[key]
         for dep in rec.spec.dependencies(_tracked_deps):
+            if dep._dependents.get(spec.name):
+                del dep._dependents[spec.name]
             self._decrement_ref_count(dep)

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Apr 1, 2020

@aweits I'm fine with both approaches, so no-problem changing what is in the PR. I did what is here based on the observation that each public method that needs update is wrapped in a transaction so ensuring consistency when the transaction exits would suffice (and self._write is called at the end of a write transaction). Your approach seems more focused and I like it. I'd say let's check that there are no other paths under a write transaction that might need updates and then let's change the code as you suggest?

Copy link
Copy Markdown
Member

@tgamblin tgamblin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm more keen on @aweits's approach here -- remove() should really be O(1), not O(n) -- so @alalazo can you refactor this to use his suggestion instead?

I think we should avoid parsing everything unless we have to.

fixes spack#15773

The performance improvements done in spack#14693 where leaving
the DB in an inconsistent state when specs were removed
from it. This PR updates the DB internal state whenever
it is written to file, but does that using a dictionary
that still resides in memory to maintain the performance
gain of spack#14693
@aweits
Copy link
Copy Markdown
Contributor

aweits commented Apr 1, 2020

I'm still concerned that I need this bit, I think I'm still missing something.

if dep._dependents.get(spec.name):

@alalazo alalazo force-pushed the fix/db_in_memory_update_on_write branch from a369f67 to 5057adc Compare April 1, 2020 18:41
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Apr 1, 2020

@aweits Pushed the change.

I'm still concerned that I need this bit,

I looked into the failing tests with a debugger and I think it's #11983 biting us again (that issue was never solved).

@aweits
Copy link
Copy Markdown
Contributor

aweits commented Apr 2, 2020

Given those constraints, what do you think of this instead?

diff --git a/lib/spack/spack/database.py b/lib/spack/spack/database.py
index ca9ed67fb5b5..6804dde906e1 100644
--- a/lib/spack/spack/database.py
+++ b/lib/spack/spack/database.py
@@ -1107,6 +1107,8 @@ def _decrement_ref_count(self, spec):
             del self._data[key]
             for dep in spec.dependencies(_tracked_deps):
                 self._decrement_ref_count(dep)
+        if rec.ref_count == 0:
+            spec._dependents.clear()
 
     def _increment_ref_count(self, spec):
         key = spec.dag_hash()

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Apr 2, 2020

@aweits Since in the JSON database we encode only ref_count and dependencies I think the two implementations you suggest are equivalent (i.e. they don't introduce any error in the DB but are equivalently wrong for specs in memory due to the bug).

I would stick with the first one because it's the one that in my opinion is closest to what it should be once the bug is fixed. In that case I think we should remove specs from dependents by hash without any condition to protect them. Something logically equivalent to:

for dep in rec.spec.dependencies(_tracked_deps):
    del dep._dependents[spec.dag_hash()]

Does that sound reasonable? If so I'll just add a verbose FIXME comment on top of that change to explain the situation in details.

@aweits
Copy link
Copy Markdown
Contributor

aweits commented Apr 2, 2020

Concur.

This is to highlight that the current implementation
is a workaround to overcome issues that are due to a
known bug in dependents computation
@alalazo alalazo requested a review from tgamblin April 2, 2020 13:48
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Apr 7, 2020

@tgamblin

@tgamblin tgamblin merged commit 7e46da7 into spack:develop Apr 12, 2020
@alalazo alalazo deleted the fix/db_in_memory_update_on_write branch April 13, 2020 07:03
tgamblin pushed a commit that referenced this pull request Apr 15, 2020
The performance improvements done in #14693 where leaving the DB in an inconsistent state when specs were removed from it. This PR updates the DB internal state whenever the DB is written to a file.

Note that we still cannot properly enumerate installed dependents, so there is a TODO in this code. Fixing that will require the dependents dictionaries in specs to be re-keyed (either by hash, or not keyed at all -- a list would do).  See #11983 for details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix Something wasn't working, here's a fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Warning: Inconsistent state! Dependent hash1 of hash2 not in DB

3 participants