Skip to content

specs: use lazy lexicographic comparison instead of key_ordering#21618

Merged
becker33 merged 3 commits intodevelopfrom
lazy-lexicographic-spec-comparison
Mar 31, 2021
Merged

specs: use lazy lexicographic comparison instead of key_ordering#21618
becker33 merged 3 commits intodevelopfrom
lazy-lexicographic-spec-comparison

Conversation

@tgamblin
Copy link
Copy Markdown
Member

@tgamblin tgamblin commented Feb 11, 2021

We have been using the @llnl.util.lang.key_ordering decorator for specs and most of their components. This leverages the fact that in Python, tuple comparison is lexicographic. It allows you to implement a _cmp_key method on your class, and have __eq__, __lt__, etc. implemented automatically using that key. For example, you might use tuple keys to implement comparison, e.g.:

class Widget:
    # author implements this
    def _cmp_key(self):
        return (
            self.a,
            self.b,
            (self.c, self.d),
            self.e
        )

    # operators are generated by @key_ordering
    def __eq__(self, other):
        return self._cmp_key() == other._cmp_key()

    def __lt__(self):
        return self._cmp_key() < other._cmp_key()

    # etc.

The issue there for simple comparators is that we have to build the tuples and we have to generate all the values in them up front. When implementing comparisons for large data structures, this can be costly.

This PR replaces @key_ordering with a new decorator, @lazy_lexicographic_ordering. Lazy lexicographic comparison maps the tuple comparison shown above to generator functions. Instead of comparing based on pre-constructed tuple keys, users of this decorator can compare using elements from a generator. So, you'd write:

@lazy_lexicographic_ordering
class Widget:
    def _cmp_iter(self):
        yield a
        yield b
        def cd_fun():
            yield c
            yield d
        yield cd_fun
        yield e

    # operators are added by decorator (but are a bit more complex)

There are no tuples that have to be pre-constructed, and the generator does not have to complete. Instead of tuples, we simply make functions that lazily yield what would've been in the tuple. If a yielded value is a callable, the comparison functions will call it and recursively compare it. The comparator just walks the data structure like you'd expect it to.

The @lazy_lexicographic_ordering decorator handles the details of implementing comparison operators, and the Widget implementor only has to worry about writing _cmp_iter, and making sure the elements in it are also comparable.

Using this PR shaves another 1.5 sec off the runtime of spack buildcache list, and it also speeds up Spec comparison by about 30%. The runtime improvement comes mostly from two things:

  1. lazily stopping the comparison as soon as possible (e.g., many specs just have different names, which is the firs thing the lazy generators return)
  2. avoiding the use of hash() in _cmp_iter() (it was used in _cmp_key() before)

@tgamblin
Copy link
Copy Markdown
Member Author

Seems there's a bug in here somewhere -- I might not get to fixing that until later today.

Copy link
Copy Markdown
Member

@alalazo alalazo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments from a first read of the PR. I'll test drive it asap.

def _cmp_key(self):
return tuple(sorted(self.values()))
def _cmp_iter(self):
for _, v in sorted(self.items()):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the ordering is arbitrary for us, but want to note that this is different from the previous one (the ordering is done here on (key, value) pairs, while before it was just the values)

tuples. The issue there for simple comparators is that we have to
bulid the tuples *and* we have to generate all the values in them up
front. When implementing comparisons for large data structures, this
can be costly.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using this whenever possible, then? Is there a use case where this doesn't apply?

@tgamblin tgamblin requested a review from becker33 February 11, 2021 21:32
@tgamblin tgamblin force-pushed the lazy-lexicographic-spec-comparison branch 3 times, most recently from e649850 to 93231c0 Compare February 11, 2021 23:54
@tgamblin tgamblin force-pushed the lazy-lexicographic-spec-comparison branch from 93231c0 to ba31dd0 Compare March 13, 2021 08:52
@tgamblin tgamblin force-pushed the lazy-lexicographic-spec-comparison branch 5 times, most recently from f7cc208 to 670da01 Compare March 22, 2021 09:05
@tgamblin
Copy link
Copy Markdown
Member Author

tgamblin commented Mar 22, 2021

@alalazo @becker33 @eugeneswalker: this is ready for another look. With the latest E4S build cache (which has 37k specs), it significantly improves performance for spack buildcache list:

develop:

       42.68 real        35.00 user         0.75 sys

lazy-lexicographic-spec-comparison:

       24.71 real        23.80 user         0.46 sys

So that shaves > 40% of the time off.

The key is really two things. First, __lt__ is faster b/c of the laziness -- specs that are actually less than others bail as early as possible. For __eq__, or rather for specs that are equal, things are trickier, because we do not want __eq__ to have to traverse an entire spec. I'm leveraging the dag hash like this:

    def __hash__(self):
        # If the spec is concrete, we leverage the DAG hash and just use
        # a 64-bit prefix of it. The DAG hash has the advantage that it's
        # computed once per concrete spec, and it's saved -- so if we
        # read concrete specs we don't need to recompute the whole hash.
        # This is good for large, unchanging specs.
        if self.concrete:
            if not self._dunder_hash:
                self._dunder_hash = self.dag_hash_bit_prefix(64)
            return self._dunder_hash

        # This is the normal hash for lazy_lexicographic_ordering. It's
        # slow for large specs because it traverses the whole spec graph,
        # so we hope it only runs on abstract specs, which are small.
        return hash(lang.tuplify(self._cmp_iter))

So basically, I think this gets us the best of both worlds -- lazy comparison for abstract specs that are small, and leveraging precomputed hashes for specs that are large (which are pretty much all the concrete ones).

@tgamblin tgamblin force-pushed the lazy-lexicographic-spec-comparison branch 3 times, most recently from df3e379 to eef9b11 Compare March 22, 2021 20:12
eugeneswalker
eugeneswalker previously approved these changes Mar 24, 2021
Copy link
Copy Markdown
Contributor

@eugeneswalker eugeneswalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thank you.

Copy link
Copy Markdown
Member

@becker33 becker33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only request is removing a vestigial comment. Otherwise LGTM

x1 = Spec('a')
x1.concretize()
x1._hash = 'xy'
x1._hash = 'xyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy'
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary?

Copy link
Copy Markdown
Member Author

@tgamblin tgamblin Mar 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that Spec.__hash__ now uses self.dag_hash_bit_prefix(64) means that there need to be at least 64 bits of hash data to get the prefix. That's 13 base32 characters, so I just went ahead and made these hashes full length. Every real hash will be 32 characters long (160 bits for SHA1) so I made them proper stand-ins.

@alalazo alalazo dismissed their stale review March 25, 2021 16:18

Review outdated, comments have been addressed

We have been using the `@llnl.util.lang.key_ordering` decorator for specs
and most of their components. This leverages the fact that in Python,
tuple comparison is lexicographic. It allows you to implement a
`_cmp_key` method on your class, and have `__eq__`, `__lt__`, etc.
implemented automatically using that key. For example, you might use
tuple keys to implement comparison, e.g.:

```python
class Widget:
    # author implements this
    def _cmp_key(self):
        return (
            self.a,
            self.b,
            (self.c, self.d),
            self.e
        )

    # operators are generated by @key_ordering
    def __eq__(self, other):
        return self._cmp_key() == other._cmp_key()

    def __lt__(self):
        return self._cmp_key() < other._cmp_key()

    # etc.
```

The issue there for simple comparators is that we have to bulid the
tuples *and* we have to generate all the values in them up front. When
implementing comparisons for large data structures, this can be costly.

This PR replaces `@key_ordering` with a new decorator,
`@lazy_lexicographic_ordering`. Lazy lexicographic comparison maps the
tuple comparison shown above to generator functions. Instead of comparing
based on pre-constructed tuple keys, users of this decorator can compare
using elements from a generator. So, you'd write:

```python
@lazy_lexicographic_ordering
class Widget:
    def _cmp_iter(self):
        yield a
        yield b
        def cd_fun():
            yield c
            yield d
        yield cd_fun
        yield e

    # operators are added by decorator (but are a bit more complex)

There are no tuples that have to be pre-constructed, and the generator
does not have to complete. Instead of tuples, we simply make functions
that lazily yield what would've been in the tuple. If a yielded value is
a `callable`, the comparison functions will call it and recursively
compar it. The comparator just walks the data structure like you'd expect
it to.

The ``@lazy_lexicographic_ordering`` decorator handles the details of
implementing comparison operators, and the ``Widget`` implementor only
has to worry about writing ``_cmp_iter``, and making sure the elements in
it are also comparable.

Using this PR shaves another 1.5 sec off the runtime of `spack buildcache
list`, and it also speeds up Spec comparison by about 30%. The runtime
improvement comes mostly from *not* calling `hash()` `_cmp_iter()`.
Since `lazy_lexicographic_ordering` handles `None` comparison for us, we
don't need to adjust the spec comparators to return empty strings or
other type-specific empty types. We can just leverage the None-awareness
of `lazy_lexicographic_ordering`.

- [x] remove "or ''" from `_cmp_iter` in `Spec`
- [x] remove setting of `self.namespace` to `''` in `MockPackage`
@tgamblin tgamblin force-pushed the lazy-lexicographic-spec-comparison branch from eef9b11 to 420177f Compare March 28, 2021 00:22
@tgamblin
Copy link
Copy Markdown
Member Author

@becker33:comments addressed

@becker33 becker33 merged commit a1d9a56 into develop Mar 31, 2021
@becker33 becker33 deleted the lazy-lexicographic-spec-comparison branch March 31, 2021 21:39
@tgamblin
Copy link
Copy Markdown
Member Author

@alalazo FYI

@alalazo
Copy link
Copy Markdown
Member

alalazo commented Apr 1, 2021

Thanks I'll update #21683

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants