Use a wrapper class for cache hash to prevent cache serialization #620
hynek merged 8 commits into python-attrs:master
Conversation
> `foo_value = attr.ib()`

This is a clever solution!

> `class IncrementingHasher(object):`
This is a nice solution to the testing need below.
left a comment
This all looks good from a usage of pickle standpoint. I really like the solution here. Making the cached hash value itself special makes it much easier to use default pickle/copy behavior and reduces the number of things someone needs to worry about when implementing a custom reduce.
Regarding the performance: I think this is close to as low overhead as possible. I don't really have any ideas to make it faster.
```python
# actually need a constructor for None objects, we just need any
# available function that returns None.
def __reduce__(self):
    return (getattr, (0, "", None))
```
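To see why this works: `getattr` with a default returns that default whenever the attribute is missing, and no object has an attribute named `""`. A minimal sketch (the class name `FakeNone` is invented for illustration):

```python
import pickle

class FakeNone:
    # Unpickling calls getattr(0, "", None); 0 has no attribute
    # named "", so the default None comes back instead of a FakeNone.
    def __reduce__(self):
        return (getattr, (0, "", None))

print(getattr(0, "", None))                    # None
print(pickle.loads(pickle.dumps(FakeNone())))  # None
```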
It may be more clear to create a module scope function like:

```python
def return_none():
    return None
```

and then in both Python 2 and 3 make the `__reduce__` return `return_none, ()`.
I'd be curious to know how that compares from a performance point of view.
Here is a not super scientific check:
```python
In [64]: class C:
    ...:     def __reduce__(self):
    ...:         return getattr, (0, '', None)
    ...:

In [65]: class D:
    ...:     def __reduce__(self):
    ...:         return type(None), ()
    ...:

In [66]: def return_none():
    ...:     return None
    ...: class E:
    ...:     def __reduce__(self):
    ...:         return return_none, ()
    ...:

In [67]: len(pickle.dumps(C()))
Out[67]: 39

In [68]: len(pickle.dumps(D()))
Out[68]: 31

In [69]: len(pickle.dumps(E()))
Out[69]: 31

In [70]: %timeit pickle.loads(pickle.dumps(C()))
3.33 µs ± 14.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [71]: %timeit pickle.loads(pickle.dumps(D()))
3.12 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [72]: %timeit pickle.loads(pickle.dumps(E()))
2.62 µs ± 39.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```
Looks like return_none is almost 20% faster? Which makes sense, since it doesn't do any function calls itself. 🤔
Could you try to additionally define the method in the class scope please? Global lookups are relatively slow.
I don't think defining it as a class method will work - at least in Python 2, you can't use a class method as a constructor and will get something like:
> pickle.PicklingError: Can't pickle <function _return_none at 0x7f7b13d250c8>: it's not found as __main__._return_none
But this works to avoid the global lookup:

```python
def _return_none():
    return None

class E:
    def __reduce__(self, _return_none=_return_none):
        return _return_none, ()
```
I think we probably shouldn't be optimizing based on the speed of pickle.loads(pickle.dumps()) - almost always if you're using pickle then you are using it in an RPC or writing it to a file. Maybe on-device RPC calls through a unix socket or something are fast enough that the speed of __reduce__ calls matters, but I think if we're optimizing for speed then copy.copy() is a better metric. When I time copy.copy, all of these classes give me approximately the same speed.
If we're really micro-optimizing this function, I'd say that copy.copy should be the benchmark for speed and len(pickle.dumps(x)) should be the benchmark for size.
That said, it would be unusual to use cache_hash for a particularly small object in the first place. For a big enough object that calculating the hash is expensive, you'd also expect it to take a while to pickle it, so an extra 200ns is going to be noise on the process.
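Following that suggestion, a sketch of benchmarking with `copy.copy` for speed and `len(pickle.dumps(x))` for size, rather than a pickle round-trip (class names match the IPython session above):

```python
import copy
import pickle
import timeit

class C:
    def __reduce__(self):
        return getattr, (0, '', None)

class D:
    def __reduce__(self):
        return type(None), ()

for cls in (C, D):
    obj = cls()
    # copy.copy() also routes through __reduce__, so the "copy" is None.
    secs = timeit.timeit(lambda: copy.copy(obj), number=100_000)
    print(cls.__name__, round(secs, 3), "s,", len(pickle.dumps(obj)), "bytes")
```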
> It may be more clear to create a module scope function like:
I don't know what @hynek's plans are for Python 2 support, but I assume that it won't be too long before Python 2 is gone. I feel like it would be preferable to use the built-in NoneType in Python 3, since that's already a builtin that does the exact same thing. I like the getattr hack because it doesn't leave any compatibility functions lying around as critical components of the system, at the cost of needing a comment and having the pickle be a little bigger.
The biggest downside I see to the def _return_none() approach is that the pickled object will contain a reference to attr._make._return_none, meaning it'll be larger than necessary and unnecessarily fragile in the case of pickling with one version of attr and unpickling with a different one (should this ever get refactored in the future). I think the best balance of maintainability, performance, and future-proofness might be something like:
```python
class UnpicklableInt(int):
    if six.PY2:
        def __reduce__(self, _none_constructor=getattr, _args=(0, "", None)):
            return _none_constructor, _args
    else:
        def __reduce__(self, _none_constructor=type(None), _args=()):
            return _none_constructor, _args
```
I like @godlygeek's version, so I updated the code to use that. I was hoping to make _none_constructor and _args keyword-only arguments in Python 3, but that's not possible in 2/3 compatible code because it's a syntax error even in the "dead code" branches. Alas.
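In isolation, the Python 3 branch of that wrapper behaves like this (a sketch without the `six` dependency; the wrapper name is illustrative):

```python
import copy
import pickle

class _CacheHashWrapper(int):
    """An int that deliberately does not survive pickling or copying.

    type(None) is NoneType, and NoneType() returns None, so every
    reconstruction yields plain None instead of the cached value.
    """
    def __reduce__(self, _none_constructor=type(None), _args=()):
        return _none_constructor, _args

w = _CacheHashWrapper(42)
print(w + 1)                           # 43 -- it still acts like an int
print(pickle.loads(pickle.dumps(w)))   # None
print(copy.copy(w), copy.deepcopy(w))  # None None
```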
> I don't know what @hynek's plans are for Python 2 support, but I assume that it won't be too long before Python 2 is gone.
However the rule is: optimize for Python 3. Make Python 2 work sOmEhOw.
So just to throw one potential monkey wrench into the mix here: a shallow copy.copy() reuses the object's existing state, so it turns out the cached hash survives a shallow copy. I tend to think that this is fine, since we're mainly worried about the hash cache persisting between different interpreter runs, and if a shallow copy preserves the hash cache that shouldn't be so bad, particularly because the copy shares its field values with the original and so hashes identically anyway.
As far as my limited understanding of the whole topic goes, this looks mostly good to me – modulo the missing newsfragment. Is it still “WIP”? Does anyone have any quibbles left?
@hynek I'm going to give the documentation a once-over to make sure that this doesn't invalidate any existing documentation and to see if it makes sense to add some documentation for this behavior, but I'll remove the WIP since the "to-do" part of it is captured adequately by the checklist.
Force-pushed 4e75803 to 78500b9.
@hynek I've added a changelog entry. Looking through the docs, I didn't see anywhere that the old behavior was documented. I wasn't sure how much of this new behavior you want me to officially document outside the changelog - should I add a note somewhere? Barring any documentation updates, I think this is ready to go.
hynek
left a comment
I'm trusting the others with the technical review, just a bit of nitpicking from my side – sorry.
Rather than attempting to remove the hash cache from the object state on deserialization or serialization, we store the hash cache in an object that reduces to None, clearing itself when pickled or copied. This fixes GH python-attrs#494 and python-attrs#613.

Co-authored-by: Matt Wozniski <[email protected]>
I couldn't think of any way to make a useful and meaningful class that has no state and also has no custom __reduce__ method, so I went minimalist with it.
Previously, there was some minuscule risk of hash collision, and it also relied on an implementation detail of `pickle` (the assumption that `hash()` is never called as part of `pickle.loads`).
Co-Authored-By: Ryan Gabbard <[email protected]>
Since the cached hash value is not actually serialized in __getstate__, __setstate__ is not actually "clearing" it on deserialization - it's initializing the value to None.
This change was overshadowed by a more fundamental change in python-attrs#620.
Force-pushed 78500b9 to 3971741.
@hynek Thanks for the nitpicking, I admit I'm often a bit sloppy about stylistic issues. Your comments prompted me to actually finish reading through the "semantic newlines" document and I realized I was doing it wrong (plus I decided to reword the changelog to read a bit more naturally) - let me know if I've got it wrong.
I think we're good! Except that codecov is acting up once again, but hey, that's rock'n'roll. Thanks everybody involved!
When reviewing #615, @godlygeek came up with this brilliant alternative suggestion: use an `int` subclass that always pickles as `None`, so that unless someone goes out of their way to capture the cache value and include it in a `__reduce__` or `__setstate__`, the cache will be cleared on any copy operation.

My main fears (both of which turned out to be unsubstantiated) were:

1. That this trades the performance of `hash()` in favor of correctness in serialization/deserialization for people implementing their own `__reduce__` functions, which seems like a very niche concern.
2. That returning a `_CacheHashValue` from `__hash__` might cause other problems, like if someone were taking the hash value of something and pickling or copying it for some deliberate reason (i.e. not copying the object it's stored on, but copying the return value of `hash(x)`).

In terms of performance, it seems like the only performance impact this has is on the first call to `hash()`:

- For `%timeit hash(A(1))`, I get around 800ns with the current version of attrs and ~1-1.2μs with this change, so about a 300-400ns cost on cache misses. It's a decently high percentage of the hash for this class, but it is kinda stupid to use `cache_hash` for this class if you're mostly getting cache misses anyway...
- For `a = A(1)`; `%timeit hash(a)`, I get around 220-250ns per call with the old and new versions.

Regarding point 2, returning an integer subclass here instead of an integer doesn't seem to be a problem, because `hash()` casts the value of `__hash__` to `int` anyway, at least on CPython.

One alternate implementation I tried out is to use a 1-slot slots class instead of an integer subclass for `_CacheHashWrapper`, and unwrap the value on every call to `__hash__`. This was a bit slower than the integer subclass version in both the cache miss and cache hit cases - around 275ns in the cache hit case instead of 220-250ns.

One last thing to note: there is one use case that this is likely to break, which is people who are creating an `attrs` class with one version of `attrs`, pickling it, and then unpickling it in another version. Pickles made on Python 3 even with the same attrs version cannot be unpickled in Python 2 (though Python 2 pickles will probably work in Python 3). Since using pickle other than to send data between identical Python environments is unsupported and a bad idea, I think we can discount this concern.

I think this is probably the best approach. If you give me the green light I'll tweak whatever documentation needs to be tweaked and add a changelog. I actually don't expect many if any documentation changes with this version since there should be nothing to caution people about - we would just need to remove any language about #494.
This PR includes the additional tests added in #615.
Fixes #613.
Fixes #494.
Pull Request Check List
- Documentation in `.rst` files is written using semantic newlines.
- Changes have news fragments in `changelog.d`.