Discovered in course of Iris#3628 that Iris graphics tests have just broken.
After some investigation, determined that this is due to changed results from key pillow (aka PIL) operations that imagehash.phash() relies on, since pillow v7.0.0 came out.
- the whole usage of Iris graphics tests (IrisTest.check_graphic), the iris/tests/idiff.py command, and this repo currently rely on the idea that an image produces a unique, stable hash value, determined by 'imagehash.phash(image)'. We use that hash value (in hex) as the name of the image file stored in test-iris-imagehash
- the hashes of all our stored image files have in fact remained stable since we first developed test-iris-imagehash.
- E.G. at this old commit, dated 2018-4-11, the Travis log shows we were initially operating with pillow version 3.2.1
- .. whereas in this most recent successful commit, 2019-10-16 we were still getting the same values, with pillow version 6.2.0
- but this has now failed with pillow v7.0.0 : some stored images are now producing different hashes
- this relationship is tested by test-iris-imagehash/run_test.py : It checks that all the files compute a hash equal to their filename. Until fixed, this will break all PR tests from now on
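As an illustration of that filename check, here is a minimal sketch. It is not the actual content of run_test.py: the function names, the "*.png" pattern and the flat directory layout are all assumptions. The hash function is passed in, so the check itself needs only the standard library; phash_hex shows what would be used with pillow and imagehash installed.

```python
from pathlib import Path


def phash_hex(path):
    """Hex phash of an image file (requires pillow and imagehash)."""
    # Imported lazily so the rest of the sketch runs without these packages.
    import imagehash
    from PIL import Image

    with Image.open(path) as image:
        return str(imagehash.phash(image))


def find_mismatches(image_dir, hash_func=phash_hex):
    """Return {filename: computed_hash} for files whose name is not their hash.

    Assumes (hypothetically) that images are stored as <hex-hash>.png.
    """
    mismatches = {}
    for path in sorted(Path(image_dir).glob("*.png")):
        computed = hash_func(path)
        if computed != path.stem:
            mismatches[path.name] = computed
    return mismatches
```

An empty result means every stored file still hashes to its own name with the installed library versions; with pillow v7.0.0 some entries would be reported.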
On reflection, fixing this is all a bit tricky: there is no guarantee or expectation in pillow (aka PIL) that the results of operations will be exactly the same between different package versions. In fact, even imagehash itself does not make such a claim: it provides toleranced image comparison by calculating hashes that can be differenced, but this does not mean that the hash values themselves are necessarily portable from one version to another.
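To make "hashes that can be differenced" concrete, here is a small standard-library sketch of the distance between two hex hashes, i.e. the number of differing bits. With imagehash installed, `imagehash.hex_to_hash(a) - imagehash.hex_to_hash(b)` should give the same figure for equal-length hashes. The function names and the tolerance value are illustrative assumptions, not what Iris uses.

```python
def hamming_distance(hex_a, hex_b):
    """Number of differing bits between two equal-length hex hash strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def within_tolerance(hex_a, hex_b, max_distance=2):
    """True if two hashes differ by at most max_distance bits.

    max_distance=2 is an arbitrary illustrative default.
    """
    return hamming_distance(hex_a, hex_b) <= max_distance
```

A toleranced comparison like this still passes when a library change flips a bit or two, whereas using the exact hex string as a filename does not.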
We already had one such problem, where hash values changed in imagehash version 4, as fixed in #14. Arguably, we have just been "lucky" since then (!) More recently, imagehash#32 raises a similar problem with possible pillow changes (that presumably worked out ok).
Temporarily fixed in Iris by pinning : SciTools/iris#3630
A "proper" fix would involve changes both here and in Iris.
One possible way ...
- associate hashes with specific dependency versions (imagehash, pillow, possibly scipy?)
- decouple storage filenames from the hash info (or possibly use softlinks, as was done in #14, "Add imagehash v4 links")
- make sure Iris only uses hashes appropriate to the currently installed libraries when calculating compare distances
This assumes we need to keep working with old versions of dependencies, and implies we must know hashes for all files at multiple versions.
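One way the version-specific lookup could look is sketched below. Everything here is hypothetical: the key format, keying on major versions only, and the shape of the mapping are assumptions for illustration, not the existing imagerepo.json layout.

```python
def installed_versions(packages=("imagehash", "pillow")):
    """Versions of the installed packages, e.g. {"pillow": "7.0.0", ...}."""
    from importlib.metadata import version

    return {name: version(name) for name in packages}


def version_key(versions):
    """Build a lookup key from major versions, e.g. 'imagehash=4;pillow=7'."""
    parts = []
    for name in sorted(versions):
        major = versions[name].split(".")[0]
        parts.append(f"{name}={major}")
    return ";".join(parts)


def expected_hashes(repo, test_name, versions):
    """Hashes recorded for test_name under the given dependency versions.

    repo is a hypothetical mapping: {test_name: {version_key: [hex_hash, ...]}}.
    Returns an empty list when nothing is recorded for these versions.
    """
    return repo.get(test_name, {}).get(version_key(versions), [])
```

Iris would then compute compare distances only against `expected_hashes(repo, name, installed_versions())`, and a missing entry would signal that hashes need generating for a new dependency version.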
Another way is to support only the latest version (fixed or pinned), and require only that this works.
This means that when problems occur we would rename all the image files and update the Iris imagerepo.json. Tests would then no longer work with older versions of pillow (and possibly other dependencies).
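A rough sketch of that rename step, again with hypothetical names and an assumed <hex-hash>.png layout. The hash function is passed in (for real use, something like `lambda p: str(imagehash.phash(Image.open(p)))`), and the returned old-to-new mapping is what would be needed to rewrite the references in imagerepo.json.

```python
from pathlib import Path


def rename_to_current_hashes(image_dir, hash_func):
    """Rename each image to the hash it produces now; return {old: new} stems."""
    renamed = {}
    # sorted() takes a snapshot, so renaming while looping is safe.
    for path in sorted(Path(image_dir).glob("*.png")):
        new_stem = hash_func(path)
        if new_stem == path.stem:
            continue  # hash unchanged, nothing to do
        target = path.with_name(new_stem + path.suffix)
        if target.exists():
            # Two images now hash to the same name: needs a human decision.
            raise FileExistsError(f"{target.name} already exists")
        path.rename(target)
        renamed[path.stem] = new_stem
    return renamed
```

Raising on a name collision is a deliberate choice in this sketch: silently overwriting would lose a stored reference image.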