pickle: pad numpy bytes to ensure correct alignment with memmap #570
aabadie wants to merge 7 commits into joblib:master
Conversation
If we keep a similar strategy in the end, I'll also have to provide new test data (for testing backward compat).
Codecov Report
@@ Coverage Diff @@
## master #570 +/- ##
=========================================
+ Coverage 94.24% 94.3% +0.05%
=========================================
Files 39 39
Lines 5008 5059 +51
=========================================
+ Hits 4720 4771 +51
Misses 288 288
@ogrisel, any idea why this approach doesn't work?
Ok, I think I made some progress: I computed 22 bytes to remove, but actually only 19 works. I still can't really explain why... And it fails on AppVeyor, maybe because it's a 32-bit version, or something like that.
force-pushed from d2ed695 to 1da1f42
Hmmm, it looks like Travis Python 2.7 is broken as well.
Yes, I wanted to have a more complex test, just to be sure. There's something with the pickle protocol version, I think.
Not the case anymore!!! And Windows seems to work as well :) I finally use another custom pickler to determine the actual size of the NumpyArrayWrapper in the stream. That was the easy part, because it was still not working. The main problem was coming from pickle, which does some optimisation when the same object has to be pickled again: it uses an internal memo and reuses a reference in place of the full pickled object. I fixed that by clearing the memo. The last trick was to use ... I know I introduced some ugly hacks to solve the padding issue, but that's the best I could come up with (for the moment).
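The size-measuring pickler described above can be sketched roughly like this (illustrative names only, not the actual joblib code):

```python
import io
import pickle


def pickled_size(obj, protocol=pickle.DEFAULT_PROTOCOL):
    """Measure the pickled size of ``obj`` with a throwaway pickler.

    Note: this counts a complete stream (PROTO ... STOP opcodes, plus
    FRAME headers on protocol 4), which is slightly more than the bytes
    the same object occupies inside a larger pickle -- part of what
    made the padding computation tricky in this PR.
    """
    buf = io.BytesIO()
    pickle.Pickler(buf, protocol=protocol).dump(obj)
    return len(buf.getvalue())
```

By construction this matches `len(pickle.dumps(obj))` for a standalone object; the hard part discussed in this thread is relating it to the object's size inside a bigger stream.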
    self.order = order
    self.dtype = dtype
    self.allow_mmap = allow_mmap
    try:
Might not be the best place to do that. Good ideas are welcome
force-pushed from 5dcc978 to 3aa3bc3
force-pushed from 3aa3bc3 to 4cdc10c
I tried locally to play with different values of pickle protocol and it hardly fails with values different from the default ones for both Python 2 & 3.
What do you mean by "hardly fails"?
    for memmap in l_reloaded:
        assert isinstance(memmap, np.memmap)
        np.testing.assert_array_equal(arr, memmap)
        assert memmap.ctypes.data % 8 == 0
Can you check .flags.aligned as well?
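For reference, `flags.aligned` reports whether an array's data pointer meets its dtype's alignment requirement; a minimal illustration of the suggested check:

```python
import numpy as np

arr = np.arange(10, dtype=np.float64)

# ``flags.aligned`` is True when the data pointer satisfies the
# dtype's alignment requirement; it complements the raw modulo
# check in the test snippet above.
assert arr.flags.aligned
assert arr.ctypes.data % arr.dtype.alignment == 0
```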
I mean I have a weird issue with Python 3 and ... The generated pickle is invalid but it's hard to understand why.
Can you add a test that fails? Basically a test that tests on all protocols looks like a good idea.
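A loop over every protocol could look roughly like this (a plain-pickle sketch with illustrative names, not the actual joblib test; the failures discussed in this thread come from joblib's padding logic, not from pickle itself):

```python
import pickle

import numpy as np


def roundtrip_failures(arr):
    """Round-trip a list of arrays under every available pickle
    protocol and collect the protocols that fail, instead of
    stopping at the first failure."""
    failures = []
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
        try:
            out = pickle.loads(pickle.dumps([arr, arr], protocol=protocol))
            np.testing.assert_array_equal(arr, out[0])
            np.testing.assert_array_equal(arr, out[1])
        except Exception:
            failures.append(protocol)
    return failures


# Plain pickle round-trips numpy arrays on every protocol.
assert roundtrip_failures(np.arange(10)) == []
```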
Just to give more details when playing with protocol versions: the failing combinations are (Python 2, protocol 0), (Python 3, protocol 2) and (Python 3, protocol 4).
So 3 cases fail.
@lesteve, just pushed an adapted version of the test. It should be possible to find the correct alignment for (python 2, protocol 0) and (python 3, protocol 2), but for (python 3, protocol 4), I have no idea...
Can you comment a bit more on this? I know about the pickle memoization but I don't get why ...
This fixes the problem when you have multiple objects of the same type to pickle. The actual test uses a list of numpy arrays and shows this problem. The first time, pickle writes the full pickled stream of the object; the next times, pickle checks its memo for references to the same object id. If the object id is found in the memo, pickle uses it and the resulting pickle is different => the size is different. Since we are comparing to a uniquely pickled object (using NumpyArrayWrapperPickler), the computed padding is wrong. If I clear the memo, I'm sure the object will never be found => always the same length. Maybe I should just drop the numpy array wrapper id from the memo instead of flushing it entirely.
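The memoization effect described above is easy to observe: dumping the same object twice with one pickler yields a few-byte memo reference the second time, and `clear_memo()` restores the full-size output.

```python
import io
import pickle

obj = ("some", "tuple")
buf = io.BytesIO()
p = pickle.Pickler(buf, protocol=2)

p.dump(obj)
first = len(buf.getvalue())

p.dump(obj)  # same object again: pickle emits a short memo reference
second = len(buf.getvalue()) - first

p.clear_memo()  # flush the memo, as described above
p.dump(obj)
third = len(buf.getvalue()) - first - second

# The memoized dump is only a few bytes; after clearing the memo the
# object is written in full again, with the same size as the first time.
assert second < first
assert third == first
```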
@lesteve, do you think it's worth supporting pickle protocol < 2 for py2 and py3? If not, we can concentrate on py3 issues.
py3 issues are probably more of a priority I would say. I have to say I don't manage to follow everything ... I can't wrap my head around why the memoization is a problem, it feels like it should "just work". Which object is memoized, the array_wrapper or the array?
Every kind of object is memoized, and that's the problem. One cannot just remove the wrapper from the memo since, for example, the np.dtype will be inserted during pickling of the wrapper itself!
I dumped the memo just to have a look at the content. In pickle.py there's specific save code that depends on the protocol. I'm trying to better understand it.
Ah, but that's because of your solution with a separate pickler for computing the length of the array_wrapper that is going to be written, right? Can you not set the memo of the dumb pickler to the main one?
Do you mean sharing the same memo instance between the two? I can try.
Yep, this is what I had in mind. Seems hacky but worth a try. There may be complications that need to be taken into account. You may need to remove the array wrapper from the memo after computing the expected length, because the main pickler may use the memoized array_wrapper if you are unlucky and the offset is zero (or use -1 as the default padding offset).
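The memo-copying idea can be sketched like this. Note that `Pickler.memo` is a CPython implementation detail (exposed for get and set, returned as a memo proxy by the C pickler), not a documented API:

```python
import io
import pickle

main_buf = io.BytesIO()
main = pickle.Pickler(main_buf, protocol=2)

shared = {"key": "value"}
main.dump(shared)                # populates the main pickler's memo
before = len(main_buf.getvalue())

# Throwaway pickler used only to measure sizes.  Giving it a *copy* of
# the main memo makes its output byte-identical to what the main
# pickler will emit next for the same object (a memo reference).
side_buf = io.BytesIO()
side = pickle.Pickler(side_buf, protocol=2)
side.memo = main.memo.copy()     # CPython implementation detail

side.dump(shared)
main.dump(shared)
assert side_buf.getvalue() == main_buf.getvalue()[before:]
```

Copying (rather than sharing) the memo sidesteps the complication mentioned above: the throwaway pickler's own writes never pollute the main pickler's state.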
force-pushed from ac0bcaf to adb7c4d
force-pushed from adb7c4d to 09f16d2
force-pushed from a746a04 to 5f2ce2f
I tried to share the same instance of the memo, but it was not working as expected (I don't remember exactly why, sorry).
I came up with another approach, which is to set a copy of the numpy pickler memo on the wrapper pickler. Then I ensure the future pickled wrapper will have the same size. Also added another case: using a dict of arrays. This kind of data also didn't work... The issue with protocol 4 is now fixed; it was related to the new framing mechanism.
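The framing mechanism mentioned here is PEP 3154: protocol 4 chunks the stream into frames, each preceded by a FRAME opcode and an 8-byte length, so offsets computed from a frameless dump are off. This can be observed with `pickletools`:

```python
import pickle
import pickletools

payload = b"x" * 1000

# Protocol 4 wraps its output in frames: a FRAME opcode plus an
# 8-byte length precede the framed data -- bytes that a size
# computation based only on the pickled object will not account for.
ops_v4 = [op.name for op, arg, pos in
          pickletools.genops(pickle.dumps(payload, protocol=4))]
ops_v2 = [op.name for op, arg, pos in
          pickletools.genops(pickle.dumps(payload, protocol=2))]

assert "FRAME" in ops_v4
assert "FRAME" not in ops_v2
```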
You seem to be on the right track, keep pushing, sorry if I am not of much help on the low-level details ...
I don't think I'll work on this again and I don't think it's worth it, so I'm closing it. Feel free to reopen or provide a similar change in another PR.
This is an attempt to fix #563.
The idea is to apply the same strategy that numpy does, but in the pickle file, since the array bytes are stored inline in the pickle stream.
For the moment, it doesn't work (I'm not even sure my test function is valid...).
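For reference, the numpy strategy alluded to here is the one used by the .npy format, which pads the header so the array data starts on an alignment boundary (current `numpy.lib.format` uses 64-byte alignment). The padding arithmetic itself is just:

```python
def padding_needed(offset, alignment=8):
    """Bytes of padding so that data starting at ``offset`` lands on
    an ``alignment``-byte boundary -- the same arithmetic as the .npy
    header padding in numpy.lib.format, here with the 8-byte alignment
    checked by the tests in this PR."""
    return (alignment - offset % alignment) % alignment


assert padding_needed(0) == 0
assert padding_needed(19) == 5
assert padding_needed(22) == 2
```

The hard part in this PR was not this arithmetic but computing `offset` reliably, given pickle's memoization and protocol-4 framing discussed above.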