BUG: Pyinstrument causes pickling error #109

Closed
tcdejong opened this issue Oct 21, 2020 · 4 comments

tcdejong commented Oct 21, 2020

Description
I'm using pyinstrument to profile a simulation. The simulation consists of a singleton class, Simulation, that handles ticking and similar state logic; when it finishes, it pickles self for later inspection. This is required because many different (fully independent) simulations are run in parallel, and analysis only happens after completion.

Specifically, after the simulation stops it runs these lines:

        # Save pickled version of self to file
        with open(self.out_dir / f'finished_{self.config["name"]}.pickle', 'wb') as file:
            pickle.dump(self, file)

Right now I'm profiling a single Simulation without multiprocessing. Pyinstrument works fine if I comment out the pickle.dump and replace it with pass. The error only occurs when running the script with pyinstrument, not during regular execution.

Traceback:

Traceback (most recent call last):
  File "c:\users\tcdej\appdata\local\programs\python\python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\tcdej\appdata\local\programs\python\python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\tcdej\AppData\Local\Programs\Python\Python38\Scripts\pyinstrument.exe\__main__.py", line 7, in <module>
  File "c:\users\tcdej\appdata\local\programs\python\python38\lib\site-packages\pyinstrument\__main__.py", line 137, in main
    exec_(code, globs, None)
  File "simulation.py", line 540, in <module>
    sim1 = Simulation(config1).run()
  File "simulation.py", line 396, in run
    self.finish()
  File "simulation.py", line 257, in finish
    pickle.dump(self, file)
_pickle.PicklingError: Can't pickle <class '__main__.Simulation'>: attribute lookup Simulation on __main__ failed

I currently don't have time to write a minimal reproducible example, but will update this issue later.

System details:

  • Pyinstrument v3.2.0
  • Python 3.8.5 (standard CPython)
  • Windows 10

tcdejong commented Oct 21, 2020

Reproduction

Breaking case:

This version of the code runs fine if called without pyinstrument, but crashes when profiled.

import pickle


class Simulation:
    def __init__(self, duration_in_ticks):
        self.now = 0
        self.end = duration_in_ticks

    def run(self):
        while self.now < self.end:
            print(f'Executing tick {self.now}')
            self.now += 1
        else:
            with open('out.pickle', 'wb') as file:
                pickle.dump(self, file)


if __name__ == "__main__":
    sim = Simulation(10)
    sim.run()

Terminal, no crash:

PS C:\Users\tcdej\Desktop\pystrument-mvp> python mvp.py
Executing tick 0
Executing tick 1
Executing tick 2
Executing tick 3
Executing tick 4
Executing tick 5
Executing tick 6
Executing tick 7
Executing tick 8
Executing tick 9

Pyinstrument, crash:

PS C:\Users\tcdej\Desktop\pystrument-mvp> pyinstrument mvp.py
Executing tick 0
Executing tick 1
Executing tick 2
Executing tick 3
Executing tick 4
Executing tick 5
Executing tick 6
Executing tick 7
Executing tick 8
Executing tick 9
Traceback (most recent call last):
  File "c:\users\tcdej\appdata\local\programs\python\python38\lib\runpy.py", line 194, in _run_module_as_main 
    return _run_code(code, main_globals, None,
  File "c:\users\tcdej\appdata\local\programs\python\python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\tcdej\AppData\Local\Programs\Python\Python38\Scripts\pyinstrument.exe\__main__.py", line 7, in <module>
  File "c:\users\tcdej\appdata\local\programs\python\python38\lib\site-packages\pyinstrument\__main__.py", line 137, in main
    exec_(code, globs, None)
  File "mvp.py", line 20, in <module>
    sim.run()
  File "mvp.py", line 15, in run
    pickle.dump(self, file)
_pickle.PicklingError: Can't pickle <class '__main__.Simulation'>: attribute lookup Simulation on __main__ failed

Modified case

This version disables the pickle.dump call and then profiles fine.

import pickle


class Simulation:
    def __init__(self, duration_in_ticks):
        self.now = 0
        self.end = duration_in_ticks

    def run(self):
        while self.now < self.end:
            print(f'Executing tick {self.now}')
            self.now += 1
        else:
            with open('out.pickle', 'wb') as file:
                # pickle.dump(self, file)
                pass


if __name__ == "__main__":
    sim = Simulation(10)
    sim.run()

Pyinstrument, no crash:

PS C:\Users\tcdej\Desktop\pystrument-mvp> pyinstrument mvp.py
Executing tick 0
Executing tick 1
Executing tick 2
Executing tick 3
Executing tick 4
Executing tick 5
Executing tick 6
Executing tick 7
Executing tick 8
Executing tick 9

  _     ._   __/__   _ _  _  _ _/_   Recorded: 14:47:01  Samples:  4    
 /_//_/// /_\ / //_// / //_'/ //     Duration: 0.005     CPU time: 0.016
/   _/                      v3.2.0

Program: mvp.py

0.004 <module>  mvp.py:1
├─ 0.002 <module>  pickle.py:1
│     [10 frames hidden]  pickle, re, sre_compile, sre_parse
│        0.001 getwidth  sre_parse.py:174
│        0.001 [self]
├─ 0.001 loads  <built-in>:0
└─ 0.001 run  mvp.py:9
   └─ 0.001 print  <built-in>:0

To view this report with different options, run:
    pyinstrument --load-prev 2020-10-21T14-47-01 [options]

Owner

joerick commented Nov 5, 2020

Thanks for the reproduction! I've played with it a little, and I'm not sure how to fix it. I copied the 'wrapping' code from the built-in Python modules cProfile and profile, and they crash in the same way:

python -m cProfile mvp.py 
[...]

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/cProfile.py", line 185, in <module>
    main()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/cProfile.py", line 178, in main
    runctx(code, globs, None, options.outfile, options.sort)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/cProfile.py", line 20, in runctx
    filename, sort)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/profile.py", line 62, in runctx
    prof.runctx(statement, globals, locals)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/cProfile.py", line 100, in runctx
    exec(cmd, globals, locals)
  File "mvp.py", line 20, in <module>
    sim.run()
  File "mvp.py", line 15, in run
    pickle.dump(self, file)
_pickle.PicklingError: Can't pickle <class '__main__.Simulation'>: attribute lookup Simulation on __main__ failed

python -m profile mvp.py
[...]
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/profile.py", line 602, in <module>
    main()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/profile.py", line 595, in main
    runctx(code, globs, None, options.outfile, options.sort)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/profile.py", line 99, in runctx
    return _Utils(Profile).runctx(statement, globals, locals, filename, sort)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/profile.py", line 62, in runctx
    prof.runctx(statement, globals, locals)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/profile.py", line 422, in runctx
    exec(cmd, globals, locals)
  File "mvp.py", line 20, in <module>
    sim.run()
  File "mvp.py", line 15, in run
    pickle.dump(self, file)
_pickle.PicklingError: Can't pickle <class '__main__.Simulation'>: attribute lookup Simulation on __main__ failed

Perhaps in the python mvp.py case, the sys.modules['__main__'] entry is different...

Owner

joerick commented Nov 5, 2020

Yes, that is the difference.

dunder_main.py

import sys
print(sys.modules['__main__'])

$ pyinstrument dunder_main.py
<module '__main__' from '/Users/joerick/Projects/cibuildwheel/env/bin/pyinstrument'>
$ python dunder_main.py
<module '__main__' from 'dunder_main.py'>
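
To spell out why that difference matters: pickle stores a class by reference (its module name plus its qualified name) and, when dumping, checks that the name really resolves to the same object in `sys.modules`. The sketch below imitates what the profilers' wrapper does, under the assumption that the script is executed with `exec` in a globals dict whose `__name__` is `"__main__"`. The helper names (`class_defined_like_profiler`, `can_pickle`, `register_on_main`) are made up for this illustration and are not part of pyinstrument.

```python
import pickle
import sys

# Stand-in for the user's script (mvp.py in the reproduction above).
SCRIPT_SOURCE = """
class Simulation:
    pass
"""


def class_defined_like_profiler():
    """Run the script the way a wrapper would: exec() in a fresh globals dict.

    The class gets __module__ == "__main__", but it lives only in this dict,
    not as an attribute of the module object stored in sys.modules["__main__"].
    """
    globs = {"__name__": "__main__"}
    exec(compile(SCRIPT_SOURCE, "mvp.py", "exec"), globs)
    return globs["Simulation"]


def can_pickle(obj):
    """Return True if obj pickles, False if pickle rejects it."""
    try:
        pickle.dumps(obj)
        return True
    except pickle.PicklingError:
        return False


def register_on_main(cls):
    """Illustration only: make the class reachable as __main__.<name>."""
    setattr(sys.modules["__main__"], cls.__name__, cls)
```

With these helpers, `can_pickle(class_defined_like_profiler()())` is expected to be false, because looking up `Simulation` on the real `__main__` module fails. After calling `register_on_main` on that same class, the lookup succeeds and the instance pickles. This is only meant to show the mechanism, not to suggest patching `__main__` in real code.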

I found this Stack Overflow answer that describes the issue perfectly. It outlines some hacky steps to fix it, but I think there is a better solution to your problem: do as little as possible in your entrypoint file. Something like this:

import yourmodule
yourmodule.main()

I see the above as a fairly common pattern in Python generally, because things like relative imports also behave more sanely once you're out of the entrypoint file.
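
As a rough, self-contained sketch of why the thin entrypoint helps: once Simulation lives in an imported module, its `__module__` is that module's name rather than `__main__`, so pickle can resolve it no matter what wraps the entrypoint. The example writes a throwaway `yourmodule.py` to a temporary directory purely so it can run as a single file. The module contents, the `main()` function, and `run_via_thin_entrypoint` are assumptions for illustration, not the reporter's actual code.

```python
import importlib
import pickle
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical contents of yourmodule.py: all real logic lives here.
MODULE_SOURCE = textwrap.dedent('''
    import pickle


    class Simulation:
        def __init__(self, duration_in_ticks):
            self.now = 0
            self.end = duration_in_ticks

        def run(self):
            while self.now < self.end:
                self.now += 1
            # Pickle to bytes instead of a file to keep the sketch tidy.
            return pickle.dumps(self)


    def main():
        return Simulation(10).run()
''')


def run_via_thin_entrypoint():
    """Do what the two-line entrypoint does: import the module, call main()."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "yourmodule.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            yourmodule = importlib.import_module("yourmodule")
            payload = yourmodule.main()
            # Unpickle while yourmodule is still importable.
            restored = pickle.loads(payload)
            return type(restored).__module__, restored.now
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("yourmodule", None)
```

In a real project the entrypoint would just be the two lines above (`import yourmodule` and `yourmodule.main()`), and that entrypoint is the file passed to pyinstrument.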


tcdejong commented Nov 5, 2020

Thank you for the research!

I tested and your workaround succeeded, including pickling. Perhaps it's worth including this in a "Known Issues" or "Limitations" section?
