Description
I got a runtime error when trying to share a CUDA model with a child process.
The code I was running:
import torch
import torch.multiprocessing as mp
import torch.nn as nn


def train(model):
    print(model)
    # do something..


if __name__ == '__main__':
    model = nn.Linear(10, 1)
    model.cuda()
    model.share_memory()
    mp.set_start_method('spawn')
    p = mp.Process(target=train, args=(model,))
    p.start()
    p.join()
The error traceback:
Traceback (most recent call last):
  File "test.py", line 18, in <module>
    p.start()
  File "/home/yangky/anaconda3/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/home/yangky/anaconda3/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/home/yangky/anaconda3/lib/python3.6/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/yangky/anaconda3/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/yangky/anaconda3/lib/python3.6/multiprocessing/popen_fork.py", line 26, in __init__
    self._launch(process_obj)
  File "/home/yangky/anaconda3/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/home/yangky/anaconda3/lib/python3.6/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "/home/yangky/anaconda3/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 179, in reduce_storage
    raise RuntimeError("Cannot pickle CUDA storage; try pickling a CUDA tensor instead")
RuntimeError: Cannot pickle CUDA storage; try pickling a CUDA tensor instead
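For anyone hitting the same traceback, one possible workaround (not a fix, and not confirmed anywhere in this report) is to avoid pickling the CUDA model at all: send CPU copies of the weights to the child and rebuild the model there. The sketch below assumes the child can reconstruct the same `nn.Linear(10, 1)` architecture; `cpu_state_dict` and the `device_name` argument are hypothetical names introduced only for illustration. The child receives a copy of the weights, so nothing is actually shared between the processes.

```python
import torch
import torch.multiprocessing as mp
import torch.nn as nn


def cpu_state_dict(model):
    # Hypothetical helper: copy every parameter/buffer to CPU so that
    # no CUDA storage has to be pickled when the process is spawned.
    return {name: t.detach().cpu() for name, t in model.state_dict().items()}


def train(state_dict, device_name):
    # Rebuild the model in the child, load the CPU weights, then move it
    # to the target device from inside the child process.
    model = nn.Linear(10, 1)
    model.load_state_dict(state_dict)
    model.to(torch.device(device_name))
    print(model)
    # do something..


if __name__ == '__main__':
    mp.set_start_method('spawn')
    # Fall back to CPU so the sketch also runs on machines without a GPU.
    device_name = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = nn.Linear(10, 1).to(torch.device(device_name))
    p = mp.Process(target=train, args=(cpu_state_dict(model), device_name))
    p.start()
    p.join()
```

Because the weights are copied, updates made in the child are not visible to the parent; this only sidesteps the pickling error and does not replace `share_memory()` semantics.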
System Info
PyTorch version: 0.4.1
Is debug build: No
CUDA used to build PyTorch: 8.0.61
OS: CentOS Linux 7 (Core)
GCC version: (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
CMake version: version 3.9.4
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 8.0.61
GPU models and configuration: GPU 0: GeForce GTX 1080 Ti
Nvidia driver version: 384.69
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy (1.14.3)
[pip] numpydoc (0.7.0)
[pip] torch (0.4.1)
[pip] torchvision (0.2.1)
[conda] cuda80 1.0 h205658b_0 pytorch
[conda] magma-cuda80 2.3.0 1 pytorch
[conda] pytorch 0.4.1 py36_cuda8.0.61_cudnn7.1.2_1 [cuda80] pytorch
[conda] torchvision 0.2.1 py36_1 pytorch