OSError: [Errno 4] Interrupted system call

When I use PyTorch to train a small network on a multi-user machine (Ubuntu 16.04, CUDA 8.0, Python 2.7), I run into an OSError. It happens intermittently: sometimes within the first epoch, sometimes after 4 epochs. The traceback is:

> Traceback (most recent call last):
  File "autoencodertrain.py", line 43, in <module>
    data = dataloader.get_next_iter()
  File "/home/pytorch/codes/gan/xgan/data/newdata_loader.py", line 45, in get_next_iter
    dataB  = self.dataLoaderB.__iter__().next()
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.py", line 254, in __next__
    idx, batch = self._get_batch()
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.py", line 233, in _get_batch
    return self.data_queue.get()
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 378, in get
    return recv()
  File "/usr/local/lib/python2.7/dist-packages/torch/multiprocessing/queue.py", line 22, in recv
    return pickle.loads(buf)
  File "/usr/lib/python2.7/pickle.py", line 1388, in loads
    return Unpickler(file).load()
  File "/usr/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1139, in load_reduce
    value = func(*args)
  File "/usr/local/lib/python2.7/dist-packages/torch/multiprocessing/reductions.py", line 68, in rebuild_storage_fd
    fd = multiprocessing.reduction.rebuild_handle(df)
  File "/usr/lib/python2.7/multiprocessing/reduction.py", line 170, in rebuild_handle
    new_handle = recv_handle(conn)
  File "/usr/lib/python2.7/multiprocessing/reduction.py", line 85, in recv_handle
    return _multiprocessing.recvfd(conn.fileno())
OSError: [Errno 4] Interrupted system call

Can anyone help solve this problem, or give some hints on how to approach it?


OSError: [Errno 4] Interrupted system call #4220
