Semaphore leaks in dataloader #11727

@ssnl

Reported by @PetrochukM.

import torch
from torch import multiprocessing

# DEPENDENCY: This is required for ``DistributedDataParallel``
# https://pytorch.org/docs/stable/nn.html?highlight=distributeddataparallel#torch.nn.parallel.DistributedDataParallel
try:
    multiprocessing.set_start_method('spawn')
except RuntimeError:
    pass

# DEPENDENCY: This is required for ``from tqdm import tqdm``
# https://github.com/tqdm/tqdm/blob/96d8a3c3642474144f53f74331ef2172d1c39496/tqdm/_tqdm.py#L74
mp_lock = multiprocessing.RLock()

from torch.utils.data import DataLoader

if __name__ == '__main__':
    # Iterate over a small in-memory dataset with 4 worker processes.
    data_iterator = DataLoader([torch.tensor(i) for i in range(10)], num_workers=4)
    for batch in data_iterator:
        pass

Example warning:

/usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
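
A possible mitigation, sketched under an unverified assumption: with the 'spawn' start method each worker re-imports the main script, so the module-level `multiprocessing.RLock()` above is created again in every worker. If those extra locks are what the tracker reports, creating the lock lazily may avoid it. `get_mp_lock` and `_mp_lock` are hypothetical names used only for illustration, and this has not been checked against the reproducer.

```python
import multiprocessing

# Hypothetical cache: nothing is allocated when the module is imported,
# so a spawned worker that re-imports this file creates no semaphore.
_mp_lock = None


def get_mp_lock():
    """Return a shared RLock, creating it on first call rather than on import."""
    global _mp_lock
    if _mp_lock is None:
        _mp_lock = multiprocessing.RLock()
    return _mp_lock


if __name__ == '__main__':
    # Only the parent process reaches this branch and allocates the lock.
    with get_mp_lock():
        pass
```

Whether this removes the warning depends on where the leaked semaphore actually originates, which the report does not establish.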

cc @ssnl @VitalyFedyunin @ejguan

Labels

module: dataloader (related to torch.utils.data.DataLoader and Sampler)
triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
