Joblib makes pandas' data manipulation hang with multiprocessing.Pool. #980

@fx-kirin

With Joblib 0.14.1, pandas operations in a child process hang when run through multiprocessing.Pool.map. With Joblib 0.13.2, or with the import joblib line removed, the problem does not occur.

The hang seems to depend on how much data is handled in the main process and in the child process. With less data than I used in the sample code, the deadlock does not occur. I'm not sure why, but something is blocking and causing a deadlock.
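One way to find out where the child process is blocked is Python's built-in faulthandler module, which can print the stack of every thread after a delay. The following is a minimal sketch and not part of the original report: the helper name stack_dump_watchdog and the timeout value are hypothetical.

```python
import contextlib
import faulthandler
import sys


@contextlib.contextmanager
def stack_dump_watchdog(timeout_s, file=None):
    """Dump the tracebacks of all threads if the block runs longer than timeout_s."""
    # Resolve stderr at call time; the target must have a real file descriptor.
    target = sys.stderr if file is None else file
    faulthandler.dump_traceback_later(timeout_s, file=target, exit=False)
    try:
        yield
    finally:
        # Disarm the watchdog once the block exits.
        faulthandler.cancel_dump_traceback_later()
```

In the reproduction script, the body of method could be wrapped in `with stack_dump_watchdog(30):` (30 seconds is an arbitrary choice) so that a stuck child prints the line it is blocked on.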

I put the code and data here.

#! /usr/bin/env python
# -*- coding: utf-8 -*-
# vim:fenc=utf-8

import pandas as pd
import multiprocessing
import os

import numpy as np
import joblib  # not used below; the hang occurs only when this import is present


def method(df_path):
    print("starting method")
    df = pd.read_pickle(df_path)
    print("pickle loaded")
    df.copy()
    print("exit method")


def main():
    df = pd.read_pickle("df")
    df.copy()

    print("Starting pool")
    pool = multiprocessing.Pool(1)
    pool.map(method, ['df' for _ in range(10)])


if __name__ == "__main__":
    main()
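Because the script blocks forever when the problem occurs, it can be convenient to turn the hang into a detectable timeout. The following is a minimal sketch under stated assumptions, not part of the original report: the helper name map_with_timeout and its parameters are hypothetical, and it selects the "fork" start method explicitly, which is the default on Linux for the Python version listed in this report.

```python
import multiprocessing
import time


def map_with_timeout(func, items, timeout_s, processes=1):
    """Map func over items in a worker pool; return None if it does not finish in time."""
    # Assumption: "fork" matches the default start method of the reported setup.
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(processes)
    try:
        # map_async returns immediately; get() raises TimeoutError on a stall.
        return pool.map_async(func, items).get(timeout=timeout_s)
    except multiprocessing.TimeoutError:
        return None
    finally:
        # terminate() stops the workers even if they are still busy or blocked.
        pool.terminate()
        pool.join()
```

In the script above, the `pool.map(method, ...)` call in main() could be swapped for `map_with_timeout(method, ["df"] * 10, timeout_s=60)`, where 60 seconds is an arbitrary choice; a None result would then indicate the hang.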

Environment

Ubuntu 18.04
Python 3.6.7

$ pipdeptree
asn1crypto==1.2.0
conda==4.8.0
  - pycosat [required: >=0.6.3, installed: 0.6.3]
  - requests [required: >=2.12.4, installed: 2.22.0]
    - certifi [required: >=2017.4.17, installed: 2019.11.28]
    - chardet [required: >=3.0.2,<3.1.0, installed: 3.0.4]
    - idna [required: >=2.5,<2.9, installed: 2.8]
    - urllib3 [required: >=1.21.1,<1.26,!=1.25.1,!=1.25.0, installed: 1.25.7]
  - ruamel-yaml [required: >=0.11.14, installed: 0.11.14]
conda-package-handling==1.6.0
  - six [required: Any, installed: 1.13.0]
ipdb==0.12.3
  - ipython [required: >=5.1.0, installed: 7.10.2]
    - backcall [required: Any, installed: 0.1.0]
    - decorator [required: Any, installed: 4.4.1]
    - jedi [required: >=0.10, installed: 0.15.2]
      - parso [required: >=0.5.2, installed: 0.5.2]
    - pexpect [required: Any, installed: 4.7.0]
      - ptyprocess [required: >=0.5, installed: 0.6.0]
    - pickleshare [required: Any, installed: 0.7.5]
    - prompt-toolkit [required: >=2.0.0,<3.1.0,!=3.0.1,!=3.0.0, installed: 3.0.2]
      - wcwidth [required: Any, installed: 0.1.7]
    - pygments [required: Any, installed: 2.5.2]
    - setuptools [required: >=18.5, installed: 42.0.2.post20191203]
    - traitlets [required: >=4.2, installed: 4.3.3]
      - decorator [required: Any, installed: 4.4.1]
      - ipython-genutils [required: Any, installed: 0.2.0]
      - six [required: Any, installed: 1.13.0]
  - setuptools [required: Any, installed: 42.0.2.post20191203]
mkl-fft==1.0.15
  - numpy [required: Any, installed: 1.17.4]
mkl-random==1.1.0
  - numpy [required: Any, installed: 1.17.4]
mkl-service==2.3.0
  - six [required: Any, installed: 1.13.0]
mysqlclient==1.4.6
packaging==19.2
  - pyparsing [required: >=2.0.2, installed: 2.4.5]
  - six [required: Any, installed: 1.13.0]
pandas==0.25.3
  - numpy [required: >=1.13.3, installed: 1.17.4]
  - python-dateutil [required: >=2.6.1, installed: 2.8.1]
    - six [required: >=1.5, installed: 1.13.0]
  - pytz [required: >=2017.2, installed: 2019.3]
pipdeptree==0.13.2
  - pip [required: >=6.0.0, installed: 19.3.1]
pycrypto==2.6.1
pynvim==0.4.0
  - greenlet [required: Any, installed: 0.4.15]
  - msgpack [required: >=0.5.0, installed: 0.6.2]
pyOpenSSL==19.1.0
  - cryptography [required: >=2.8, installed: 2.8]
    - cffi [required: >=1.8,!=1.11.3, installed: 1.13.2]
      - pycparser [required: Any, installed: 2.19]
    - six [required: >=1.4.1, installed: 1.13.0]
  - six [required: >=1.5.2, installed: 1.13.0]
PySocks==1.7.1
PyYAML==3.12
rpdb==0.1.6
scikit-learn==0.21.3
  - joblib [required: >=0.11, installed: 0.14.1]
  - numpy [required: >=1.11.0, installed: 1.17.4]
  - scipy [required: >=0.17.0, installed: 1.1.0]
SQLAlchemy==1.3.12
tqdm==4.40.2
wheel==0.33.6
