Running requests in parallel from a zip archive can create a race condition when unpacking the cacert.pem file #5223

@tru

Description

We ran into a really strange case. I understand this is an edge case, but it might be worth fixing.

We started to see this backtrace in our CI:

[2019-10-08T08:01:13.655Z]   File "/data/jenkins/postproc-asustor/204934530/conan/lib/conan-1.4.4-66-linux-py36.pyz/requests/api.py", line 75, in get
[2019-10-08T08:01:13.655Z]   File "/data/jenkins/postproc-asustor/204934530/conan/lib/conan-1.4.4-66-linux-py36.pyz/requests/api.py", line 60, in request
[2019-10-08T08:01:13.655Z]   File "/data/jenkins/postproc-asustor/204934530/conan/lib/conan-1.4.4-66-linux-py36.pyz/requests/sessions.py", line 533, in request
[2019-10-08T08:01:13.655Z]   File "/data/jenkins/postproc-asustor/204934530/conan/lib/conan-1.4.4-66-linux-py36.pyz/requests/sessions.py", line 646, in send
[2019-10-08T08:01:13.655Z]   File "/data/jenkins/postproc-asustor/204934530/conan/lib/conan-1.4.4-66-linux-py36.pyz/requests/adapters.py", line 416, in send
[2019-10-08T08:01:13.655Z]   File "/data/jenkins/postproc-asustor/204934530/conan/lib/conan-1.4.4-66-linux-py36.pyz/requests/adapters.py", line 224, in cert_verify
[2019-10-08T08:01:13.655Z]   File "/data/jenkins/postproc-asustor/204934530/conan/lib/conan-1.4.4-66-linux-py36.pyz/requests/utils.py", line 254, in extract_zipped_paths
[2019-10-08T08:01:13.655Z]   File "/usr/local/pyenv/versions/3.6.4/lib/python3.6/zipfile.py", line 1484, in extract
[2019-10-08T08:01:13.655Z]     return self._extract_member(member, path, pwd)
[2019-10-08T08:01:13.655Z]   File "/usr/local/pyenv/versions/3.6.4/lib/python3.6/zipfile.py", line 1547, in _extract_member
[2019-10-08T08:01:13.655Z]     os.makedirs(upperdirs)
[2019-10-08T08:01:13.655Z]   File "/usr/local/pyenv/versions/3.6.4/lib/python3.6/os.py", line 220, in makedirs
[2019-10-08T08:01:13.655Z]     mkdir(name, mode)
[2019-10-08T08:01:13.655Z] FileExistsError: [Errno 17] File exists: '/data/jenkins/postproc-asustor/204934530/_temp/certifi'

After a lot of confusion, I think I understand this bug now. We distribute our Python dependencies (including requests) as a pyz (created with zipapp), and the consumer in this case calls requests inside a ThreadPool. requests has logic to unpack cacert.pem into the temp directory, but that logic has no protection against races, so our parallel threads stepped on each other's toes when unpacking this file.
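To illustrate what race protection could look like, here is a minimal sketch of one common approach: create the directory with exist_ok=True, write to a unique temporary file, then rename it into place. This is not the code in requests; the function name extract_member_atomically and its parameters are hypothetical and exist only for this illustration.

```python
import os
import tempfile
import zipfile


def extract_member_atomically(archive_path, member, dest_dir):
    """Extract one zip member into dest_dir so concurrent callers cannot collide.

    Hypothetical helper for illustration; not part of requests.
    """
    target = os.path.join(dest_dir, *member.split("/"))
    if os.path.exists(target):
        return target  # another thread or process already extracted it

    target_dir = os.path.dirname(target)
    # exist_ok=True tolerates another thread creating the directory first,
    # which is the step that raised FileExistsError in the traceback above.
    os.makedirs(target_dir, exist_ok=True)

    with zipfile.ZipFile(archive_path) as archive:
        data = archive.read(member)

    # Write to a unique temporary file in the same directory, then rename it.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
        # Rename within one filesystem, so readers never see a partial file.
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target
```

With this shape, several threads may each extract the file, but every one of them ends up with the same complete file at the same path, and none of them fails because the directory already exists.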

We worked around this by making a single get call before starting the parallel invocation. But I think it might be worth fixing because the failure is very confusing.
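The workaround can be sketched roughly as follows, assuming the first call is what triggers the one-time unpacking. The helper name map_after_warm_up and its parameters are made up for this example; fetch stands in for requests.get or any wrapper around it.

```python
from concurrent.futures import ThreadPoolExecutor


def map_after_warm_up(fetch, urls, max_workers=5):
    """Call fetch once serially, then run the remaining calls in parallel.

    Hypothetical helper for illustration; fetch could be requests.get.
    """
    urls = list(urls)
    if not urls:
        return []

    # Serial warm-up call: any one-time setup happens here, before the
    # thread pool exists, so no two threads can race on it.
    first = fetch(urls[0])

    with ThreadPoolExecutor(max_workers) as pool:
        # list() consumes the results so worker exceptions are re-raised.
        rest = list(pool.map(fetch, urls[1:]))

    return [first] + rest
```

In our setup this would be called as map_after_warm_up(requests.get, urls). It only hides the race inside one process; separate processes sharing the same temp directory could presumably still collide.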

Expected Result

Not having the cacert being overwritten :)

Actual Result

Exception above

Reproduction Steps

Create a pyz with zipapp containing requests and its dependencies (certifi), then run:

import requests
from concurrent.futures import ThreadPoolExecutor

urls = ("https://github.com", "https://github.com", "https://github.com", "https://github.com")

def get(url):
    print(f"Getting {url}")
    requests.get(url)

with ThreadPoolExecutor(5) as pool:
    # Consume the results so an exception raised in a worker thread is re-raised here.
    list(pool.map(get, urls))

Note that since this is a race, it may or may not trigger on any given run.

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": "2.1.4"
  },
  "idna": {
    "version": "2.8"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.6.8"
  },
  "platform": {
    "release": "4.19.72-microsoft-standard",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "1010100f",
    "version": "17.5.0"
  },
  "requests": {
    "version": "2.22.0"
  },
  "system_ssl": {
    "version": "1010100f"
  },
  "urllib3": {
    "version": "1.25.6"
  },
  "using_pyopenssl": true
}

