
Python zip action is memory hungry even if zipping not used - 80% of heap in our case #14890

@glukasiknuro


Description of the problem / feature request:

While investigating memory usage during the analysis phase, I noticed that py_binary and py_test rules were responsible for the majority of it. I tracked this down to the zip action, which is always created, irrespective of whether --build_python_zip is used or which platform Bazel runs on.

It looks like the code below iterates over all runfiles and creates a string for each of them. With a lot of dependencies this adds up pretty quickly:

for (Artifact artifact : runfilesSupport.getRunfilesArtifacts().toList()) {

Commenting out the above code caused an 80% reduction in heap usage in our case, as reported by bazel info used-heap-size-after-gc after building the codebase with the --nobuild option. Even simply calling .intern() on the created String reduced heap usage by 70%, but this may be very codebase dependent.
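To illustrate why .intern() could help here, below is a minimal standalone sketch. It is not Bazel code. The class RunfilesStringDemo, its methods, and the "runfilesPath=execPath" string format are hypothetical stand-ins for whatever the zip action builds per artifact. It assumes many targets share the same runfiles, so each target produces equal but separate String objects.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class RunfilesStringDemo {
    // Hypothetical stand-in for the per-artifact string; the format is an assumption.
    static String describe(String runfilesPath, String execPath) {
        // StringBuilder.toString() allocates a fresh String on every call.
        return new StringBuilder(runfilesPath).append('=').append(execPath).toString();
    }

    // Simulates `targets` rules that each iterate over the same list of runfiles.
    static List<String> buildArgs(List<String> paths, int targets, boolean intern) {
        List<String> out = new ArrayList<>();
        for (int t = 0; t < targets; t++) {
            for (String p : paths) {
                String s = describe(p, "bazel-out/bin/" + p);
                // intern() returns one canonical instance per distinct value.
                out.add(intern ? s.intern() : s);
            }
        }
        return out;
    }

    // Counts distinct String objects by identity, not by equals().
    static int distinctInstances(List<String> strings) {
        Map<String, Boolean> seen = new IdentityHashMap<>();
        for (String s : strings) {
            seen.put(s, Boolean.TRUE);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> paths = List.of("pkg/a.py", "pkg/b.py", "pkg/c.py");
        System.out.println(distinctInstances(buildArgs(paths, 100, false))); // 300
        System.out.println(distinctInstances(buildArgs(paths, 100, true)));  // 3
    }
}
```

With 3 shared files and 100 targets, the non-interned run retains 300 separate String objects, while the interned run retains 3. The actual saving in Bazel depends on how much the runfiles of different targets overlap, which is consistent with the caveat above that the result may be codebase dependent.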

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

See first comment.

What operating system are you running Bazel on?

Ubuntu 20.04

What's the output of bazel info release?

bazel 5

Metadata

Assignees: No one assigned

Labels: P3, stale, team-Rules-Python, type: bug
