Reuse input map and merkle tree for tools with lots of inputs #10875

@JaredNeil

Description of the feature request:

Create a separate input map and merkle tree for each tool used in an action, then combine those to calculate the final action digest. Each tool's input map and merkle tree should be reused between actions that depend on the same tool.
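To make the proposal concrete, here is a minimal Python sketch of the idea, not Bazel's actual implementation. All names (`merkle_digest`, `ToolTreeCache`, `action_digest`) are hypothetical, and the way the digests are combined is an illustration only; it does not follow the layout used by Bazel or the Remote Execution API. It also assumes a tool's runfiles do not change while the cache is alive, so the tool name can serve as the cache key.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_digest(files: dict) -> str:
    """Digest a {relative_path: content_bytes} map as a directory tree.

    Each directory digest covers the sorted names and digests of its
    children, so identical subtrees always hash to the same value.
    """
    root = {}
    for path, content in files.items():
        *dirs, name = path.split("/")
        node = root
        for d in dirs:
            node = node.setdefault(d, {})
        node[name] = content

    def hash_node(node) -> str:
        if isinstance(node, bytes):  # leaf: a file
            return "f:" + sha256_hex(node)
        entries = [f"{name}={hash_node(child)}"
                   for name, child in sorted(node.items())]
        return "d:" + sha256_hex("\n".join(entries).encode())

    return hash_node(root)


class ToolTreeCache:
    """Hypothetical per-tool cache: each tool's tree is hashed once."""

    def __init__(self):
        self._digests = {}
        self.builds = 0  # number of times a tool tree was actually built

    def digest_for(self, tool_name: str, runfiles: dict) -> str:
        # Assumption: runfiles for a given tool name never change.
        if tool_name not in self._digests:
            self.builds += 1
            self._digests[tool_name] = merkle_digest(runfiles)
        return self._digests[tool_name]


def action_digest(command: str, own_inputs: dict, tools: dict,
                  cache: ToolTreeCache) -> str:
    """Combine the action's own inputs with cached per-tool digests."""
    parts = ["cmd=" + command, "inputs=" + merkle_digest(own_inputs)]
    for tool_name in sorted(tools):
        tool_digest = cache.digest_for(tool_name, tools[tool_name])
        parts.append(f"tool:{tool_name}={tool_digest}")
    return sha256_hex("\n".join(parts).encode())


# Two actions share one tool with many runfiles but have different sources.
runfiles = {f"node_modules/pkg{i}/index.js": b"module.exports = %d" % i
            for i in range(1000)}
tools = {"nodejs_binary": runfiles}
cache = ToolTreeCache()
d1 = action_digest("compile a.ts", {"a.ts": b"a"}, tools, cache)
d2 = action_digest("compile b.ts", {"b.ts": b"b"}, tools, cache)
```

In this sketch the large runfiles tree is hashed once, and each later action only hashes its own small input set. A real implementation would also need to invalidate a cached tool tree when any of its runfiles change.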

Feature requests: what underlying problem are you trying to solve with this feature?

These two function calls make remote caching very slow for actions with many inputs. Reusing the parts that don't change between actions that use the same tool could make these significantly faster.
Tools written in JavaScript will often have a high number of runfiles (>10,000) because they depend on the node_modules folder. This is also sometimes true of Python tools.

Have you found anything relevant by searching the web?

Any other information, logs, or outputs that you want to share?

In our specific case, we have an action per source file that uses a nodejs_binary tool. So we have 15,000 actions with the same ~10,000 runfiles for the tool and 1 unique input file. Each of these takes ~200ms to calculate the cache key, so that's 50 CPU-minutes of work just to check if the actions are cached.
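The arithmetic behind that estimate can be checked with a few lines of Python. The variable names are illustrative; the figures are the ones quoted above.

```python
# Figures from the report: 15,000 actions at roughly 200 ms per cache key.
actions = 15_000
ms_per_cache_key = 200

total_ms = actions * ms_per_cache_key      # total CPU time in milliseconds
total_seconds = total_ms / 1_000
total_minutes = total_ms / 60_000
```

That works out to 3,000 CPU-seconds, or 50 CPU-minutes, spent only on computing cache keys.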

Screenshot of profile showing Merkle Tree build times

Metadata

Assignees

No one assigned

    Labels

    P2: We'll consider working on this in future. (Assignee optional)
    team-Remote-Exec: Issues and PRs for the Execution (Remote) team
    type: feature request