Skip to content

Cache hit for remote execution shows Scheduling state while downloading #13531

@brentleyjones

Description

@brentleyjones

Description of the problem / feature request:

When an action is a cache hit, instead of showing some sort of running state in the UI, it shows as [Sched] the entire time. This is confusing, as it seems the action is stalled out waiting for an executor, when in reality it's making progress.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

While using remote execution, build, from clean, a fully cached action graph. All of the actions will look like this:

    [Sched] //Modules/Illustration:IllustrationResources; 113s
    [Sched] Bundling IconographyResources; 42s
    [Sched] Generating derived files for Swift module Kronos; 12s
    [Sched] Generating derived files for Swift module CollectionDiffing; 12s
    [Sched] Generating derived files for Swift module SwiftProtobuf; 12s
    [Sched] Compiling Swift module CollectionDiffing; 12s
    [Sched] Generating derived files for Swift module Yams; 12s
    [Sched] Compiling Swift module Yams; 12s ..

When not using remote, it looks like this:

    @com_google_protobuf_protoc_macos_x86_64//:protoc; 15s remote-cache
    Compiling src/google/protobuf/struct.pb.cc [for host]; 14s remote-cache
    //Modules/UserProfilesUI:UserProfilesUIResources; 14s remote-cache
    //Modules/DigitCodeUI:DigitCodeUIResources; 14s remote-cache
    //Modules/RidePasses:RidePassesResources; 14s remote-cache
    .../LastMileCodeEntryUI:LastMileCodeEntryUIResources; 14s remote-cache
    .../AutonomousInRideUI:AutonomousInRideUIResources; 14s remote-cache
    //Modules/NetworkErrorUI:NetworkErrorUIResources; 14s remote-cache

What operating system are you running Bazel on?

macOS 12.3.1

What's the output of bazel info release?

4.0.0

Have you found anything relevant by searching the web?

Here the scheduling state is set:

context.report(ProgressStatus.SCHEDULING, getName());

It's not changed when downloading starts. In the remote-cache-but-no-executor case an event is posted when downloading starts (technically before cache checking even happens):

context.report(ProgressStatus.CHECKING_CACHE, "remote-cache");

It seems like we should do the same CACHE_CHECKING as the non-remote flow, but it might not be that simple because of how the UiStateTracker only allows state to advance forward:

* <p>Because we may receive events out of order, this does nothing if the action is already
* running with this strategy.

If the UI state tracker allowed state to reset, I think it would be ideal to have something like this:

  • Checking cache (if we are): ProgressStatus.CACHE_CHECKING
  • If downloading, in either flow: ProgressStatus.DOWNLOADING
  • If failure of download, or cache-miss, move to ProgressStatus.SCHEDULING
  • When executing, move to ProgressStatus.EXECUTING

Metadata

Metadata

Assignees

Labels

P1I'll work on this now. (Assignee required)team-Remote-ExecIssues and PRs for the Execution (Remote) teamtype: bug

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions