Skip to content

Refactor buildkit init in daemon startup#48950

Closed
cpuguy83 wants to merge 1 commit intomoby:masterfrom
cpuguy83:buildkit_init_timeout
Closed

Refactor buildkit init in daemon startup#48950
cpuguy83 wants to merge 1 commit intomoby:masterfrom
cpuguy83:buildkit_init_timeout

Conversation

@cpuguy83
Copy link
Copy Markdown
Member

@cpuguy83 cpuguy83 commented Nov 25, 2024

Refactor buildkit init in daemon startup

  1. Move the "Daemon has completed initialization" log to where it has
    actually completed initialization.
  2. Move buildkkt init to its own function.

@cpuguy83
Copy link
Copy Markdown
Member Author

cpuguy83 commented Nov 25, 2024

After some futher investigation, this buildkir PR seems to be resolving the issue: moby/buildkit#5550
It's also making docker buildx du basically instantaneous, rather than taking several minutes.

@cpuguy83
Copy link
Copy Markdown
Member Author

I've updated this to just remove the timeout minor refactor.
These are done in separate commits so the timeout change can be backported easily.

@thaJeztah thaJeztah added this to the 28.0.0 milestone Nov 26, 2024
@thaJeztah
Copy link
Copy Markdown
Member

Failure looks consistent; not sure how it relates directly though

=== Failed
=== FAIL: amd64.integration.volume TestVolumesRemoveSwarmEnabled (1.93s)
    volume_test.go:157: [dd55180fec946] error while stopping the daemon: exit status 2
    --- PASS: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
    --- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
    --- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
    --- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)

Comment thread cmd/dockerd/daemon.go Outdated
c, err := createAndStartCluster(cli, d)
if err != nil {
log.G(ctx).Fatalf("Error starting cluster component: %v", err)
return errors.Wrap(err, "error starting cluster component")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we want to both log and return this one (not sure if the error shows up in logs?)

edit: answering myself; it looks like we do this already for som other errors further up. Looking it such errors were logged, I noticed that for windows we explicitly send it to the logs (but only when running as service), but for unix we don't;

on Windows;

err = cli.start(ctx)
if service != nil {
// When running as a service, log the error, so that it's sent to
// the event-log.
log.G(ctx).Error(err)
}

on Linux;

func runDaemon(ctx context.Context, cli *daemonCLI) error {
return cli.start(ctx)
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, I don't remember changing this line of code.
Maybe I saw it was return fmt.Errorf(...) with a %v and not properly wrapped...

Comment thread cmd/dockerd/daemon.go
Comment thread cmd/dockerd/daemon.go Outdated
Comment on lines +295 to +302
// Get a the current daemon config, because the daemon sets up config
// during initialization. We cannot user the cli.Config for that reason,
// as that only holds the config that was set by the user.
//
// FIXME(thaJeztah): better separate runtime and config data?
daemonCfg := d.Config()
routerOpts, err := newRouterOptions(routerCtx, &daemonCfg, d, c)

bb, sm, err := initBuildkit(ctx, &daemonCfg, d)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only place where daemonCfg is used, so perhaps we should move the code (and comment?) above inside initBuildkit as it's already having access to the daemon, and uses it for various other purposes.

The code comment was added in 00c9785 to prevent buildkit being initialised with the wrong "host-gateway-ip";

Comment thread cmd/dockerd/daemon.go Outdated
bb, sm, err := initBuildkit(ctx, &daemonCfg, d)
if err != nil {
return err
return fmt.Errorf("error initializing buildkit")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we're discarding the original error here

Comment thread cmd/dockerd/daemon.go
Comment on lines 454 to 433
bb, err := buildbackend.NewBackend(d.ImageService(), manager, bk, d.EventsService)
if err != nil {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a fun note; it looks like NewBackend never return an error; not for this PR, but wondering if we should even get rid of the constructor (or remove the error return)

// NewBackend creates a new build backend from components
func NewBackend(components ImageComponent, builder Builder, buildkit *buildkit.Builder, es *daemonevents.Events) (*Backend, error) {
return &Backend{imageComponent: components, builder: builder, buildkit: buildkit, eventsService: es}, nil
}

Comment thread cmd/dockerd/daemon.go Outdated
Comment on lines +459 to +456
return routerOptions{}, errors.Wrap(err, "failed to create buildmanager")
return nil, nil, errors.Wrap(err, "failed to create buildmanager")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we're updating, perhaps we should change this to failed to create build backend, but see above; looks like this never returns an error.

Comment thread cmd/dockerd/daemon.go
cluster *cluster.Cluster
}
func initBuildkit(ctx context.Context, config *config.Config, d *daemon.Daemon) (*buildbackend.Backend, *session.Manager, error) {
log.G(ctx).Info("Initializing buildkit")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit/question: do we use (lowercase) buildkit or BuildKit in our logs?

Comment thread cmd/dockerd/daemon.go Outdated
return routerOptions{}, errors.Wrap(err, "failed to create buildmanager")
return nil, nil, errors.Wrap(err, "failed to create buildmanager")
}
return bb, sm, nil
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to log here that we finished initialising BuildKit? To make it the counterpart of;

log.G(ctx).Info("Completed initializing BuildKit")

or

log.G(ctx).Info("Finished initializing BuildKit")

@thaJeztah
Copy link
Copy Markdown
Member

thaJeztah commented Nov 26, 2024

I'll move the first commit to a separate PR so that we can test that in isolation and backport it.

edit: created PR:

@cpuguy83 cpuguy83 force-pushed the buildkit_init_timeout branch 2 times, most recently from e29382f to b9ea8a1 Compare November 26, 2024 19:18
@cpuguy83
Copy link
Copy Markdown
Member Author

This should be good to go now.

Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DOH! I started a review, and have a branch with some of the changes (mostly to look wha they'd look like); I'm AFK for some hours but can push those later for discussion

Comment thread cmd/dockerd/daemon.go Outdated
Comment thread cmd/dockerd/daemon.go Outdated
Comment thread cmd/dockerd/daemon.go
Comment on lines +454 to +439
ro.buildBackend = bb
ro.buildkit = bk
ro.sessionManager = sm
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder what it would look like if we had a separate type for the builder options so that we could return it instead of manipulating the routerOptions 🤔

Comment thread cmd/dockerd/daemon.go Outdated
Comment on lines +440 to +432
Rootless: daemon.Rootless(config),
Rootless: daemon.Rootless(&config),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self; perhaps daemon.Rootless would make more sense to be part of the config package. It's just to abstract that config.Rootless is only defined on Linux (not windows)

Comment thread cmd/dockerd/daemon.go
Rootless: daemon.Rootless(&config),
IdentityMapping: d.IdentityMapping(),
DNSConfig: config.DNSConfig,
ApparmorProfile: daemon.DefaultApparmorProfile(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self: maybe same for daemon.DefaultApparmorProfile 🤔

1. Move the "Daemon has completed initialization" log to where it has
   actually completed initialization.
2. Move buildkkt init to its own function.

Signed-off-by: Brian Goff <[email protected]>
Signed-off-by: Sebastiaan van Stijn <[email protected]>
@thaJeztah thaJeztah force-pushed the buildkit_init_timeout branch from b7591f8 to 4c7eb3e Compare December 19, 2024 14:15
@thaJeztah
Copy link
Copy Markdown
Member

did a quick rebase after #49078 was merged (minor conflict)

Copy link
Copy Markdown
Contributor

@vvoland vvoland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly, that this PR doesn't actually extend the timeout, only refactors the code. And the actual init timeout was removed in #48953?

If so, I think the PR description needs updating 😅

@thaJeztah thaJeztah changed the title Extend buildkit init timeout for large build caches Refactor buildkit init in daemon startup Feb 6, 2025
@thaJeztah thaJeztah added area/daemon Core Engine kind/refactor PR's that refactor, or clean-up code and removed kind/bugfix PR's that fix bugs labels Feb 6, 2025
@thaJeztah
Copy link
Copy Markdown
Member

I updated the title, description and labels

I updated #49040 as well to squash it into 2 commits, and moved it out of draft; if others prefer that PR, we can merge that one instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/builder/buildkit Build area/daemon Core Engine kind/refactor PR's that refactor, or clean-up code status/2-code-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants