Shuffle BuildResult data definition, make state machine clearer, introduce SingleDrvOutputs#6312
Conversation
|
Failure is spurious. |
83f5782 to
513e904
Compare
KeyedBuildResult, put back BuildResult the way it was beforeBuildResult data definition
BuildResult data definitionBuildResult data definition
BuildResult data definitionBuildResult data definition
Why would it be restarted? The only place where we restart is in |
|
Fair enough, but I think setting a needs-restart flag that is never read is at least as confusing as changing the wanted outputs. I still prefer the way it was before, because the metadata on the accurately reflects what it is doing. |
BuildResult data definitionBuildResult data definition, make state machine clearer
059944f to
13aee06
Compare
|
Data model improvement and worker protocol no-ops lgtm. |
roberth
left a comment
There was a problem hiding this comment.
Suggesting some improvements to buildPathsWithResults.
Another issue I had was the heavy abbreviation of local variables, making the code harder to understand. I'm already suggesting some things that can be factored out, so perhaps it won't be as much of a problem anymore.
req/reqs: this is fine
g will only span 3 lines, but why not goal?
reqs2 -> state as mentioned
gp: only used once, but goalPtr is still easier to read
pbp: might go away?
bp: contains multiple outputs. We've already established that that's an unnecessary grouping in most cases, so we'll probably revisit that anyway.
bos: move to goal method, change to results.
src/libstore/build/entry-points.cc
Outdated
| return worker.makePathSubstitutionGoal(bo.path, buildMode == bmRepair ? Repair : NoRepair); | ||
| }, | ||
| }, br.raw()); | ||
| }, req.raw()); |
There was a problem hiding this comment.
This visit could be moved into worker.
DerivedPath is a very fundamental thing, so I think this would be appropriate.
make is a lie, but that's out of scope for now.
There was a problem hiding this comment.
Goal is also a lie. MutableWorkUnit? Not sure if that name helps though. Let's not move the Goal posts for this PR.
There was a problem hiding this comment.
The first commit of #4628 is I think just what you want! Added it here.
src/libstore/build/entry-points.cc
Outdated
| }); | ||
|
|
||
| auto pbp = std::get_if<DerivedPath::Built>(&req); | ||
| if (!pbp) continue; |
There was a problem hiding this comment.
It seems that we could return a BuildResult for opaque paths too. Substituted, AlreadyValid, NoSubstituters are appropriate statuses.
However, that might be somewhat of a breaking change. Maybe add a new method in a followup?
There was a problem hiding this comment.
Oh we are, that continue is after making a KeyedBuildResult.
This is a sign to make the continue is too tricky, so I am just doing a regular if block on the rest.
src/libstore/build/entry-points.cc
Outdated
| /* Because goals are in general shared between derived paths | ||
| that share the same derivation, we need to filter their | ||
| results to get back just the results we care about. | ||
| */ |
There was a problem hiding this comment.
Aforementioned lies make this non-obvious. Short of that, I think we can simplify this a bit
/* Goals are mutable shared state, that may contain more
outputs than we requested.
*/
ie only the weird part. However, I think this code should be moved.
Part of the problem is the gp->buildResult is too easy. How about we isolate the nastiness of mutable state by encapsulating gp? Then we can provide a method that requests the right outputs and make this mistake impossible.
8fa1336 to
94fba53
Compare
| if (!useDerivation) return; | ||
| auto & fullDrv = *dynamic_cast<Derivation *>(drv.get()); | ||
|
|
||
| auto * dg = dynamic_cast<DerivationGoal *>(&*waitee); | ||
| if (!dg) return; | ||
|
|
||
| auto outputs = fullDrv.inputDrvs.find(dg->drvPath); | ||
| if (outputs == fullDrv.inputDrvs.end()) return; | ||
|
|
||
| for (auto & outputName : outputs->second) { | ||
| auto buildResult = dg->getBuildResult(DerivedPath::Built { | ||
| .drvPath = dg->drvPath, | ||
| .outputs = OutputsSpec::Names { outputName }, | ||
| }); | ||
| if (buildResult.success()) { | ||
| for (auto & [output, realisation] : buildResult.builtOutputs) { | ||
| inputDrvOutputs.insert_or_assign( | ||
| { bfd->drvPath, output.outputName }, | ||
| { dg->drvPath, output.outputName }, | ||
| realisation.outPath); | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
This became more awkward after buildResult was encapsulated because one cannot easily index builtOutputs without knowing the hash modulo / fixed output hash.
I think what this means is that
DrvPaths builtOutputs;
is the wrong field for BuildResult; instead of that std::map<DrvOutput, Realisation> it should be
std::map<std::string, Realisation> builtOutputs;
since we are only returning realizations for a single top-level derivation.
Realisation.id.drvHash gets us that hash, that was removed from the map key.
There was a problem hiding this comment.
I pushed one more commit which addresses this awkwardness, getting rid of the double for loop.
94fba53 to
eb18102
Compare
BuildResult data definition, make state machine clearerBuildResult data definition, make state machine clearer, introduce SingleDrvOutputs
| void write(const Store & store, Sink & to, const BuildResult & res) | ||
| { | ||
| worker_proto::write(store, to, res.path); | ||
| to | ||
| << res.status | ||
| << res.errorMsg | ||
| << res.timesBuilt | ||
| << res.isNonDeterministic | ||
| << res.startTime | ||
| << res.stopTime; | ||
| worker_proto::write(store, to, res.builtOutputs); | ||
| DrvOutputs builtOutputs; | ||
| for (auto & [output, realisation] : res.builtOutputs) | ||
| builtOutputs.insert_or_assign(realisation.id, realisation); | ||
| worker_proto::write(store, to, builtOutputs); | ||
| } |
There was a problem hiding this comment.
Isn't this a breaking change to the protocol?
There was a problem hiding this comment.
Err my intent was to not change the protocol, converting it to the old way. (I should add that the commit message and PR description!)
There was a problem hiding this comment.
The top line is changed in a previous commit; that is the difference between BuildResult and KeyedBuildResult. The latter is now used everywhere needed to preserve the protocol.
There was a problem hiding this comment.
For what its worth, after #6223 is merged the factored out serialization logic can use protocol versions, and then I can deduplicate this stuff much more aggressively :)
There was a problem hiding this comment.
I didn't really register the other changes to the write functions. This makes sense. Does the daemon compatibility test pass?
There was a problem hiding this comment.
Good question. Yes it does!
This takes a `DerivedPath` so the caller doesn't need to care about which sort of goal does what.
…er way In NixOS#6311 (comment), I realized since derivation goals' wanted outputs can "grow" due to overlapping dependencies (See `DerivationGoal::addWantedOutputs`, called by `Worker::makeDerivationGoalCommon`), the previous bug fix had an unfortunate side effect of causing more pointless rebuilds. In paticular, we have this situation: 1. Goal made from `DerivedPath::Built { foo, {a} }`. 2. Goal gives on on substituting, starts building. 3. Goal made from `DerivedPath::Built { foo, {b} }`, in fact is just modified original goal. 4. Though the goal had gotten as far as building, so all outputs were going to be produced, `addWantedOutputs` no longer knows that and so the goal is flagged to be restarted. This might sound far-fetched with input-addressed drvs, where we usually basically have all our goals "planned out" before we start doing anything, but with CA derivation goals and especially RFC 92, where *drv resolution* means goals are created after some building is completed, it is more likely to happen. So the first thing to do was restore the clearing of `wantedOutputs` we used to do, and then filter the outputs in `buildPathsWithResults` to only get the ones we care about. But fix also has its own side effect in that the `DerivedPath` in the `BuildResult` in `DerivationGoal` cannot be trusted; it is merely the *first* `DerivedPath` for which this goal was originally created. To remedy this, I made `BuildResult` be like it was before, and instead made `KeyedBuildResult` be a subclass wit the path. Only `buildPathsWithResults` returns `KeyedBuildResult`s, everything else just becomes like it was before, where the "key" is unambiguous from context. I think separating the "primary key" field(s) from the other fields is good practical in general anyways. (I would like to do the same thing for `ValidPathInfo`.) Among other things, it allows constructions like `std::map<Key, ThingWithKey>` where doesn't contain duplicate keys and just precludes the possibility of those duplicate keys being out of sync. We might leverage the above someday to overload `buildPathsWithResults` to take a *set* of return a *map* per the above. ----- Unfortunately, we need to avoid C++20 strictness on designated initializers. (BTW https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2287r1.html this offers some new syntax for this use-case. Hopefully this will be adopted and we can eventually use it.) No having that yet, maybe it would be better to not make `KeyedBuildResult` a subclass to just avoid this. Co-authored-by: Robert Hensing <[email protected]>
If my memory is correct, @edolstra objected to modifying `wantedOutputs` upon falling back to doing a build (as we did before), because we should only modify it in response to new requests --- *actual* wants --- and not because we are "incidentally" building all the outptus beyond what may have been requested. That's a fair point, and the alternative is to replace the boolean soup with proper enums: Instead of modifying `wantedOuputs` som more, we'll modify `needsRestart` to indicate we are passed the need.
In many cases we are dealing with a collection of realisations, they are all outputs of the same derivation. In that case, we don't need "derivation hashes modulos" to be part of our map key, because the output names alone will be unique. Those hashes are still part of the realisation proper, so we aren't loosing any information, we're just "normalizing our schema" by narrowing the "primary key". Besides making our data model a bit "tighter" this allows us to avoid a double `for` loop in `DerivationGoal::waiteeDone`. The inner `for` loop was previously just to select the output we cared about without knowing its hash. Now we can just select the output by name directly. Note that neither protocol is changed as part of this: we are still transferring `DrvOutputs` over the wire for `BuildResult`s. I would only consider revising this once NixOS#6223 is merged, and we can mention protocol versions inside factored-out serialization logic. Until then it is better not change anything because it would come a the cost of code reuse.
11dd418 to
24866b7
Compare
| break; | ||
| case RetrySubstitution::YesNeed: | ||
| // Should not be able to reach this state from here. | ||
| assert(false); |
There was a problem hiding this comment.
Unfortunately, it has been reached
Motivation
There are two changes going on here: changing
BuildResultand changing the goal state machine logic.Introduce
Worker::makeGoalThis takes a
DerivedPathso the caller doesn't need to care about which sort of goal does what.BuildResultchangesMake
BuildResultbe like it was before, and instead madeKeyedBuildResultbe a subclass wit the path. OnlybuildPathsWithResultsreturnsKeyedBuildResults, everything else just becomes like it was before, where the "key" is unambiguous from context.The idea here is that storing the path when it is already unambiguous from context denormalizes our data model, making two sources of truth that must be kept in sync. This isn't ideal.
This is especially notable for
DerivationGoal, where twoDerivedPathrequests may correspond to the sameDerivationGoal. Before, theBuildResultof that goal (stored, then returned) would just be the first one to create that goal. That is non-deterministic!KeyedBuildResultis useful in the return type ofKeyedBuildResult, but this is because it is conceptually returning an mapping ofDerivedPathkeys toBuildResultvalues. We could, for example, also usestd::map<BuildResult, BuildResult>for this purpose (and probably also have the arguments be astd::setnotstd::vectorfor symmetry), in which case we would here also not needKeyedBuildResult.State machine changes
The
needRestart,retrySubstitution, andretriedSubstitutionbools are replaced with enums. The enum cases and conditional logic are exhaustively documented to try to make the code easier to understand.An especially thorny case with the old code was that
needRestartwas set totrueif the outputs changed for a goal that had progressed to building, butneedRestartwould never be read again. This is an in fact a good thing operationally, as a build will produce all outputs and so restarting is not necessary. But this (a) being how it worked, and (b) being a good thing was quite obfuscated just reading the code!The new code and comments call out this case explicitly.
Introduce
SingleDrvOutputsIn many cases we are dealing with a collection of realisations, they are all outputs of the same derivation. In that case, we don't need "derivation hashes modulos" to be part of our map key, because the output names alone will be unique. Those hashes are still part of the realisation proper, so we aren't loosing any information, we're just "normalizing our schema" by narrowing the "primary key".
Besides making our data model a bit "tighter" this allows us to avoid a double
forloop inDerivationGoal::waiteeDone. The innerforloop was previously just to select the output we cared about without knowing its hash. Now we can just select the output by name directly.Context
Protocol non-changes
Note that neither protocol is changed as part of this: we are still transferring
DrvOutputsover the wire forBuildResults. I would only consider revising this once #6223 is merged, and we can mention protocol versions inside factored-out serialization logic. Until then it is better not change anything because it would come a the cost of code reuse.Initializers
Unfortunately, we need to avoid C++20 strictness on designated initializers.
(BTW https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2287r1.html
this offers some new syntax for this use-case. Hopefully this will be
adopted and we can eventually use it.)
No having that yet, maybe it would be better to not make
KeyedBuildResulta subclass to just avoid this.History of this PR
In #6311 (comment), I thought that since derivation goals' wanted outputs can "grow" due to overlapping dependencies (See
DerivationGoal::addWantedOutputs, called byWorker::makeDerivationGoalCommon), the previous bug fix had an unfortunate side effect of causing more pointless rebuilds.In particular, I was worried about this situation:
Goal made from
DerivedPath::Built { foo, OutputsSpec::Names { a } }.Goal gives up on on substituting, starts building.
Goal made from
DerivedPath::Built { foo, OutputsSpec::Names { b } }, in fact is just modified original goal.Though the goal had gotten as far as building, so all outputs were going to be produced,
addWantedOutputsno longer knows that and so the goal is flagged to be restarted.This might sound far-fetched with input-addressed drvs, where we usually basically have all our goals "planned out" before we start doing anything, but with CA derivation goals and especially RFC 92, where drv resolution means goals are created after some building is completed, it would be more likely to happen.
@edolstra, on the other hand, thought this was not an issue because even though
needsRestart = truewould be set, it would never be read again. Eelco might be right, but nonetheless this is still very confusing storing "needs restart" but in fact neither needing nor wanting a restart!So the first thing I did was restore the clearing of
wantedOutputswe used to do, and then filter the outputs inbuildPathsWithResultsto only get the ones we care about.The
KeyedBuildResultchange technically was already needed due to the "modified derivational goal with second request" issue described in the motivation section, but become more urgent with this modification ofwantedOutputs.Eelco however (if I recalled correctly) in real time meeting (so no paper trail) also didn't like modifying
wantedOutputsupon falling back to doing a build (as we did before), because we should only modify it in response to new requests --- actual wants --- and not because we are "incidentally" building all the outptus beyond what may have been requested.That's a fair point, and the alternative is to replace the boolean soup with proper enums: Instead of modifying
wantedOuputssom more, we'll modifyneedsRestartto indicate we are passed the need. That lead me to thebool->enumchanges, and then I did the substitution ones in addition too for consistency."Keyed" data types in general.
I think separating the "primary key" field(s) from the other fields is good practical in general. (I would like to do the same thing for
ValidPathInfo, for example.) Besides the example in the motivation, there are probably other cases where we would like to dostd::map<Key, ThingWithoutKey>where the "without keys" is necessary to not contain duplicate keys and just precludes the possibility of those duplicate keys being out of sync.Checklist for maintainers
Maintainers: tick if completed or explain if not relevant
tests/**.shsrc/*/teststests/nixos/*