Skip to content

Comments

Coerce strings fixups#7641

Merged
thufschmitt merged 6 commits intoNixOS:masterfrom
layus:coerce-strings-fixups
Jan 23, 2023
Merged

Coerce strings fixups#7641
thufschmitt merged 6 commits intoNixOS:masterfrom
layus:coerce-strings-fixups

Conversation

@layus
Copy link
Member

@layus layus commented Jan 19, 2023

Fix for #6204, which was reverted in #7621.

The last commit makes builtins.addErrorTrace a so called "frame" trace, meaning that it is never hidden. If a message was written to guide debugging, we better print it. And it unbreaks the nixpkgs-lib-tests which check for the presence of certain handcrafted traces in the output.

Now, this PR also contains important fixes that were forgotten in the original one for some reason.
ca7c5e0 in particular is pretty big, as apparently I managed to pass arguments on the wrong order to coerceToString, and the type system happily takes a boolean for a string, and vice-versa. This means that the previous version was severely broken with respect to the boolean flags of some coerceToString calls.
Sadly, the diff is quite ugly due to the huge revert commit at the base. Take a chance at investigating individual commits.

  • This reverts the revert patch, as it was simpler than merges/rebases.
  • This keeps 620e4fb
  • Local nix flake check

I am still a bit puzzled by the infinite recursion error, but this whole patchset seems to get rid of it. Not sure it's worth investigating further.

@tomberek
Copy link
Contributor

I am still a bit puzzled by the infinite recursion error, but this whole patchset seems to get rid of it. Not sure it's worth investigating further.

I’m pretty sure the referenced “infinitive recursion error” is only the error message changes, not that something is now infinite and was not before. Or at least I’ve yet to see anything else. I’m belaboring this point because, the test makes this a bit inscrutable to see and I think people reacted thinking that this was an unexpected change in evaluation result.

The extra frame is useful to add back in, but I’d be willing to change the Nixpkgs test regardless because overall the output is far more useful. The diff is only in “—show-trace” of the module evaluation which is likely inscrutable to nearly all of our users. While this PR makes that output much more useful, with or without the frame trace.

@edolstra
Copy link
Member

If a message was written to guide debugging, we better print it.

Is this actually true? The same could be said about the trace messages from the evaluator. There could be a lot of spammy debug messages that shouldn't be shown by default.

@tomberek
Copy link
Contributor

tomberek commented Jan 19, 2023

@edolstra The nixpkgs test is using “—show-trace”. https://github.com/NixOS/nixpkgs/blob/master/lib/tests/modules.sh#L21

So the test failed because it was checking the output of an already indecipherable show-trace output, not the output most users should see. We should consider that specific Nixpkgs test a suggestion, and we should have just updated the test.

* Traces
* ------
*
* The semantics of traces is a bit weird. We have only one option to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird

That's not unlike a stack trace in other languages, or am I completely missing the point?
Maybe you mean that the implementation is weird?

Suggested change
* The semantics of traces is a bit weird. We have only one option to
* The implementation of traces is a bit weird. We have only one option to

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends where you stand from I think. Its the semantics of traces that is weird in a lazy & declarative programming language with respect to imperative traces. They do not mean the same thing.
But yes, the implementation of traces here is weird, just because we cannot do it differently given the nix language features.

* print them and to make them verbose (--show-trace). In the code they
* are always collected, but they are not printed by default. The code
* also collects more traces when the option is on. This means that there
* is no way to print the simplified traces at all.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concept of simplified traces is new to me.
I wonder to what degree the distinction between the trace items was intentional. (But I'm still learning about the trace implementation)

struct Trace {
std::shared_ptr<AbstractPos> pos;
hintformat hint;
bool frame;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't managed to reverse engineer the meaning of this yet. Could you add a comment explaining the effect of this flag?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about "importance", or "scope" ? The thing is, I produce a lot of trace elements about details of the nix expressions. If this is done everywhere, it gets too long to be useful. but if I print details until I reach a broader scope trace (like a function call), I stop printing the details and only print function calls.

All this fuss happened because I considered that addErrorTrace was a local trace, but it makes more sense to have it also in the stack trace. It is extremely difficult to strike the right balance here, and this is just a naive attempt.

Also, frame kind of makes sense, because it corresponds to a trace about a stack frame (well, initially, before adding addErrorTrace). I know technically all of the expressions correspond to calling functions, but I have decided to consider builtins calls as normal expressions, including addErrorTrace. This is just a work of printing lots of expressions and getting them all look good.

To make progress on this, we need a collection of error-generating expressions. Then we can work on comparing them, and finding what is the best info to display. This PR only tries to add context to some coercions, not to fix all the traces at once.

Copy link
Member

@roberth roberth Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this is going to be tech debt then.

UPDATE: we have some testing for traces by now, but not sufficient to undraft the following, I believe

}

void addTrace(std::shared_ptr<AbstractPos> && e, hintformat hint);
void addTrace(std::shared_ptr<AbstractPos> && e, hintformat hint, bool frame = false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default has caused a problem. Should it be inverted?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not, it seems.

@roberth
Copy link
Member

roberth commented Jan 19, 2023

If a message was written to guide debugging, we better print it.

Is this actually true? The same could be said about the trace messages from the evaluator. There could be a lot of spammy debug messages that shouldn't be shown by default.

With --show-trace, absolutely, and without it, there's a good case to be made; see #7553.
We don't have spammy addErrorContext in NixOS, but we might want to restrict to a certain number of lines in case some other library does have it.

@roberth
Copy link
Member

roberth commented Jan 19, 2023

@tomberek

I’m pretty sure the referenced “infinitive recursion error” is only the error message changes, not that something is now infinite and was not before.

That is correct.

The extra frame is useful to add back in, but I’d be willing to change the Nixpkgs test regardless because overall the output is far more useful.

Not sure what you mean? I would not be willing to change it because the added error context is crucial.

The diff is only in “—show-trace” of the module evaluation which is likely inscrutable to nearly all of our users.

Long traces are very unattractive, but anyone who's motivated could figure out which module options are involved in an error.
In the old days it was very easy to see which trace items were about options, because the messages were close together and visually aligned.

While this PR makes that output much more useful, with or without the frame trace.

Wait, are we solving #7553 and referring to that?

@tomberek
Copy link
Contributor

Sample output for the same tests as referred to earlier.

       … while calling 'optional'

         at /nix/store/zipyafg6vvg329q2qg5dpql7bh6da1f2-source/lib/lists.nix:255:20:

          254|   */
          255|   optional = cond: elem: if cond then [elem] else [];
             |                    ^
          256|

       … while calling anonymous lambda

         at /nix/store/zipyafg6vvg329q2qg5dpql7bh6da1f2-source/lib/modules.nix:481:44:

          480|       context = name: ''while evaluating the module argument `${name}' in "${key}":'';
          481|       extraArgs = builtins.mapAttrs (name: _:
             |                                            ^
          482|         builtins.addErrorContext (context name)

       … while evaluating the module argument `custom' in "/home/tom/nixpkgs/lib/tests/modules/import-custom-arg.nix":

       error: infinite recursion encountered

       at /nix/store/zipyafg6vvg329q2qg5dpql7bh6da1f2-source/lib/modules.nix:483:28:

          482|         builtins.addErrorContext (context name)
          483|           (args.${name} or config._module.args.${name})
             |                            ^
          484|       ) (lib.functionArgs f);

Of specific note is the the new output: … while evaluating the module argument custom' in "/home/tom/nixpkgs/lib/tests/modules/import-custom-arg.nix":` and that the order is a bit more clear - no more having to scroll to the beginning in long module evaluations!

@roberth
Copy link
Member

roberth commented Jan 20, 2023

no more having to scroll

That's thanks to #7334. These changes make a great combination.

@layus
Copy link
Member Author

layus commented Jan 20, 2023

@edolstra

If a message was written to guide debugging, we better print it.

Is this actually true? The same could be said about the trace messages from the evaluator. There could be a lot of spammy debug messages that shouldn't be shown by default.

There could be indeed. But keep in mind that they are still hidden behind --show-trace. So this change is not about always printing all the addErrorContext messages. It's about always having them in the full trace, and visible with --show-trace. It could get spammy at some point, and we could craft better rules about these, but the old rule was definitely too strict.

layus and others added 2 commits January 20, 2023 13:01
@tomberek
Copy link
Contributor

Now, this PR also contains important fixes that were forgotten in the original one for some reason.
ca7c5e0 in particular is pretty big, as apparently I managed to pass arguments on the wrong order to coerceToString, and the type system happily takes a boolean for a string, and vice-versa. This means that the previous version was severely broken with respect to the boolean flags of some coerceToString calls.
Sadly, the diff is quite ugly due to the huge revert commit at the base. Take a chance at investigating individual commits.

@layus I'm doing a review of the "forgotten commit". Can you clarify where we had the incorrect usage of coerceToString and an example of the incorrect behavior? I'm just struggling to characterize exactly what is being fixed.

auto message = state.coerceToString(pos, *args[0], context,
"while evaluating the error message passed to builtins.addErrorContext",
false, false).toOwned();
e.addTrace(nullptr, message);
Copy link
Member Author

@layus layus Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomberek: here for example:

state.coerceToString(pos, *args[0], context, "while evaluating the error message passed to builtins.addErrorContext")

The string ("while ...") was passed in place of the first bool argument (copyToStore) while it should really be passed after three bools corresponding to the default values of the three parameters.

state.forceValue(*attr.value, attr.pos);
if (attr.value->type() == nPath || attr.value->type() == nString) {
auto s = state.coerceToString(attr.pos, *attr.value, context, false, false, "").toOwned();
auto s = state.coerceToString(attr.pos, *attr.value, context, "", false, false).toOwned();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomberek: here it was missing one bool, and the empty string was passed in lieu of the third bool.
Overall, it was silly to provide a default value for that error context, which I wanted to be required. So it makes sense to move it before optional arguments.

return v.external->coerceToString(positions[pos], context, coerceMore, copyToStore, errorCtx);
if (v.type() == nExternal) {
try {
return v.external->coerceToString(positions[pos], context, coerceMore, copyToStore);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomberek: This one is very special, as I did not want to change the interface of external values. That is the reason why I first introduced the context as an extra, optional argument. Now it's a required argument, and thus to keep the external values interface the same, we just omit it there, catch errors, and add the context ourselves.

Copy link
Contributor

@tomberek tomberek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From meeting we discussed only merging the revert and fix to addErrorContext. Upon further review I'm more confident in the full PR as it fixes incorrect calls; for example to builtins.addErrorContext which was involved in the original issue. This meant that the intended error context was interpreted as a bool and not used.

@layus
Copy link
Member Author

layus commented Jan 23, 2023

@tomberek Thank you for reviewing this one in detail. ❤️

@thufschmitt
Copy link
Member

Agreed with @tomberek. And the “forgotten commit” looks very sensible too. So let's merge

@thufschmitt thufschmitt merged commit 90e630a into NixOS:master Jan 23, 2023
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-01-23-nix-team-meeting-minutes-26/25433/1

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-54/39990/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

6 participants