Skip to content

Diagnostics ICE when replaying proof trees with next-solver#154329

Open
VicenteGusmao wants to merge 2 commits intorust-lang:mainfrom
VicenteGusmao:fix-bug-151304
Open

Diagnostics ICE when replaying proof trees with next-solver#154329
VicenteGusmao wants to merge 2 commits intorust-lang:mainfrom
VicenteGusmao:fix-bug-151304

Conversation

@VicenteGusmao
Copy link
Copy Markdown

@VicenteGusmao VicenteGusmao commented Mar 24, 2026

When diagnostics replay proof tree state, rebuilding a canonical state can fail to match the current inference state. With -Znext-solver=globally, this could panic while reporting an error, avoiding the panic.

Make proof tree replay fallible in diagnostics and fall back to the current obligation when replay fails. Add a regression test for the higher-ranked PartialEq and PartialOrd case. Fixes #151304.

@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Mar 24, 2026

Some changes occurred to the core trait solver

cc @rust-lang/initiative-trait-system-refactor

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-trait-system-refactor The Rustc Trait System Refactor Initiative (-Znext-solver) labels Mar 24, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Mar 24, 2026

r? @TaKO8Ki

rustbot has assigned @TaKO8Ki.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: compiler, types
  • compiler, types expanded to 69 candidates
  • Random selection from 13 candidates

@rustbot

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@BoxyUwU
Copy link
Copy Markdown
Member

BoxyUwU commented Mar 25, 2026

r? lcnr

@rustbot rustbot assigned lcnr and unassigned TaKO8Ki Mar 25, 2026
@lcnr
Copy link
Copy Markdown
Contributor

lcnr commented Apr 2, 2026

this is a surprising failure, while revisiting goals can get different results with proof trees, instantiating the response generally shouldn't fail 🤔 can you provide some more detail how this test ends up returning some canonical state we cannot instantiate. What is the constraint we get when replying the proof tree but not while actually proving the goal

please generally move fix #issuenum into the PR description instead of the title

@lcnr
Copy link
Copy Markdown
Contributor

lcnr commented Apr 2, 2026

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 2, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 2, 2026

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@VicenteGusmao VicenteGusmao changed the title Fix #151304: Diagnostics ICE when replaying proof trees with next-solver Diagnostics ICE when replaying proof trees with next-solver Apr 6, 2026
@VicenteGusmao
Copy link
Copy Markdown
Author

The failure is not in the original proof of the goal, but in the diagnostics-only replay of a stored CanonicalState.

We save a canonical state while proving the goal, and later try to replay that state after more trait solving has happened in the same infcx. At that point, the saved var_values no longer necessarily unify with the current orig_values, so instantiating the replayed state can fail even though the original goal evaluation itself completed normally.

For this testcase, the ICE only happens with -Znext-solver=globally, and only while reconstructing nested goals from proof trees for error reporting.

I have not yet isolated the exact replay mismatch for this testcase, i.e. the specific constraint that is present during replay but not during the original proof, but I can instrument that next if that would be useful.

I have also moved the #issuenum from the title to the PR description.

@VicenteGusmao
Copy link
Copy Markdown
Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 6, 2026
@ShoyuVanilla
Copy link
Copy Markdown
Member

We save a canonical state while proving the goal, and later try to replay that state after more trait solving has happened in the same infcx. At that point, the saved var_values no longer necessarily unify with the current orig_values, so instantiating the replayed state can fail even though the original goal evaluation itself completed normally.

This is a bit incorrect. We usually unify the var_values from the query response with the original values when the goal evaluation succeeds, to apply what we have learned from the goal evaluation. But in this case, the goal has failed so we haven't unified them before instantiating the nested goals for proof tree to make diagnostics.

What happens in the problematic code is the following:

We have a failed root goal for<T> T: PartialEq to emit fulfillment error so we try to instantiate it within proof tree.

But as there aren't any orig value for T at the beginning when we have canonicalized it, we insert a fresh ty var for it in L325.

// In case any fresh inference variables have been created between `state`
// and the previous instantiation, extend `orig_values` for it.
orig_values.extend(
state.value.var_values.var_values.as_slice()[orig_values.len()..]
.iter()
.map(|&arg| delegate.fresh_var_for_kind_with_span(arg, span)),
);
let instantiation =
compute_query_response_instantiation_values(delegate, orig_values, &state, span);
let inspect::State { var_values, data } = delegate.instantiate_canonical(state, instantiation);
unify_query_var_values(delegate, param_env, orig_values, var_values, span);

But this inserts a new ty var in the infcx's current universe rather than the universe of var kind bound to that canonical var value, which is a HR as being in for<T>

fn fresh_var_for_kind_with_span(
&self,
arg: ty::GenericArg<'tcx>,
span: Span,
) -> ty::GenericArg<'tcx> {
match arg.kind() {
ty::GenericArgKind::Lifetime(_) => {
self.next_region_var(RegionVariableOrigin::Misc(span)).into()
}
ty::GenericArgKind::Type(_) => self.next_ty_var(span).into(),

So, the error is created while trying to unify the ty var from root universe to the binder's universe and I guess we should fix fresh_var_for_kind_with_span to create variable in the relevant universe.

@lcnr
Copy link
Copy Markdown
Contributor

lcnr commented Apr 13, 2026

so this is caused by a universe error 🤔 yeah, that's an annoying problem, so we should instantiate the canonical state and between that and unifying the var_values should we create the infer vars 🤔

unsure how to impl this as we don't support instantiating a response without guessing having a var_value guess first. I guess we could create infer vars which are just in the MAX_UNIVERSE. I think this shouldn't blow up anything

@VicenteGusmao
Copy link
Copy Markdown
Author

Ahh, that makes sense — thanks for the clarification.

You're right that my previous explanation was off.

So the core issue seems to be that fresh_var_for_kind_with_span is creating inference variables in the wrong universe for these higher-ranked canonical vars, which then leads to a universe mismatch during unification.

For possible fixes, I see two options:

  • Adjust fresh_var_for_kind_with_span to create variables in the correct universe (more principled)
  • Or create fresh vars in MAX_UNIVERSE in this diagnostics-only path (more pragmatic)

I can try prototyping either approach. Do you have a preference for which direction better fits the solver design?

@lcnr
Copy link
Copy Markdown
Contributor

lcnr commented Apr 16, 2026

  • Adjust fresh_var_for_kind_with_span to create variables in the correct universe (more principled)

hard to do as with the current setup we need to crate the variables before we enter create these new universes. so the second alternative should be easier

@lcnr
Copy link
Copy Markdown
Contributor

lcnr commented Apr 16, 2026

in case you are using an LLM to generate/cleanup your messages, please don't. try to state your current understanding in your own words, otherwise it's hard to figure out your current understanding and what I actually need to respond to. It also makes it hard to know whether it's worth it to correct cases where the perspective/framing is slightly off. Even if writing it yourself is less structured or clear, that's still preferable to me. idk if there's a more complete writeup i can point u to here

@VicenteGusmao
Copy link
Copy Markdown
Author

Okay, I'll keep my comments more direct.

I'll try to implement the second approach.

@rustbot

This comment has been minimized.

@rustbot

This comment has been minimized.

VicenteGusmao and others added 2 commits April 28, 2026 07:55
When diagnostics replay proof tree state, rebuilding a canonical
state can fail to match the current inference state. With
-Znext-solver=globally, this could panic while reporting an error.

Make proof tree replay fallible in diagnostics and fall back to
the current obligation when replay fails. Add a regression test
for the higher-ranked PartialEq and PartialOrd case.

Signed-off-by: Vicente Gusmão <[email protected]>
…ual universe mismatch during proof tree replay.

Since my last implementation was not really fixing the problem I removed the functions: try_unify_query_var_values, try_instantiate_canonical_state and InspectCandidate::try_instantiate_nested_goals. I also reverted the diagnostic callers in derive_errors.rs to use instantiate_nested_goals again.

For my new implementation, in the rustc_trait_selection  I changed fresh_var_for_kind_with_span to fresh_var_for_kind_in_universe which now uses next_region_var_in_universe, next_ty_var_in_universe and next_const_var_in_universe. Also in instantiate_canonical_state it is now ensured that infcx has all the universes up to prev_universe + state.max_universe, and creates any replay-only fresh vars in that explicit universe.  I also split out compute_query_response_instantiation_values_in_universe so replay uses that same explicit base universe when reconstructing placeholders from the canonical state.

Signed-off-by: Vicente Gusmão <[email protected]>
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 28, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-trait-system-refactor The Rustc Trait System Refactor Initiative (-Znext-solver)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ICE: called Result::unwrap() on an Err value: Mismatch

7 participants