Replace `Box<dyn TraitEngine>` with an enum. by nnethercote · Pull Request #155714 · rust-lang/rust

nnethercote · 2026-04-24T04:35:30Z

`ObligationCtxt` contains a `Box<dyn TraitEngine>` that lets it switch between trait solvers. Unfortunately, many short-lived `ObligationCtxt`s are created and the allocation cost is non-trivial.

This commit introduces an enum `DualFulfillmentCtxt` that can hold an old or new solver context, avoiding the need for the `Box` and making things a bit faster.

The change requires some extra `FromSolverError` bounds in a few places. The commit also removes some unnecessary `E: 'tcx` bounds.
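
For readers unfamiliar with the pattern, here is a minimal sketch of the general idea, not the actual rustc code. `OldSolverCtxt`, `NewSolverCtxt`, `Engine`, `DualCtxt`, and `boxed_engine` are hypothetical stand-ins invented for illustration; the real `TraitEngine` and `DualFulfillmentCtxt` are generic over an error type and carry far more state, and the `FromSolverError` bounds are not modelled here.

```rust
// Hypothetical stand-ins for the old and new solver contexts.
#[derive(Default)]
struct OldSolverCtxt {
    pending: Vec<u32>,
}

#[derive(Default)]
struct NewSolverCtxt {
    pending: Vec<u32>,
}

// Simplified stand-in for the trait both contexts implement.
trait Engine {
    fn register(&mut self, obligation: u32);
    fn pending_count(&self) -> usize;
}

impl Engine for OldSolverCtxt {
    fn register(&mut self, obligation: u32) {
        self.pending.push(obligation);
    }
    fn pending_count(&self) -> usize {
        self.pending.len()
    }
}

impl Engine for NewSolverCtxt {
    fn register(&mut self, obligation: u32) {
        self.pending.push(obligation);
    }
    fn pending_count(&self) -> usize {
        self.pending.len()
    }
}

// Before: the context itself is heap-allocated, and every call goes
// through a vtable.
fn boxed_engine(use_new_solver: bool) -> Box<dyn Engine> {
    if use_new_solver {
        Box::new(NewSolverCtxt::default())
    } else {
        Box::new(OldSolverCtxt::default())
    }
}

// After: the chosen context is stored inline in an enum, so creating one
// needs no `Box`, and dispatch is a `match` on the variant.
enum DualCtxt {
    Old(OldSolverCtxt),
    New(NewSolverCtxt),
}

impl DualCtxt {
    fn new(use_new_solver: bool) -> Self {
        if use_new_solver {
            DualCtxt::New(NewSolverCtxt::default())
        } else {
            DualCtxt::Old(OldSolverCtxt::default())
        }
    }
}

impl Engine for DualCtxt {
    fn register(&mut self, obligation: u32) {
        match self {
            DualCtxt::Old(cx) => cx.register(obligation),
            DualCtxt::New(cx) => cx.register(obligation),
        }
    }
    fn pending_count(&self) -> usize {
        match self {
            DualCtxt::Old(cx) => cx.pending_count(),
            DualCtxt::New(cx) => cx.pending_count(),
        }
    }
}

fn main() {
    // Both forms expose the same interface to callers.
    let mut boxed = boxed_engine(true);
    boxed.register(1);

    let mut dual = DualCtxt::new(true);
    dual.register(1);
    dual.register(2);

    println!("boxed: {}, enum: {}", boxed.pending_count(), dual.pending_count());
}
```

The trade-off in a sketch like this is that the enum is as large as its biggest variant and every method needs a forwarding `match`, in exchange for skipping one allocation per context.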

nnethercote · 2026-04-24T04:35:46Z

@bors try @rust-timer queue

rust-bors · 2026-04-24T06:46:22Z

☀️ Try build successful (CI)
Build commit: 9f7e3b5 (9f7e3b51527a92d1f13294246ced59fe75a572c6, parent: 9836b06b55f5389f605ee7766eeecd9f17a86cb5)

rust-timer · 2026-04-24T07:26:41Z

Finished benchmarking commit (9f7e3b5): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but not advised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.1%	[0.0%, 3.9%]	11
Improvements ✅ (primary)	-0.6%	[-1.4%, -0.2%]	61
Improvements ✅ (secondary)	-0.5%	[-1.1%, -0.0%]	81
All ❌✅ (primary)	-0.6%	[-1.4%, -0.2%]	61

Max RSS (memory usage)

Results (primary -1.7%, secondary 0.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.2%	[1.4%, 7.5%]	4
Improvements ✅ (primary)	-1.7%	[-1.9%, -1.5%]	2
Improvements ✅ (secondary)	-1.5%	[-2.3%, -0.6%]	7
All ❌✅ (primary)	-1.7%	[-1.9%, -1.5%]	2

Cycles

Results (primary -2.1%, secondary 1.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	4.7%	[2.0%, 13.9%]	8
Improvements ✅ (primary)	-2.1%	[-2.1%, -2.1%]	1
Improvements ✅ (secondary)	-3.1%	[-4.4%, -2.5%]	5
All ❌✅ (primary)	-2.1%	[-2.1%, -2.1%]	1

Binary size

This perf run didn't have relevant results for this metric.

Bootstrap: 493.683s -> 512.943s (3.90%)
Artifact size: 394.30 MiB -> 400.34 MiB (1.53%)

nnethercote · 2026-05-01T05:09:50Z

Perf is generally good but I'm concerned about the regressions for bitmaps-3.2.1-new-solver. I couldn't reproduce them with a local build. When I ran the command to measure the CI artifacts, I could reproduce them. Here's part of the Cachegrind diff:

--------------------------------------------------------------------------------
-- Function:file summary
--------------------------------------------------------------------------------
  Ir_________  function:file

> 103,170,414  <rustc_type_ir::fast_reject::DeepRejectCtxt<rustc_middle::ty::context::TyCtxt, false, true>>::args_may_unify_inner:???
> -80,923,214  <rustc_next_trait_solver::solve::eval_ctxt::EvalCtxt<rustc_trait_selection::solve::delegate::SolverDelegate, rustc_middle::ty::context::TyCtxt>>::assemble_impl_candidates::<rustc_type_ir::predicate::TraitPredicate<rustc_middle::ty::context::TyCtxt>>::{closure#0}:???
>  53,040,075  <rustc_middle::ty::context::TyCtxt as rustc_type_ir::interner::Interner>::impl_trait_ref:???
>  49,855,446  <rustc_middle::ty::context::TyCtxt as rustc_type_ir::interner::Interner>::impl_is_default:???
> -35,100,664  <rustc_type_ir::fast_reject::DeepRejectCtxt<rustc_middle::ty::context::TyCtxt, false, true>>::types_may_unify_inner:???
>  13,914,295  <rustc_type_ir::fast_reject::DeepRejectCtxt<rustc_middle::ty::context::TyCtxt, false, true>>::consts_may_unify_inner:???

I'm not 100% sure, but it kinda just looks like changes in inlining, e.g. of some of the `DeepRejectCtxt` methods.

nnethercote · 2026-05-01T05:23:45Z

I rebased. Let's try perf again, just to see if anything has changed...

@bors try @rust-timer queue

rust-bors · 2026-05-01T07:36:48Z

☀️ Try build successful (CI)
Build commit: f2da668 (f2da66887928c6e1f22a040a3b67497121619553, parent: f53b654a8882fd5fc036c4ca7a4ff41ce32497a6)

rust-timer · 2026-05-01T08:17:21Z

Finished benchmarking commit (f2da668): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but not advised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.2%	[0.0%, 3.9%]	10
Improvements ✅ (primary)	-0.6%	[-1.6%, -0.2%]	67
Improvements ✅ (secondary)	-0.5%	[-1.0%, -0.0%]	86
All ❌✅ (primary)	-0.6%	[-1.6%, -0.2%]	67

Max RSS (memory usage)

Results (primary 1.6%, secondary 1.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.6%	[0.8%, 2.5%]	5
Regressions ❌ (secondary)	2.3%	[0.9%, 4.4%]	11
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.9%	[-3.0%, -2.8%]	2
All ❌✅ (primary)	1.6%	[0.8%, 2.5%]	5

Cycles

Results (secondary -1.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.1%	[1.5%, 5.1%]	4
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-8.1%	[-10.3%, -4.3%]	3
All ❌✅ (primary)	-	-	0

Binary size

Results (primary -0.0%, secondary -0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.1%	[0.1%, 0.1%]	1
Improvements ✅ (primary)	-0.0%	[-0.0%, -0.0%]	7
Improvements ✅ (secondary)	-0.1%	[-0.1%, -0.1%]	3
All ❌✅ (primary)	-0.0%	[-0.0%, -0.0%]	7

Bootstrap: 487.787s -> 505.793s (3.69%)
Artifact size: 390.97 MiB -> 397.01 MiB (1.55%)

rust-bors · 2026-05-03T00:27:45Z

☔ The latest upstream changes (presumably #155767) made this pull request unmergeable. Please resolve the merge conflicts.