-
-
Notifications
You must be signed in to change notification settings - Fork 14.7k
Missing caching for HRTB projection equality bounds (for<'x> T: Trait<'x, Assoc = ...>). #99188
Description
Originally reduced from a production application using the tower crate - see @fasterthanlime's reduced repro repo for more background (though note that it exhibits several other failure modes as well)
trait Trait<'a> {
type A;
type B;
type C;
type D;
fn method() {}
}
impl<T> Trait<'_> for &'_ T
where
for<'x> T: Trait<'x, A = (), B = (), C = (), D = ()>,
{
type A = ();
type B = ();
type C = ();
type D = ();
}
impl Trait<'_> for () {
type A = ();
type B = ();
type C = ();
type D = ();
}
pub fn main() {
#[cfg(depth = "7")]
<&&&&&&&() as Trait>::method();
#[cfg(depth = "8")]
<&&&&&&&&() as Trait>::method();
#[cfg(depth = "9")]
<&&&&&&&&&() as Trait>::method();
#[cfg(depth = "10")]
<&&&&&&&&&&() as Trait>::method();
}The above example currently takes an exponential amount of time to compile, based on the type depth:
$ curl -O https://gist.githubusercontent.com/eddyb/cd4221f14fff265280d135ddce5c9712/raw/a17cbf00af1894756d2b1bdd2e838cdd4db2bbe2/proj-exp.rs
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "7"'
took 0.74s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "8"'
took 2.98s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "9"'
took 11.92s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "10"'
took 51.10sWith every extra type layer, the time increases by ~4x, and that aligns well with there being 4 associated types.
While this example is a bit silly, it doesn't take more than two associated types (both constrained at once) to cause the issue (although at higher depth or with a larger constant factor from having additional bounds).
And I'm guessing it might be one of the main causes for applications built with the tower crate to experience long compile times (its main trait, Service, has Response and Error as two associated types), in large part because that's the kind of codebase this was originally reduced from (as per the note at the top of this issue).
The reason I suspect caching is a combination of factors:
- instrumenting
evaluate_predicate_recursivelyfound an exponential ramp in terms of duplicates (i.e. the number of times each unique obligation shows up), and many of them wereProjectionPredicates- this was done via
-Z self-profile, thoughRUSTC_LOGmight also work (and not require compiler changes)
- this was done via
- many of those
ProjectionPredicates' had a bound lifetime listed in theirBinder - attempting to create a
ProjectionCacheKeyreturnsNoneiff there are bound variables- the comment is a bit worrying, given that it e.g. talks about "escaping regions" but checks for bounds ones
- (as an aside,
ProjectionCacheKeynot holding aParamEnvis probably risky long-term) - I've talked to @nikomatsakis and we're unsure if this is a historical artifact - my best guess is that it might have something to do with the fact that normalizing under binders didn't use to work (maybe a micro-opt to not even bother trying to cache those cases? were they that common?)
- replacing the
for<'x> T: Trait<'x, ...withT: Trait<'static, ...removes the exponential slowdown
So the next step was to to try always caching (warning: this is actually an unsound quick hack, ProjectionCacheKey should be modified to carry a Binder<ProjectionTy> instead of a ProjectionTy):
diff --git a/compiler/rustc_trait_selection/src/traits/project.rs b/compiler/rustc_trait_selection/src/traits/project.rs
index b3e7fbb3578..5d5f2f67842 100644
--- a/compiler/rustc_trait_selection/src/traits/project.rs
+++ b/compiler/rustc_trait_selection/src/traits/project.rs
@@ -2123,7 +2123,7 @@ fn from_poly_projection_predicate(
let infcx = selcx.infcx();
// We don't do cross-snapshot caching of obligations with escaping regions,
// so there's no cache key to use
- predicate.no_bound_vars().map(|predicate| {
+ Some(predicate.skip_binder()).map(|predicate| {
ProjectionCacheKey::new(
// We don't attempt to match up with a specific type-variable state
// from a specific call to `opt_normalize_projection_type` - ifHowever, the above hack does not appear to help and we're unsure why - the main other kind of Predicate that evaluate_predicate_recursively appears to process is TraitPredicate, which IIUC is almost always locally cached?
(EDIT: turns out that the caching is more complex than initially assumed and requires further changes - see #99188 (comment) for a success)
Then again, there are other reductions that don't involve ProjectionPredicate and still cause their own exponential curves, so there might be several different caching issues.
I primarily wanted to open this issue on its own because there's definitely something weird with ProjectionCacheKey (even if a proper fix might be way more involved than just it), and there's a clear correlation between the number of associated types (being constrained) and the exponential curve.
cc @rust-lang/types