Do not run mir opts for targets with convergent ops and add convergent attribute #149637
Conversation
These commits modify compiler targets.

r? @nnethercote. rustbot has assigned @nnethercote. Use `r?` to explicitly pick a reviewer.
GPU targets have convergent operations that require careful handling when running optimizations; for example, they must not be duplicated. A typical convergent operation is a barrier/syncthreads. We do not want to deal with convergent operations in MIR optimizations, so set the optimization level to 0 and skip all optimizations. This affects the amdgpu and nvptx targets.
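To make the duplication hazard concrete, here is a hedged sketch of such an operation in Rust. It only builds for the nvptx64 target on nightly, and the `stdarch_nvptx` feature gate and intrinsic path are my assumptions, not anything taken from this PR:

```rust
// Why convergent ops must not be duplicated: a barrier/syncthreads sketch
// for the nvptx64 target (the feature gate and path are assumptions).
#![feature(stdarch_nvptx)]
#![no_std]

use core::arch::nvptx::_syncthreads;

/// `divergent` may differ between threads of the same GPU thread block.
pub unsafe fn example(divergent: bool, x: &mut u32) {
    if divergent {
        *x = x.wrapping_mul(2);
    } else {
        *x = x.wrapping_mul(3);
    }
    // Every thread in the block must reach this *same* barrier. An
    // optimization that duplicated it into the two arms above would leave
    // the two groups of threads waiting at different barriers, and the
    // block could deadlock.
    _syncthreads();
}
```

The barrier is convergent because its semantics depend on which other threads reach the same program point, and that is exactly what code duplication changes.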
On targets with convergent operations, we need to add the convergent attribute to all functions that run convergent operations. Following clang, we can conservatively apply the attribute to all functions when compiling for such a target and rely on LLVM optimizing away the attribute in cases where it is not necessary. This affects the amdgpu and nvptx targets.
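A minimal self-contained model of that strategy is below; every name in it is a stand-in (the real change lives in rustc_codegen_llvm, per the hunks quoted later in this review):

```rust
// Minimal model of the conservative strategy; every name here is a
// stand-in for the rustc_codegen_llvm pieces shown in the review hunks.
#[derive(Clone, Copy, PartialEq, Debug)]
enum FnAttribute {
    Convergent,
}

struct Target {
    is_like_gpu: bool,
}

// "Conservatively apply convergent to all functions": the frontend never
// tries to prove a function convergence-free; LLVM later removes the
// attribute wherever it can show it is unnecessary.
fn default_fn_attributes(target: &Target) -> Vec<FnAttribute> {
    let mut attrs = Vec::new();
    if target.is_like_gpu {
        attrs.push(FnAttribute::Convergent);
    }
    attrs
}

fn main() {
    let nvptx = Target { is_like_gpu: true };
    let x86 = Target { is_like_gpu: false };
    assert_eq!(default_fn_attributes(&nvptx), vec![FnAttribute::Convergent]);
    assert!(default_fn_attributes(&x86).is_empty());
}
```

The conservative default keeps the frontend simple: it never has to prove that a function is convergence-free, because LLVM strips the attribute wherever it can show that.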
Force-pushed from 25d4193 to f17636b.
Hi @nnethercote, I see you moved this to waiting-on-author.
I have been meaning to raise some reservations about this. I feel like this solution is far too aggressive. A lot of MIR opts have nothing to do with convergence; disabling all of them outright feels excessive. The proper fix would be, in my opinion, to only disable the passes that can actually duplicate or move convergent operations, and let each pass declare whether it is safe in that respect (see the is_convergent_safe idea below).
That would pretty much solve the problem while minimizing the impact. The downside is larger code size, and more thinking about convergence for the implementors of those passes. Granted, this is just my opinion.
Apologies, I reviewed this days ago but forgot to hit "submit review" :(
```rust
};

if tcx.sess.target.is_like_gpu {
    // Conservatively apply convergent to all functions
```
The commit message explains that LLVM can subsequently optimize these away, which is useful info to replicate here.
```rust
let mut attrs = SmallVec::<[_; 4]>::new();

if cx.sess().target.is_like_gpu {
    // Conservatively apply convergent to all functions
```
ditto
```rust
if self.target.is_like_gpu {
    // Special care needs to be taken for convergent operations (i.e. not duplicating them).
    // We do not want to handle these, so do not run any optimizations.
    0
```
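For context, an editor's reconstruction of where this hunk plausibly lives: Session::mir_opt_level in rustc_session, with the GPU branch folded in. This is approximate and reconstructed from memory, not copied from the PR; note that an explicit -Zmir-opt-level would still override the default:

```rust
// In rustc_session::Session (approximate shape, not the literal patch).
pub fn mir_opt_level(&self) -> usize {
    self.opts.unstable_opts.mir_opt_level.unwrap_or_else(|| {
        if self.target.is_like_gpu {
            // Convergent operations must not be duplicated; skip all MIR
            // optimizations rather than teach each pass about them.
            0
        } else if self.opts.optimize != OptLevel::No {
            2
        } else {
            1
        }
    })
}
```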
This feels like a very blunt instrument! Seems like things went from "disable JumpThreading where we know problems can occur" to the much broader "disable everything because problems might occur" without much discussion?
@FractalFir had a good suggestion: add a new method is_convergent_safe to MirPass, so we can have more fine-grained control over which passes run. Whether it defaults to returning true or false is an open question; false would be more conservative (e.g. safer when new passes are added) but more verbose, because it is true for many passes.
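A minimal self-contained sketch of what that hook could look like; only the name is_convergent_safe comes from this thread, while the types and the skip logic are purely illustrative:

```rust
// Self-contained sketch; only `is_convergent_safe` comes from the thread.
struct Body; // stand-in for rustc_middle::mir::Body

struct Target {
    is_like_gpu: bool,
}

trait MirPass {
    fn name(&self) -> &'static str;

    /// True if this pass never duplicates or moves convergent operations.
    /// A `false` default is the conservative choice for new passes.
    fn is_convergent_safe(&self) -> bool {
        false
    }

    fn run_pass(&self, body: &mut Body);
}

/// Hypothetical pass that never duplicates statements, so it opts in.
struct ExamplePass;

impl MirPass for ExamplePass {
    fn name(&self) -> &'static str {
        "ExamplePass"
    }
    fn is_convergent_safe(&self) -> bool {
        true
    }
    fn run_pass(&self, _body: &mut Body) {}
}

fn run_passes(target: &Target, body: &mut Body, passes: &[&dyn MirPass]) {
    for pass in passes {
        // On targets with convergent operations, only run passes that
        // have explicitly opted in as convergence-safe.
        if target.is_like_gpu && !pass.is_convergent_safe() {
            println!("skipping {} on a GPU-like target", pass.name());
            continue;
        }
        pass.run_pass(body);
    }
}

fn main() {
    let gpu = Target { is_like_gpu: true };
    run_passes(&gpu, &mut Body, &[&ExamplePass]);
}
```

With a false default, a newly added pass is skipped on GPU-like targets until its author has considered convergence; the cost is boilerplate `true` overrides for the many passes that are already safe.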
@Flakebi: what do you think about this idea?
The amdgpu and nvptx targets are marked as having convergent operations.
Fixes #137086; see that issue for details.
Tracking issue: #135024
cc @RDambrosio016 @kjetilkjeka for nvptx
cc @ZuseZ4