-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Change gtExtractSideEffList to use GenTreeVisitor #18257
Conversation
71b24ce to
6097dc6
Compare
|
@AndyAyersMS Do you happen to know why would assertionprop's |
|
No, I don't know why -- maybe @briansull or @erozenfeld can chime in. |
|
Th CSE phase can eliminate some SIDE_EFFECTS because it can guarantee that the SIDE_EFFECT is done once at the def locations and can be skipped at the use locations. Some but not all, side effect are allowed to be eliminated in this way. The definition of GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE has more details: |
Yes, the CSE is somewhat special in its handling of side effects. But this is about assertion propagation so it's not clear why |
|
For clarity - this is intended to be a fix for #18232 |
|
And the same |
6097dc6 to
7b26c4d
Compare
|
Now, the main use of Somewhat similarly, it is possible to end up with a constant VN even if the tree contains volatile reads, at least in corner cases such as It seems to me that |
Which does not actually help with the case of dropped volatile reads because those aren't side effects, at least not according to |
7b26c4d to
8657e97
Compare
|
The change to mov r9, r12
- mov r10d, dword ptr [r9+8]
lea r10d, [rax+rcx]
+ mov r11d, dword ptr [r9+8]
movsxd r10, r10dThis is the result of a range check being eliminated and old |
|
The change to use GTF_SIDE_EFFECT generates one diff: xor ebp, ebp
mov r13d, dword ptr [rsp+74H]
test r13d, r13d
jle G_M19407_IG19
- mov rax, gword ptr [rsp+60H]
- cmp dword ptr [rax+8], r13d
+ test r13d, r13d
setge cl
movzx rcx, cl
- test r13d, r13d
+ mov eax, dword ptr [r15+8]
+ and ecx, 1
+ mov rax, gword ptr [rsp+60H]
+ cmp dword ptr [rax+8], r13d
setge dl
movzx rdx, dl
- and edx, 1
and ecx, edx
mov rdx, gword ptr [rsp+68H]
cmp dword ptr [rdx+8], r13dThis is somewhere in System.Data.Common - https://github.com/dotnet/corefx/blob/ef03e596a20c87c2793fa31a39cc2bbbb03831dd/src/System.Data.Common/src/System/Data/XMLSchema.cs#L489 The actual change is an extra The Now, since VN managed to evaluated the condition it probably means that an exception can't ever be thrown, the array reference can't be null. But getting this right requires either VN to sort out its exception propagation (presumably related to #8648) or a way to figure out that |
|
I hadn't seen that this wasn't WIP anymore. There are some that still throw, such as this one: // Generated by Fuzzlyn on 2018-06-26 16:32:41
// Seed: 3807292053929899503
// Reduced from 119.1 KiB to 0.2 KiB
// Debug: Runs successfully
// Release: Throws 'System.NullReferenceException'
public class Program
{
static ushort[, ] s_1 = new ushort[, ]{{0}};
public static void Main()
{
bool vr5 = 0 == ((0 % ((0 & s_1[0, 0]) | 1)) * s_1[0, 0]);
}
}Is this a separate issue? |
Uh oh, it's a different issue but in the same area. This PR basically fixes It's really baffling,
I'm surprised that this doesn't blow up more often and more spectacularly! |
|
I have submitted #18648 for that. It's interesting that there are two similar issues in this area. |
|
The above discussed issue is a consequence of changing
The proper solution is likely to pass |
26739e7 to
72fa5fa
Compare
|
@dotnet/jit-contrib This is good for review, although I'm not very happy with the |
|
There are some examples left which could be related: // Generated by Fuzzlyn on 2018-06-26 03:15:42
// Seed: 6043644693530160524
// Reduced from 196.7 KiB to 0.3 KiB
// Debug: Runs successfully
// Release: Throws 'System.IndexOutOfRangeException'
public class Program
{
public static void Main()
{
var vr16 = new ulong[]{18446744073709551614UL};
M5(vr16);
}
static byte M5(ulong[] arg0)
{
int var0 = (ushort)((0 % arg0[0]) | (byte)(-32768 * (int)(0 & arg0[0])));
System.Console.WriteLine(var0);
return 0;
}
}A lot of examples still throw, but a lot of them are probably #18770. |
Hmm, your example works fine with both the master branch and my branch. Am I missing something? |
|
The comparison is done with stack garbage, so it depends on whether the stack slot is zero or not: ; Assembly listing for method Program:M5(ref):ubyte
; Emitting BLENDED_CODE for X64 CPU with AVX
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
; V00 arg0 [V00,T00] ( 23, 23 ) ref -> rcx class-hnd
; V01 OutArgs [V01 ] ( 1, 1 ) lclBlk (32) [rsp+0x00]
; V02 cse0 [V02,T01] ( 12, 12 ) long -> rdx
; V03 cse1 [V03,T02] ( 8, 8 ) int -> [rsp+0x24]
;
; Lcl frame size = 40
G_M23191_IG01:
4883EC28 sub rsp, 40
G_M23191_IG02:
837C242400 cmp dword ptr [rsp+24H], 0 ; that was quick..?
761D jbe SHORT G_M23191_IG04
8B4110 mov eax, dword ptr [rcx+16]
33C0 xor rax, rax
8B5108 mov edx, dword ptr [rcx+8]
33D2 xor rdx, rdx
48F77110 div rdx:rax, qword ptr [rcx+16]
0FB7CA movzx rcx, dx
E8DFFBFFFF call System.Console:WriteLine(int)
33C0 xor eax, eax
G_M23191_IG03:
4883C428 add rsp, 40
C3 ret
G_M23191_IG04:
E8F3D8F25E call CORINFO_HELP_RNGCHKFAIL
CC int3
; Total bytes of code 46, prolog size 4 for method Program:M5(ref):ubyte
; ============================================================EDIT: That looks very weird. I'm guessing a different issue? |
The tree traversal logic in gtExtractSideEffList is basically an incomplete duplicate of GenTreeVisitor's traversal logic. It lacks handling for GT_ARR_ELEM and GT_ARR_OFFSET so if this are encountered any side effects their children may have are silently dropped. In addition, side effect ordering was not always preserved. The comma list is built by prepending nodes to it so side effects have to be visited in reverse execution order. The old code did this only for simple opers, for special nodes such as GT_ARR_BOUNDS_CHECK normal execution order was used. This actually complicates a bit the GenTreeVisitor implementation as side effects need to be first stored in an array. The number of side effects is usually small (<= 4) so this shouldn't be a problem.
Assertion propagation doesn't have any way to ensure that it is safe to remove class constructor calls so GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE should not be used here. Likewise, GTF_EXCEPT side effects must be preserved as well. VN doesn't always propagate exceptions so it's possible to end up with a tree that has a constant VN and contains exception side effects.
Right, missed that. I'll take a look, right now what I can tell for sure is that it doesn't appear to be related to this fix. The same code is generated before and after this change. P.S. Yep, this just another case of use-def re-ordering. It may be worth pointing out that this re-ordering also results in the elimination of the def as a dead store. That's why there's only a single reference to |
CarolEidt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM with one question, but I'd like to see a review from @AndyAyersMS or @briansull
| ArrayStack<GenTree*> m_sideEffects; | ||
|
|
||
| if (flags & GTF_EXCEPT) | ||
| enum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this enum for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's part of the GenTreeVisitor "contract", it enables its various features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid possible naming conflicts we should probably rename all these "enums" to some that is unlikely to have a collision, such as: GTV_DoPreOrder, GTV_UseExexcutionOrder, but that change is out of scope here.
| // DIV/MOD node to a constant and the node may still have GTF_EXCEPT set, | ||
| // even if it does not actually throw any exceptions). | ||
| assert(!gtNodeHasSideEffects(oldTree, GTF_PERSISTENT_SIDE_EFFECTS)); | ||
| ignoreRoot = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to update GTF_EXCEPT flags once value numbering knows a tree is a constant but we currently don't have a convenient place to do that.
Can you change the assert to:
assert(!gtNodeHasSideEffects(oldTree, GTF_EXCEPT) || vnStore->IsVNConstant(oldTree->gtVNPair.GetConservative()));
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to update GTF_EXCEPT flags once value numbering knows a tree is a constant but we currently don't have a convenient place to do that.
Yeah, I've tried something like that but things didn't went too well (e.g. OperMayThrow doesn't pay attention to VNs). Besides, I don't know if VN can be trusted with propagating exceptions, sometimes it packs constants with exceptions and sometimes it does not.
Can you change the assert to:
I'll try, that looks like a reasonable addition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a typo in my suggestion. I meant replacing your assert with
assert(!gtNodeHasSideEffects(oldTree, GTF_SIDE_EFFECT) || vnStore->IsVNConstant(oldTree->gtVNPair.GetConservative()));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a separate assert for the GTF_EXCEPT | constant case, mostly because it was my addition of GTF_EXCEPT that triggered the need for this special handling. Considering the way the old code was working it seems that it is not possible for nodes with persistent side effects to become constants.
src/jit/gentree.cpp
Outdated
|
|
||
| // These have GTF_ASG but for some reason gtNodeHasSideEffects ignores them. | ||
| // Anyway, they must be preserved, no matter what side effect flags are passed in. | ||
| if (node->OperIs(GT_LOCKADD, GT_XADD, GT_XCHG, GT_CMPXCHG)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use node->OperRequiresAsgFlag() here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I simply preserved the original logic here, I need to investigate a bit more to see if such a change is correct.
It's likely the other way around - gtNodeHasSideEffects should use OperRequiresAsgFlag() instead of OperIsAssignment(). It does look like there may be some bugs lurking in here:
- HW intrinsics require the ASG flags but themselves are not assignments
GT_MEMORYBARRIERseems to be ignored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that thee four shoudl be considered as assignments as they do in fact write to memory (atomically):
GT_LOCKADD, GT_XADD, GT_XCHG, GT_CMPXCHG
MemoryBarrier doesn't write to memory so it might not need to be mark with an assignment, but it does need to always be preserved. It may simplify the implementation if it is also considered an atomic memory operations like the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made a compromise and replaced OperIs(GT_LOCKADD...) with OperIsAtomicOp() since it already exists and it's used in a number of places. At the same time, I've added TODO-Cleanup comments about the whole OperRequiresAsgFlag situation.
MemoryBarrier doesn't write to memory so it might not need to be mark with an assignment, but it does need to always be preserved. It may simplify the implementation if it is also considered an atomic memory operations like the others.
Yep. And I wonder if we shouldn't also treat volatile indirs as assignment side effects as well, they're after all half memory barriers and currently they are dropped - #18748
| } | ||
| if (op2) | ||
|
|
||
| if (m_compiler->gtNodeHasSideEffects(node, m_flags)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For GTF_ASG and GTF_EXCEPT, gtNodeSideEffects only checks the node, not its children; however, for some reason for GTF_CALL it also checks the arguments. So, here we may add a call node without side effects instead of just adding the arguments with side effects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, saw that but then my change simply preserves the original logic, or at least that's the intent. The original code traversed calls just like GenTreeVisitor does now. Am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, your change preserves the original logic. I just added a note for future clean-up.
72fa5fa to
b96745e
Compare
erozenfeld
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
briansull
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks Good.
Thanks for fixing this area.
| // memory stores. Atomic ops have special handling in gtExtractSideEffList but the others | ||
| // will simply be dropped is they are ever subject to an "extract side effects" operation. | ||
| // It is possible that the reason no bugs have yet been observed in this area is that the | ||
| // other nodes are likely to always be tree roots. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should add tree->OperIsAtomicOp() and GT_MEMORYBARRIER to the return true case.
This can be done in a follow up, if it desired that this be a minimal diff PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll leave this for another PR. It would be good to investigate moving the atomic check from gtExtractSideEffList to gtNodeHasSideEffects as well and that can be done at the same time.
| ArrayStack<GenTree*> m_sideEffects; | ||
|
|
||
| if (flags & GTF_EXCEPT) | ||
| enum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid possible naming conflicts we should probably rename all these "enums" to some that is unlikely to have a collision, such as: GTV_DoPreOrder, GTV_UseExexcutionOrder, but that change is out of scope here.
* Change gtExtractSideEffList to use GenTreeVisitor The tree traversal logic in gtExtractSideEffList is basically an incomplete duplicate of GenTreeVisitor's traversal logic. It lacks handling for GT_ARR_ELEM and GT_ARR_OFFSET so if this are encountered any side effects their children may have are silently dropped. In addition, side effect ordering was not always preserved. The comma list is built by prepending nodes to it so side effects have to be visited in reverse execution order. The old code did this only for simple opers, for special nodes such as GT_ARR_BOUNDS_CHECK normal execution order was used. This actually complicates a bit the GenTreeVisitor implementation as side effects need to be first stored in an array. The number of side effects is usually small (<= 4) so this shouldn't be a problem. * Use GTF_SIDE_EFFECT in optPrepareTreeForReplacement Assertion propagation doesn't have any way to ensure that it is safe to remove class constructor calls so GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE should not be used here. Likewise, GTF_EXCEPT side effects must be preserved as well. VN doesn't always propagate exceptions so it's possible to end up with a tree that has a constant VN and contains exception side effects. * Add tests for 18232 * CR feedback
This changes
gtExtractSideEffListto useGenTreeVisitorinstead of custom tree traversal logic. The existing code did not handleGT_ARR_ELEMandGT_ARR_OFFSETand did not always preserve side effect ordering.Additionally, this changes
optPrepareTreeForReplacementto useGTF_SIDE_EFFECTinstead ofGTF_PERSISTENT_SIDE_EFFECTS_IN_CSEto avoid dropping exception side effects and class constructor calls.More details in commit message and discussion below.
Unfortunately the change to use
GenTreeVisitorslows down things a bit - around 0.05% instructions retired. But I don't think such a small regression can justify keeping the custom/duplicate tree traversal logic, especially if it can lead to bugs. Besides, there are plenty of places in the JIT where one can recover 0.05% without compromising code reliability and clarity.Fixes #18232