Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@mikedn
Copy link

@mikedn mikedn commented Jun 2, 2018

This changes gtExtractSideEffList to use GenTreeVisitor instead of custom tree traversal logic. The existing code did not handle GT_ARR_ELEM and GT_ARR_OFFSET and did not always preserve side effect ordering.

Additionally, this changes optPrepareTreeForReplacement to use GTF_SIDE_EFFECT instead of GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE to avoid dropping exception side effects and class constructor calls.

More details in commit message and discussion below.

Unfortunately the change to use GenTreeVisitor slows down things a bit - around 0.05% instructions retired. But I don't think such a small regression can justify keeping the custom/duplicate tree traversal logic, especially if it can lead to bugs. Besides, there are plenty of places in the JIT where one can recover 0.05% without compromising code reliability and clarity.

Fixes #18232

@mikedn mikedn force-pushed the fix-side-effect-extract branch from 71b24ce to 6097dc6 Compare June 8, 2018 15:24
@mikedn
Copy link
Author

mikedn commented Jun 11, 2018

@AndyAyersMS Do you happen to know why would assertionprop's optPrepareTreeForReplacement use GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE to extract the replaced tree's side effects? This function is used only during VN based assertion propagation so it's not clear why it cares about CSE.

@AndyAyersMS
Copy link
Member

No, I don't know why -- maybe @briansull or @erozenfeld can chime in.

@briansull
Copy link

Th CSE phase can eliminate some SIDE_EFFECTS because it can guarantee that the SIDE_EFFECT is done once at the def locations and can be skipped at the use locations. Some but not all, side effect are allowed to be eliminated in this way.

The definition of GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE has more details:

// The extra flag GTF_IS_IN_CSE is used to tell the consumer of these flags
// that we are calling in the context of performing a CSE, thus we
// should allow the run-once side effects of running a class constructor.
//
// The only requirement of this flag is that it not overlap any of the
// side-effect flags. The actual bit used is otherwise arbitrary.
#define GTF_IS_IN_CSE GTF_BOOLEAN
#define GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE (GTF_ASG | GTF_CALL | GTF_IS_IN_CSE)

@mikedn
Copy link
Author

mikedn commented Jun 11, 2018

Th CSE phase can eliminate some SIDE_EFFECTS because it can guarantee that the SIDE_EFFECT is done once at the def locations and can be skipped at the use locations. Some but not all, side effect are allowed to be eliminated in this way.

Yes, the CSE is somewhat special in its handling of side effects. But this is about assertion propagation so it's not clear why GTF_IS_IN_CSE is used here and it appears to contradict the comment that says "we are calling in the context of performing a CSE".

@mikedn
Copy link
Author

mikedn commented Jun 11, 2018

For clarity - this is intended to be a fix for #18232

@mikedn
Copy link
Author

mikedn commented Jun 11, 2018

And the same optPrepareTreeForReplacement is also responsible for dropping some volatile reads, probably because it doesn't specify GTF_ORDER_SIDEEFF.

@mikedn mikedn force-pushed the fix-side-effect-extract branch from 6097dc6 to 7b26c4d Compare June 12, 2018 16:30
@mikedn
Copy link
Author

mikedn commented Jun 13, 2018

GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE allows gtExtractSideEffList to ignore .cctor calls. This is valid only if the caller somehow knows that such calls are redundant. CSE does have this knowledge when removing use trees because it is expected that the def tree will also include the necessary .cctor call. No other gtExtractSideEffList caller seem to have this knowledge (including assertion propagation) so that's probably why the flag refers specifically to CSE rather than saying something like "IGNORE_CCTORS".

Now, the main use of optPrepareTreeForReplacement is to drop trees that VN says that are constants. A .cctor call normally carries an exception side effect so one might expect that the containing tree will never end up being a VN constant and thus using GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE makes no difference. But VN doesn't consistently propagate exceptions so it's actually possible to end up with a constant VN on a tree having exception side effects and .cctor calls. So optPrepareTreeForReplacement must not use GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE and at the same time it needs to include GTF_EXCEPT, otherwise we'll end up dropping exceptions.

Somewhat similarly, it is possible to end up with a constant VN even if the tree contains volatile reads, at least in corner cases such as x & 0.

It seems to me that optPrepareTreeForReplacement really needs to use GTF_ALL_EFFECT because nothing guarantees that the tree to be replaced contains only persistent side effects.

@mikedn
Copy link
Author

mikedn commented Jun 13, 2018

It seems to me that optPrepareTreeForReplacement really needs to use GTF_ALL_EFFECT because nothing guarantees that the tree to be replaced contains only persistent side effects.

Which does not actually help with the case of dropped volatile reads because those aren't side effects, at least not according to gtNodeHasSideEffects. So in the end it just needs to be GTF_SIDE_EFFECT. Oh well, side effects are a horror story...

@mikedn mikedn force-pushed the fix-side-effect-extract branch from 7b26c4d to 8657e97 Compare June 13, 2018 18:44
@mikedn
Copy link
Author

mikedn commented Jun 13, 2018

The change to GenTreeVisitor causes 2 similar differences in System.Runtime.Serialization.Formatters:

        mov      r9, r12
-       mov      r10d, dword ptr [r9+8]
        lea      r10d, [rax+rcx]
+       mov      r11d, dword ptr [r9+8]
        movsxd   r10, r10d

This is the result of a range check being eliminated and old gtExtractSideEffList failing to preserve the original side effect ordering.

@mikedn
Copy link
Author

mikedn commented Jun 14, 2018

The change to use GTF_SIDE_EFFECT generates one diff:

        xor      ebp, ebp
        mov      r13d, dword ptr [rsp+74H]
        test     r13d, r13d
        jle      G_M19407_IG19
-       mov      rax, gword ptr [rsp+60H]
-       cmp      dword ptr [rax+8], r13d
+       test     r13d, r13d
        setge    cl
        movzx    rcx, cl
-       test     r13d, r13d
+       mov      eax, dword ptr [r15+8]
+       and      ecx, 1
+       mov      rax, gword ptr [rsp+60H]
+       cmp      dword ptr [rax+8], r13d
        setge    dl
        movzx    rdx, dl
-       and      edx, 1
        and      ecx, edx
        mov      rdx, gword ptr [rsp+68H]
        cmp      dword ptr [rdx+8], r13d

This is somewhere in System.Data.Common - https://github.com/dotnet/corefx/blob/ef03e596a20c87c2793fa31a39cc2bbbb03831dd/src/System.Data.Common/src/System/Data/XMLSchema.cs#L489

The actual change is an extra mov eax, dword ptr [r15+8]. The rest of changes are the result of re-ordering.

The for loop got cloned and then one of conditions got evaluated to true by VN. The lack of GTF_EXCEPT resulted in a GT_ARR_LENGTH node being dropped.

Now, since VN managed to evaluated the condition it probably means that an exception can't ever be thrown, the array reference can't be null. But getting this right requires either VN to sort out its exception propagation (presumably related to #8648) or a way to figure out that GT_ARR_LENGTH can't throw in this case (e.g fgAddrCouldBeNull could perhaps use ValueNumStore::IsKnownNonNull).

@mikedn mikedn changed the title [WIP] Change gtExtractSideEffList to use GenTreeVisitor Change gtExtractSideEffList to use GenTreeVisitor Jun 14, 2018
@jakobbotsch
Copy link
Member

I hadn't seen that this wasn't WIP anymore.
This fixes a whole bunch of cases:
jakobbotsch/Fuzzlyn@4b1fa07

There are some that still throw, such as this one:

// Generated by Fuzzlyn on 2018-06-26 16:32:41
// Seed: 3807292053929899503
// Reduced from 119.1 KiB to 0.2 KiB
// Debug: Runs successfully
// Release: Throws 'System.NullReferenceException'
public class Program
{
    static ushort[, ] s_1 = new ushort[, ]{{0}};
    public static void Main()
    {
        bool vr5 = 0 == ((0 % ((0 & s_1[0, 0]) | 1)) * s_1[0, 0]);
    }
}

Is this a separate issue?

@mikedn
Copy link
Author

mikedn commented Jun 26, 2018

There are some that still throw, such as this one:

Uh oh, it's a different issue but in the same area. This PR basically fixes optPrepareTreeForReplacement (by fixing gtExtractSideEffList) but then it turns out that there are cases where optPrepareTreeForReplacement is misused.

It's really baffling, optVNConstantPropOnTree calls optPrepareTreeForReplacement(tree, tree) and then changes tree to a constant. In this case

  • tree is MOD so itself ends up as a side effect
  • tree is changed to a constant so the entire side effect subtree simply vanishes
  • newTree is a COMMA where both op1 and op2 point to the same constant node

I'm surprised that this doesn't blow up more often and more spectacularly!

@jakobbotsch
Copy link
Member

I have submitted #18648 for that. It's interesting that there are two similar issues in this area.
It required changes in a couple of examples too: jakobbotsch/Fuzzlyn@b5113d2

@mikedn mikedn changed the title Change gtExtractSideEffList to use GenTreeVisitor [WIP] Change gtExtractSideEffList to use GenTreeVisitor Jun 26, 2018
@mikedn
Copy link
Author

mikedn commented Jun 27, 2018

The above discussed issue is a consequence of changing optPrepareTreeForReplacement to use GTF_SIDE_EFFECT instead of GTF_PERSISTENT_SIDE_EFFECTS. Nodes that can end up evaluated to constants can't have persistent side effects (they're not assignments/calls) but they may have exception side effects (GTF_EXCEPT) even if they never throw (it can't evaluate to a constant and throw at the same time).

GTF_SIDE_EFFECT is needed because the children of the constant node may throw exceptions (e.g. s_1[0, 0] can throw IndexOutOfRangeException).

The proper solution is likely to pass ignoreRoot to gtExtractSideEffList so that the constant node is never extracted as a side effect.

@mikedn mikedn force-pushed the fix-side-effect-extract branch 2 times, most recently from 26739e7 to 72fa5fa Compare July 14, 2018 07:51
@mikedn mikedn changed the title [WIP] Change gtExtractSideEffList to use GenTreeVisitor Change gtExtractSideEffList to use GenTreeVisitor Jul 15, 2018
@mikedn
Copy link
Author

mikedn commented Jul 17, 2018

@dotnet/jit-contrib This is good for review, although I'm not very happy with the newTree == oldTree special handling in optPrepareTreeForReplacement. But I can't think of a better solution, other than simply creating new constant nodes instead of attempting to change existing nodes.

@jakobbotsch
Copy link
Member

There are some examples left which could be related:

// Generated by Fuzzlyn on 2018-06-26 03:15:42
// Seed: 6043644693530160524
// Reduced from 196.7 KiB to 0.3 KiB
// Debug: Runs successfully
// Release: Throws 'System.IndexOutOfRangeException'
public class Program
{
    public static void Main()
    {
        var vr16 = new ulong[]{18446744073709551614UL};
        M5(vr16);
    }

    static byte M5(ulong[] arg0)
    {
        int var0 = (ushort)((0 % arg0[0]) | (byte)(-32768 * (int)(0 & arg0[0])));
        System.Console.WriteLine(var0);
        return 0;
    }
}

A lot of examples still throw, but a lot of them are probably #18770.

@mikedn
Copy link
Author

mikedn commented Jul 17, 2018

There are some examples left which could be related:

Hmm, your example works fine with both the master branch and my branch. Am I missing something?

@jakobbotsch
Copy link
Member

jakobbotsch commented Jul 17, 2018

The comparison is done with stack garbage, so it depends on whether the stack slot is zero or not:

; Assembly listing for method Program:M5(ref):ubyte
; Emitting BLENDED_CODE for X64 CPU with AVX
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T00] ( 23, 23   )     ref  ->  rcx         class-hnd
;  V01 OutArgs      [V01    ] (  1,  1   )  lclBlk (32) [rsp+0x00]
;  V02 cse0         [V02,T01] ( 12, 12   )    long  ->  rdx
;  V03 cse1         [V03,T02] (  8,  8   )     int  ->  [rsp+0x24]
;
; Lcl frame size = 40

G_M23191_IG01:
       4883EC28             sub      rsp, 40

G_M23191_IG02:
       837C242400           cmp      dword ptr [rsp+24H], 0 ; that was quick..?
       761D                 jbe      SHORT G_M23191_IG04
       8B4110               mov      eax, dword ptr [rcx+16]
       33C0                 xor      rax, rax
       8B5108               mov      edx, dword ptr [rcx+8]
       33D2                 xor      rdx, rdx
       48F77110             div      rdx:rax, qword ptr [rcx+16]
       0FB7CA               movzx    rcx, dx
       E8DFFBFFFF           call     System.Console:WriteLine(int)
       33C0                 xor      eax, eax

G_M23191_IG03:
       4883C428             add      rsp, 40
       C3                   ret

G_M23191_IG04:
       E8F3D8F25E           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3

; Total bytes of code 46, prolog size 4 for method Program:M5(ref):ubyte
; ============================================================

EDIT: That looks very weird. I'm guessing a different issue?

mikedn added 3 commits July 17, 2018 21:05
The tree traversal logic in gtExtractSideEffList is basically an incomplete duplicate of GenTreeVisitor's traversal logic. It lacks handling for GT_ARR_ELEM and GT_ARR_OFFSET so if this are encountered any side effects their children may have are silently dropped.

In addition, side effect ordering was not always preserved. The comma list is built by prepending nodes to it so side effects have to be visited in reverse execution order. The old code did this only for simple opers, for special nodes such as GT_ARR_BOUNDS_CHECK normal execution order was used.

This actually complicates a bit the GenTreeVisitor implementation as side effects need to be first stored in an array. The number of side effects is usually small (<= 4) so this shouldn't be a problem.
Assertion propagation doesn't have any way to ensure that it is safe to remove class constructor calls so GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE should not be used here.

Likewise, GTF_EXCEPT side effects must be preserved as well. VN doesn't always propagate exceptions so it's possible to end up with a tree that has a constant VN and contains exception side effects.
@mikedn
Copy link
Author

mikedn commented Jul 17, 2018

The comparison is done with stack garbage, so it depends on whether the stack slot is zero or not:

Right, missed that. I'll take a look, right now what I can tell for sure is that it doesn't appear to be related to this fix. The same code is generated before and after this change.

P.S.

Yep, this just another case of use-def re-ordering. It may be worth pointing out that this re-ordering also results in the elimination of the def as a dead store. That's why there's only a single reference to [rsp+24h] in the entire code.

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM with one question, but I'd like to see a review from @AndyAyersMS or @briansull

ArrayStack<GenTree*> m_sideEffects;

if (flags & GTF_EXCEPT)
enum

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this enum for?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's part of the GenTreeVisitor "contract", it enables its various features.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid possible naming conflicts we should probably rename all these "enums" to some that is unlikely to have a collision, such as: GTV_DoPreOrder, GTV_UseExexcutionOrder, but that change is out of scope here.

// DIV/MOD node to a constant and the node may still have GTF_EXCEPT set,
// even if it does not actually throw any exceptions).
assert(!gtNodeHasSideEffects(oldTree, GTF_PERSISTENT_SIDE_EFFECTS));
ignoreRoot = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to update GTF_EXCEPT flags once value numbering knows a tree is a constant but we currently don't have a convenient place to do that.
Can you change the assert to:

assert(!gtNodeHasSideEffects(oldTree, GTF_EXCEPT) || vnStore->IsVNConstant(oldTree->gtVNPair.GetConservative()));

?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to update GTF_EXCEPT flags once value numbering knows a tree is a constant but we currently don't have a convenient place to do that.

Yeah, I've tried something like that but things didn't went too well (e.g. OperMayThrow doesn't pay attention to VNs). Besides, I don't know if VN can be trusted with propagating exceptions, sometimes it packs constants with exceptions and sometimes it does not.

Can you change the assert to:

I'll try, that looks like a reasonable addition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a typo in my suggestion. I meant replacing your assert with

assert(!gtNodeHasSideEffects(oldTree, GTF_SIDE_EFFECT) || vnStore->IsVNConstant(oldTree->gtVNPair.GetConservative())); 

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a separate assert for the GTF_EXCEPT | constant case, mostly because it was my addition of GTF_EXCEPT that triggered the need for this special handling. Considering the way the old code was working it seems that it is not possible for nodes with persistent side effects to become constants.


// These have GTF_ASG but for some reason gtNodeHasSideEffects ignores them.
// Anyway, they must be preserved, no matter what side effect flags are passed in.
if (node->OperIs(GT_LOCKADD, GT_XADD, GT_XCHG, GT_CMPXCHG))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use node->OperRequiresAsgFlag() here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simply preserved the original logic here, I need to investigate a bit more to see if such a change is correct.

It's likely the other way around - gtNodeHasSideEffects should use OperRequiresAsgFlag() instead of OperIsAssignment(). It does look like there may be some bugs lurking in here:

  • HW intrinsics require the ASG flags but themselves are not assignments
  • GT_MEMORYBARRIER seems to be ignored

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that thee four shoudl be considered as assignments as they do in fact write to memory (atomically):

GT_LOCKADD, GT_XADD, GT_XCHG, GT_CMPXCHG

MemoryBarrier doesn't write to memory so it might not need to be mark with an assignment, but it does need to always be preserved. It may simplify the implementation if it is also considered an atomic memory operations like the others.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made a compromise and replaced OperIs(GT_LOCKADD...) with OperIsAtomicOp() since it already exists and it's used in a number of places. At the same time, I've added TODO-Cleanup comments about the whole OperRequiresAsgFlag situation.

MemoryBarrier doesn't write to memory so it might not need to be mark with an assignment, but it does need to always be preserved. It may simplify the implementation if it is also considered an atomic memory operations like the others.

Yep. And I wonder if we shouldn't also treat volatile indirs as assignment side effects as well, they're after all half memory barriers and currently they are dropped - #18748

}
if (op2)

if (m_compiler->gtNodeHasSideEffects(node, m_flags))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For GTF_ASG and GTF_EXCEPT, gtNodeSideEffects only checks the node, not its children; however, for some reason for GTF_CALL it also checks the arguments. So, here we may add a call node without side effects instead of just adding the arguments with side effects.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, saw that but then my change simply preserves the original logic, or at least that's the intent. The original code traversed calls just like GenTreeVisitor does now. Am I missing something here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your change preserves the original logic. I just added a note for future clean-up.

@mikedn mikedn force-pushed the fix-side-effect-extract branch from 72fa5fa to b96745e Compare July 18, 2018 17:20
Copy link
Member

@erozenfeld erozenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good.
Thanks for fixing this area.

// memory stores. Atomic ops have special handling in gtExtractSideEffList but the others
// will simply be dropped is they are ever subject to an "extract side effects" operation.
// It is possible that the reason no bugs have yet been observed in this area is that the
// other nodes are likely to always be tree roots.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should add tree->OperIsAtomicOp() and GT_MEMORYBARRIER to the return true case.

This can be done in a follow up, if it desired that this be a minimal diff PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll leave this for another PR. It would be good to investigate moving the atomic check from gtExtractSideEffList to gtNodeHasSideEffects as well and that can be done at the same time.

ArrayStack<GenTree*> m_sideEffects;

if (flags & GTF_EXCEPT)
enum

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid possible naming conflicts we should probably rename all these "enums" to some that is unlikely to have a collision, such as: GTV_DoPreOrder, GTV_UseExexcutionOrder, but that change is out of scope here.

@erozenfeld erozenfeld merged commit 1f28125 into dotnet:master Jul 23, 2018
jashook pushed a commit to jashook/coreclr that referenced this pull request Aug 14, 2018
* Change gtExtractSideEffList to use GenTreeVisitor

The tree traversal logic in gtExtractSideEffList is basically an incomplete duplicate of GenTreeVisitor's traversal logic. It lacks handling for GT_ARR_ELEM and GT_ARR_OFFSET so if this are encountered any side effects their children may have are silently dropped.

In addition, side effect ordering was not always preserved. The comma list is built by prepending nodes to it so side effects have to be visited in reverse execution order. The old code did this only for simple opers, for special nodes such as GT_ARR_BOUNDS_CHECK normal execution order was used.

This actually complicates a bit the GenTreeVisitor implementation as side effects need to be first stored in an array. The number of side effects is usually small (<= 4) so this shouldn't be a problem.

* Use GTF_SIDE_EFFECT in optPrepareTreeForReplacement

Assertion propagation doesn't have any way to ensure that it is safe to remove class constructor calls so GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE should not be used here.

Likewise, GTF_EXCEPT side effects must be preserved as well. VN doesn't always propagate exceptions so it's possible to end up with a tree that has a constant VN and contains exception side effects.

* Add tests for 18232

* CR feedback
@mikedn mikedn deleted the fix-side-effect-extract branch September 28, 2019 19:11
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants