JIT: Move constant vectors to op2 from op1 #67502

EgorBo · 2022-04-03T14:50:41Z

A quick fix for #67500

bool Test(Vector<ushort> v) => Vector<ushort>.Zero.Equals(v);

codegen diff:

; Assembly listing for method Tests:Test(System.Numerics.Vector`1[UInt16]):bool:this
G_M10888_IG01:  
       C5F877               vzeroupper 
G_M10888_IG02:   
-      C5FC57C0             vxorps   ymm0, ymm0, ymm0
-      C5FDEF02             vpxor    ymm0, ymm0, ymmword ptr[rdx]
+      C5FD1002             vmovupd  ymm0, ymmword ptr[rdx]
       C4E27D17C0           vptest   ymm0, ymm0
       0F94C0               sete     al
       0FB6C0               movzx    rax, al
G_M10888_IG03:
       C5F877               vzeroupper 
       C3                   ret     
-; Total bytes of code 26
+; Total bytes of code 22

Diffs

ghost · 2022-04-03T14:50:50Z

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

A quick fix for #67500

bool Test(Vector<ushort> v) => Vector<ushort>.Zero.Equals(v);

codegen diff:

; Assembly listing for method Tests:Test(System.Numerics.Vector`1[UInt16]):bool:this
G_M10888_IG01:  
       C5F877               vzeroupper 
G_M10888_IG02:   
-      C5FC57C0             vxorps   ymm0, ymm0, ymm0
-      C5FDEF02             vpxor    ymm0, ymm0, ymmword ptr[rdx]
+      C5FD1002             vmovupd  ymm0, ymmword ptr[rdx]
       C4E27D17C0           vptest   ymm0, ymm0
       0F94C0               sete     al
       0FB6C0               movzx    rax, al
G_M10888_IG03:
       C5F877               vzeroupper 
       C3                   ret     
-; Total bytes of code 26
+; Total bytes of code 22

Author:	EgorBo
Assignees:	EgorBo
Labels:	`area-CodeGen-coreclr`
Milestone:	-

tannergooding · 2022-04-03T15:25:38Z

src/coreclr/jit/morph.cpp

+        // Move constant vectors from op1 to op2 for commutative and compare operations
+        // For now we only do it for zero vector


This feels like something we really should just be handling in lowering when doing the relevant checks...

Why is it needed in morph instead?

Morph is the place where we perform canonicalizations like this one. It benefits the rest of the compiler (e. g. VN).

This feels like something we really should just be handling in lowering when doing the relevant checks...

Why is it needed in morph instead?

I tried that first but it was more verbose (LIR changes are always quite verbose) and yeah, just like @SingleAccretion mentioned it's a sort of an canonization that we also do for scalars in morph already

I tried that first but it was more verbose (LIR changes are always quite verbose) and yeah, just like @SingleAccretion mentioned it's a sort of an canonization that we also do for scalars in morph already

I'm not sure why it would be more verbose? I would expect we just add the relevant bit right here: https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/lowerxarch.cpp#L1243

That is, today we are only checking op2. We should just need to insert if (op1->IsIntegralConstVector(0)) { std::swap(op1, op2); }

just like @SingleAccretion mentioned it's a sort of an canonization that we also do for scalars in morph already

I understand why we might want to do some canonization in morph. However, this is going to miss opts and other transforms that might happen later.

We really should have something in lowering that tries to handle it as well, particularly where its trivial to do so. And particularly since lowering already needs to be concerned with commutativity, with swapping operands where appropriate, etc

However, this is going to miss opts and other transforms that might happen later.

I think until we have cases that show this, we should not be duplicating code between morph and lower.

Morph is already called from pretty much every optimization phase that can introduce new constants into the IR (currently for vectors that can only happen in global assertion propagation).

morph is the transform pass for HIR. lowering is basically the equivalent for LIR and does transforms necessary or appropriate for lsra/codegen.

It will itself introduce new constant nodes, swap operands, do containment checks, etc.

I'm not saying everything needs to do this. Just that both morph and lowering should be and we are likely going to be missing optimizations without it. At the very least, if its not also done in lowering, an assert should exist in lowering to help validate that people don't accidentally introduce Zero as the LHS elsewhere or in the future.

I'm not saying everything needs to do this. Just that both morph and lowering should be and we are likely going to be missing optimizations without it

Lowering is "just another" downstream phase. There is nothing special about it that does not apply to other post-morph phases. For example, the optimization phases are potentially much more impactful to what the end code will look like, and they rely on canonicalization just fine. There is no reason lowering should not.

If we see that optimizations are missed in some cases, we should ensure whatever is creating/transforming IR does that in a canonicality-friendly fashion.

I would be ok with an assert, however, we should be mindful of the fact that such asserts tend to fail under stress where we want non-canonical IR to exist and be tested.

Lowering is "just another" downstream phase

Lowering is, by definition, a transform phase that deals with converting high level IR into the low level IR needed for generation.

As part of this, it has to deal with swapping operands and dealing with various IR concepts that almost no other phase sees. It's likewise the last phase before register allocation at which point the only thing left we can do is minor peephole optimizations, so nothing else runs after it really.

Its not just another phase IMO, its the place that codegen optimizations like this happen traditionally speaking.

Morph will never be able to get them all. It runs far too early and there is no guarantee canonicalization actually occurs.

We've had a ton of good codegen improvements in recent releases specifically because lowering is doing them and catching them. I expect that we'd end up regressing things if we removed those opts and tried moving them up to morph instead.

Morph will never be able to get them all.

This argument can apply to any phase, all of them are potentially "missing out".

I am saying that's a problem for which the solution is not to teach each and every phase to work with each and every IR shape, but to ensure they all see a single one, unless not possible.

Special-casing lowering is not needed unless it is un-canonicalizing the IR, which it does not do in this case.

Morph will never be able to get them all.

True. But we're not after perfection, we should just ensure the cases we care about work as expected. For example: we do not care if a stress mode makes us swap operands backwards and now PTEST isn't recognized.

Practically, for this change, I do not see a possibility that we'd miss canonicalization, for the reason I mentioned above (the only phase that can discover new vector constants after morph is global assertion propagation and that calls morph).

SingleAccretion · 2022-04-03T15:22:40Z

src/coreclr/jit/morph.cpp

    {
        GenTreeHWIntrinsic* hw = multiOp->AsHWIntrinsic();
+
+        // Move constant vectors from op1 to op2 for commutative and compare operations


It would be better to move this inside fgOptimizeHWIntrinsic (along with the XOR transform).

src/coreclr/jit/morph.cpp

…md-ops

EgorBo · 2022-04-13T21:49:56Z

@tannergooding can we merge it as is for now - it's 4 lines change and it has some diffs already 🙂

tannergooding · 2022-04-13T22:00:25Z

I don't see any blockers. I just still think that we should be also handling this in lowering (and that could be handled/investigated in a separate PR).

tannergooding · 2022-05-19T15:11:56Z

Just realized the reason this couldn't be merged is because no one had ever given the green check. Didn't mean to block this with my feedback.

This should probably rerun CI before getting merged

…md-ops

move const vectors to op2

e14f3a5

ghost assigned EgorBo Apr 3, 2022

ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 3, 2022

tannergooding reviewed Apr 3, 2022

View reviewed changes

SingleAccretion reviewed Apr 3, 2022

View reviewed changes

EgorBo added 3 commits April 3, 2022 22:11

Address feedback

363132c

Merge branch 'main' of https://github.com/dotnet/runtime into swap-si…

34c737e

…md-ops

fix build

1b894c5

gfoidl mentioned this pull request Apr 4, 2022

Vector<T>.Zero.Equals(vec) produces less ideal code than vec == Vector<T>.Zero #67500

Closed

tannergooding approved these changes May 19, 2022

View reviewed changes

Merge branch 'main' of https://github.com/dotnet/runtime into swap-si…

2f810f8

…md-ops

EgorBo merged commit 3d63499 into dotnet:main May 30, 2022

ghost locked as resolved and limited conversation to collaborators Jun 29, 2022

		// Move constant vectors from op1 to op2 for commutative and compare operations
		// For now we only do it for zero vector

JIT: Move constant vectors to op2 from op1 #67502

JIT: Move constant vectors to op2 from op1 #67502

Uh oh!

Conversation

EgorBo commented Apr 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ghost commented Apr 3, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SingleAccretion Apr 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tannergooding Apr 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SingleAccretion Apr 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

EgorBo commented Apr 13, 2022

Uh oh!

tannergooding commented Apr 13, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tannergooding commented May 19, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

EgorBo commented Apr 3, 2022 •

edited

Loading

SingleAccretion Apr 3, 2022 •

edited

Loading

tannergooding Apr 3, 2022 •

edited

Loading

SingleAccretion Apr 3, 2022 •

edited

Loading

tannergooding commented Apr 13, 2022 •

edited

Loading