Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Fix GCStress failures from hardware intrinsic tests#17314

Merged
CarolEidt merged 1 commit intodotnet:masterfrom
fiigii:fixgctests
Mar 30, 2018
Merged

Fix GCStress failures from hardware intrinsic tests#17314
CarolEidt merged 1 commit intodotnet:masterfrom
fiigii:fixgctests

Conversation

@fiigii
Copy link

@fiigii fiigii commented Mar 29, 2018


Unsafe.CopyBlockUnaligned(ref Unsafe.AsRef<byte>(inArray1Ptr), ref Unsafe.As<TOp1, byte>(ref inArray1[0]), this.simdSize);
Unsafe.CopyBlockUnaligned(ref Unsafe.AsRef<byte>(inArray2Ptr), ref Unsafe.As<TOp2, byte>(ref inArray2[0]), this.simdSize);
Unsafe.CopyBlockUnaligned(ref Unsafe.AsRef<byte>(inArray1Ptr), ref Unsafe.As<TOp1, byte>(ref inArray1[0]), inArray1.Length * sizeof(TOp1));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me why the old code was bad and the new code is correct, the array length is simdSize * 2 so it seems that copying simdSize bytes should have worked correctly. Besides, I doubt that sizeof(TOp1) will compile.

Copy link
Author

@fiigii fiigii Mar 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem was the size of arguments (inArray1, inArray2, outArray, etc) could be different in AVX/AVX2 intrinsics.

the array length is simdSize * 2

simdSize * 2 is the length of buffers (the members) rather than input arrays (the arguments).

Besides, I doubt that sizeof(TOp1) will compile.

I tried, it works well.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in this file is plain bogus...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😂

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simdSize * 2 is the length of buffers (the members) rather than input arrays (the arguments).

Right. And that is odd since the buffers end up being smaller than the input arrays. I assume that simdSize is the size of a vector.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sizeof(TOp1) does not work...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it is the largest alignment boundary.

Isn't that the same thing as the vector size? SSE2 vector size is 16 bytes and the required alignment is 16 bytes. CopyBlockUnaligned copies bytes the old code copied 16 bytes. How could the input arrays be smaller than 16 bytes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For any given test, simdSize should be the size of the largest Vector used (so 16 or 32, currently).

We then allocate 2 * simdSize so that we can guarantee the buffer is aligned (since you need at most alignment - 1 additional bytes to guarantee alignment).

In the case where one argument is Vector128 and one argument is Vector256, simdSize will be larger than the input array for the Vector128 args, and will result in an AV.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except you don't usually get an AV. The code just keeps copying whatever random bytes there are past the end of the array. So the failure only shows up in GC stress when we break in just after the copy and find there is a byref and it happens to be pointing into free space.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where one argument is Vector128 and one argument is Vector256, simdSize will be larger than the input array for the Vector128 args, and will result in an AV.

I see. Well, this code is a mess.

  • The DataTable name is misleading, this whole class isn't some sort of data table, it's just some kind of wrapper for some aligned memory buffers (and it's not clear why instead of simply having a AlignedArray<T> class we have instead this set of classes where the logic is duplicated).
  • The simdSize parameter should have been called alignment, that's its main purpose here.
  • There are no asserts that would suggest that the input arrays are supposed to contain only one vector and thus be smaller than simdSize.
  • In the absence of such asserts the length of the new arrays should have been calculated based on the parameter arrays, not based on simdSize.
  • The out array parameter is not used
  • The way alignment is done is sort of unusual. Pretty much every alignment function I've seen does (x + (alignment - 1)) & ~(alignment - 1).
  • It probably would have been simpler to just allocate memory using Marshal.AllocCoTaskMem instead of dragging in GC handles as well

The actual test code isn't any better. Tons of repeated code, unsafe copies and whatnot. I take it that people were in a hurry to get things done and this is the result.

@fiigii
Copy link
Author

fiigii commented Mar 29, 2018

test Windows_NT x86 Checked gcstress0xc_jitstress1
test Windows_NT x86 Checked gcstress0xc_jitstress2

@fiigii
Copy link
Author

fiigii commented Mar 29, 2018

@tannergooding
Copy link
Member

tannergooding commented Mar 29, 2018

@fiigii, we fixed this in the other templates by doing (uint)Unsafe.SizeOf<{Op1VectorType}<{Op1BaseType}>>() (in the template).

We should probably do the same here.

Edit: Actually, we don't have the Op1VectorType here, so this is probably the closest we will get...
Maybe it is worth asserting that the number of bytes copied is less than or equal to the size of the allocated block?

@AndyAyersMS
Copy link
Member

With these changes I can see that the copy won't read of the end of the parameter inArray but what guarantee is there now that the copy won't write past the end of the local inArray?

Seems like we are relying on the callers to pass reasonable values for parameters and I suspect as people try and add tests they will get this wrong sometimes.

A few lines of validation checking can catch these issues without having to debug through a gc stress failure. So I suggest adding some here.

@fiigii
Copy link
Author

fiigii commented Mar 29, 2018

Addressed feedback. @AndyAyersMS @mikedn @tannergooding PTAL

@AndyAyersMS
Copy link
Member

We only create r and ro build variants for these tests so we'll never see these asserts fire.

Please use explicit throws to ensure that validation is always done no matter how the test is built.

@fiigii
Copy link
Author

fiigii commented Mar 29, 2018

@AndyAyersMS Thanks for the suggestion. Fixed.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM save for the one typo.

int sizeOfoutArray = outArray.Length * Unsafe.SizeOf<TResult>();
if ((alignment != 32 && alignment != 16) || (alignment * 2) < sizeOfinArray1 || (alignment * 2) < sizeOfinArray2 || (alignment * 2) < sizeOfoutArray)
{
throw new ArgumentException("Invalid vlue of alignment");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "vlue" --> "value" (copied thoughout)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, will fix!

@fiigii fiigii reopened this Mar 29, 2018
@AndyAyersMS
Copy link
Member

Hmm, think jenkins lost it there for a minute. Let me close/repoen.

@AndyAyersMS AndyAyersMS reopened this Mar 29, 2018
@fiigii
Copy link
Author

fiigii commented Mar 30, 2018

test Windows_NT x86 Checked gcstress0xc_jitstress1
test Windows_NT x86 Checked gcstress0xc_jitstress2
test Windows_NT x64 Checked jitx86hwintrinsicnoavx
test Windows_NT x86 Checked jitx86hwintrinsicnoavx
test Ubuntu x64 Checked jitx86hwintrinsicnoavx

@AndyAyersMS
Copy link
Member

The "hit priveleged instruction" failure in x86 gc stress 0xC jitstress 1 is a symptom of a gc stress race that I'm trying to fix over in #17330.

The "SanityCheck()" failure in x86 gc stress 0xC jitstress 2 could be a number of things but it's unrelated to this change.

@fiigii
Copy link
Author

fiigii commented Mar 30, 2018

Can we merge this PR?

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@CarolEidt CarolEidt merged commit 17de92e into dotnet:master Mar 30, 2018
@fiigii fiigii deleted the fixgctests branch August 29, 2018 00:01
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

HardwareIntrinsics_X86_Avx_ConvertToVector_ro fails deterministically under gcstress

6 participants