Skip to content

Conversation

@thinkbeforecoding
Copy link
Contributor

This is an implementation of #13098
It implements a branchless compare:
cgt x y - clt x y
When x > y: 1 - 0 = 1
When x < y: 0 - 1 = -1
When x = y: 0 - 0 = 0

Benchmarks show that it is very slightly slower (10%) when predictions are always correct, but it is 3x faster on random values where prediction fail more often

The code emitted by the JIT is the same size.

@thinkbeforecoding
Copy link
Contributor Author

Some IL tests are failing due to the change.

There are some build errors on my machine (it seems to be new F# features like [x..] indexing... How can I build this locally ?

@vzarytovskii
Copy link
Member

Some IL tests are failing due to the change.

There are some build errors on my machine (it seems to be new F# features like [x..] indexing... How can I build this locally ?

Build scripts (from the devguide) should work just fine. You can use it with noVisualStudio switch.

Devguide also describes how to update IL baselines (if test uses baseline files).

@dsyme
Copy link
Contributor

dsyme commented May 24, 2022

I'm curious if we know that this is the fastest implementation. Is there any information on this on the web e.g. for C++ or assembly code?

@dsyme
Copy link
Contributor

dsyme commented May 24, 2022

I'd like to see a systematic correctness test matrix for comparisons on all the basic types affected by this, including

  • MaxValue
  • MaxValue - 1
  • MinValue
  • MinValue + 1
  • 0
  • 1
  • -1

Also don't forget the char and bool types

@thinkbeforecoding
Copy link
Contributor Author

There could be some improvements in the jit.
For instance, the cmp is done twice.
Since setgt setlt don't modify flags, both could be done with a single cmp.
And the double zero extension could be done after the sub as a sign extension.

@thinkbeforecoding
Copy link
Contributor Author

I think the fastest version would be in x64
cmp eax ecx
setg al
setl dl
sub dl, al
movsx eax, dl

There is a 4 instr version found on SO

sub %1, %0
jno 1f
cmc
rcr %0
https://stackoverflow.com/questions/10996418/efficient-integer-compare-function

But it involve a conditional jump that will be slower on branch prediction. Moreover I see no way to make the jit compile something like this.

@thinkbeforecoding
Copy link
Contributor Author

I was also looking at movcc (conditional move) instructions, but the source is a register or memory, it cannot be a constant. So it requires extra instructions to load constants and won't we shorter.
That could be interesting to implement min and max for instance.

@thinkbeforecoding
Copy link
Contributor Author

After a check, the requirement on IComparable.CompareTo is:

  • when x > y, result is > 0
  • when x < y, result is < 0
  • when x=y result is 0

for shorter types (byte, short, char) it is implemented as:

int x - int y

This is valid since there is no risk of overflow. But the result is not necessarily 0,1,-1

For longer types, it uses conditional jumps and returns always 0,1,-1

Do we have that constraint on the compare function ?

@thinkbeforecoding
Copy link
Contributor Author

@thinkbeforecoding
Copy link
Contributor Author

This would not be a breaking change at the specification level, but it would be a breaking change at the implementation level...

code like:
if compare x y = 1 then ... would be broken.

Same thing for code like:
ls |> List.sum (fun (x,y) -> compare x y)

But such code is not following the specification of compare.

@@ -0,0 +1,381 @@
namespace FSharp.Compiler.UnitTests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine in this PR, but for future, we suggest creating new tests in FSharp.Compiler.ComponentTests, it has nicer APIs and better support for IL baseline tests (instead of having an inline IL, it can be a separate file, which does not require recompiling the suite in case of change, and it is easier to re-generate using the env variable).

@thinkbeforecoding
Copy link
Contributor Author

There are a few tests that compare generated IL to expected IL that fail due to new code generation.

How do you usually regenerate expected IL ?

@vzarytovskii
Copy link
Member

vzarytovskii commented May 25, 2022

There are a few tests that compare generated IL to expected IL that fail due to new code generation.

How do you usually regenerate expected IL ?

So, if they are inline (i.e. the IL itself is the string in the test case itself, then, unfortunately, only one-by-one).
If they're verified as part of DirectoryAttribute-style testing, then, an ENV variable can be set, and then run tests locally:|

// on Windows
set TEST_UPDATE_BSL=1
// or, on macOS/Linux
export TEST_UPDATE_BSL=1
// and then
build.cmd -testCoreClr
// or
build.sh --testcoreclr

(https://github.com/dotnet/fsharp/blob/main/DEVGUIDE.md#updating-baselines-in-tests)

@thinkbeforecoding
Copy link
Contributor Author

There are a lot of culture dependent tests that fail on my french machine . Fixed it.
There was also a lot of over specific autogenerated tests for compare. They were checking for the exact response of compare... The check must be done on the sign, not the exact value. Fixed it.

@thinkbeforecoding
Copy link
Contributor Author

Some of the builds fail but it doesn't seem related to the change.. Any idea?

@vzarytovskii
Copy link
Member

@thinkbeforecoding
Copy link
Contributor Author

These are baseline tests, but they don't seem to be updated with the instructions above...

@vzarytovskii
Copy link
Member

I've updated them + updated the guide, If I may ask, which shell were you using?

@vzarytovskii
Copy link
Member

Great, now it's even more failures...I will take a look.

@thinkbeforecoding
Copy link
Contributor Author

I was using powershell so I used
$env:TEST_UPDATE_BSL=1

@vzarytovskii
Copy link
Member

I was using powershell so I used
$env:TEST_UPDATE_BSL=1

Hm, interesting. I will probably add a separate switch to the build scripts which runs tests and updates baselines.

@vzarytovskii
Copy link
Member

@thinkbeforecoding I have updated 2 remaining baselines for net472 framework + updated devguide describing how it works.
To update them successfully, It requires some knowledge of how IL baselines work. I will add a separate command to build scripts for updating those (#13204)

The rest of the tests which are failing are checking the compare result.

@KevinRansom
Copy link
Contributor

@thinkbeforecoding I have updated 2 remaining baselines for net472 framework + updated devguide describing how it works. To update them successfully, It requires some knowledge of how IL baselines work. I will add a separate command to build scripts for updating those (#13204)

The rest of the tests which are failing are checking the compare result.

computers are harsh:
System.Exception : args({ Item = 0us }, { Item = 65535us }) Expected=-1. Received=-65535.
0 - 65535 is indeed -65535 and not the -1 the program was expecting.

@KevinRansom
Copy link
Contributor

Since compare fallbacks to ICompare.Compare for other types, there is no guarantee that the output is always 0,1,-1.

I don't know about the no guarantee thing. If someone upgrades their dotnet sdk recompiles their app, and their original source code now has a "hard to analyze bug" pretty sure we would revert the change whilst looking red faced when saying yes we knew it could break people, but we thought it was rare enough to risk. And amusingly it was even caught by tests, which is an eye opener.

@thinkbeforecoding
Copy link
Contributor Author

Actually, byte.CompareTo or char.CompareTo return a different result from F# compare. Some types return only 0,-1,1 others return a large range of values.

Yet I can understand the impact at large scale.
However, the Bcl decided to randomize GetHashCode which was a similar breaking change. The spec has always specified not to rely on it value, but it probably happened... Was there bad feedback?

Would it be possible to add an analyser or a warning for comparison of the result of compare to anything other than 0?

@thinkbeforecoding
Copy link
Contributor Author

For now, progress is hindered by #13212 Running test suites locally is problematic on non en-US machines.

@vzarytovskii vzarytovskii reopened this Jun 7, 2022
@thinkbeforecoding
Copy link
Contributor Author

When I used the command it said:
Commenter does not have sufficient privileges for PR 13187 in repo dotnet/fsharp

@thinkbeforecoding
Copy link
Contributor Author

There are still some cancelled builds... strange

@thinkbeforecoding
Copy link
Contributor Author

These canceled builds again.

Copy link
Contributor

@KevinRansom KevinRansom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, looks good. Thanks

@vzarytovskii
Copy link
Member

These canceled builds again.

Yeah, some infra hiccups, I suppose.

@thinkbeforecoding
Copy link
Contributor Author

@dsyme , I think this just needs your review, but should be good.

@KevinRansom
Copy link
Contributor

@dsyme, this is good to go, do you want to review your change request.

@dsyme dsyme merged commit a65ace7 into dotnet:main Jun 14, 2022
@dsyme
Copy link
Contributor

dsyme commented Jun 14, 2022

@thinkbeforecoding Thank you so much for this improvement!

@forki
Copy link
Contributor

forki commented Jun 14, 2022

Wohooo! It landed. Awesome work!

@EgorBo
Copy link
Member

EgorBo commented Feb 18, 2024

NOTE: as of .NET 7.0 (or was it 8.0), JIT is expected to use branchless instructions even when F#/C# emits branches (recognizes cmov like idioms)

@thinkbeforecoding
Copy link
Contributor Author

Yes I've noted this in some tests in fasmi. I think it's in dotnet 8.0

@EgorBo
Copy link
Member

EgorBo commented Feb 18, 2024

Sure, I just wanted to note that it's better when some sub-optimal codegen is filed against JIT rather than silently (for JIT team) fixed 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Perf improvement for primitive comparison

6 participants