x64: Break more data dependencies in float-related instructions #7818

Merged
fitzgen merged 3 commits into bytecodealliance:main from alexcrichton:fix-false-dependencies on Jan 26, 2024

Conversation

@alexcrichton
Member

This commit takes a stab at #7816 without diving too deeply into it. I noticed that the loop starts with `vcvtss2sd`, which is along the same lines as the false dependencies fixed in earlier PRs such as #7098. I had forgotten about these instructions at the time and meant to go back and touch them up, and #7731 has provided sufficient motivation to do so!

Locally this takes that test case from 1.6s to 0.4s for me.

@alexcrichton requested a review from a team as a code owner January 24, 2024 23:36
@alexcrichton requested review from abrown and removed request for a team January 24, 2024 23:36
@github-actions bot added the cranelift, cranelift:area:x64, and winch labels Jan 25, 2024
@github-actions

Subscribe to Label Action

cc @saulecabrera

This issue or pull request has been labeled: "cranelift", "cranelift:area:x64", "winch"

Thus the following users have been cc'd because of the following labels:

  • saulecabrera: winch

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Member

@fitzgen fitzgen left a comment

Nice!! Out of curiosity, how did you end up root-causing that perf bug to this false dependency?

@fitzgen added this pull request to the merge queue Jan 25, 2024
Merged via the queue into bytecodealliance:main with commit b368240 Jan 26, 2024
@alexcrichton deleted the fix-false-dependencies branch January 26, 2024 16:08
@alexcrichton
Member Author

Ah, it was mostly from previous experience. In the back of my mind I knew there was a set of instructions for which we still did the "fake the output register as the input" approach for AVX (e.g. the instructions modified here), and when I ran `perf` over the program the first very hot instruction in a loop was `vcvtss2sd`, which I remembered was one of those. To test this out I split the dependencies, and the performance improved, so I assumed it was the cause.
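
For readers less familiar with false dependencies, here is a rough toy model of why "fake the output register as the input" can serialize a loop. It is only a sketch, not Cranelift code: the `Inst` type, the register numbering, and the 4-cycle latency are all hypothetical, and how the actual patch chooses the merge source register is not shown in this thread.

```rust
// Toy model of a false (output) dependency; NOT Cranelift code.
// Every name and the latency value are hypothetical, for illustration only.

/// One instruction: the registers it reads, the register it writes,
/// and a made-up latency in cycles.
struct Inst {
    reads: Vec<usize>,
    write: usize,
    latency: u64,
}

/// Length of the longest dependency chain, assuming unlimited execution
/// units and register renaming, so only read-after-write dependencies
/// delay an instruction.
fn critical_path(insts: &[Inst], num_regs: usize) -> u64 {
    // ready[r] = cycle at which the latest value of register r is available.
    let mut ready = vec![0u64; num_regs];
    let mut longest = 0;
    for inst in insts {
        let start = inst.reads.iter().map(|&r| ready[r]).max().unwrap_or(0);
        let done = start + inst.latency;
        ready[inst.write] = done;
        longest = longest.max(done);
    }
    longest
}

/// `n` conversions that all write register 0 from register 1.
/// With `merge_into_dst`, each one also reads register 0 (its own output),
/// which models "fake the output register as the input".
fn conversions(n: usize, merge_into_dst: bool) -> Vec<Inst> {
    (0..n)
        .map(|_| Inst {
            reads: if merge_into_dst { vec![0, 1] } else { vec![1] },
            write: 0,
            latency: 4, // assumed latency, not a measured value
        })
        .collect()
}

fn main() {
    let chained = critical_path(&conversions(100, true), 2);
    let independent = critical_path(&conversions(100, false), 2);
    println!("chained: {} cycles, independent: {} cycles", chained, independent);
}
```

In this model the chained version forms one long dependency chain, while the independent version lets every conversion start immediately. Real hardware is far more constrained than "unlimited execution units", so the model only shows the direction of the effect, not its size.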
