Remove sort in init EB and increase parallelism #644
Merged
marchdf merged 16 commits into development on Jul 14, 2023
OK, good news and bad news. First, the good news: still no diffs with MPI or on GPU. On GPU, with the challenge problem, we significantly improved the development branch: So yay. The bad news. The development branch: I think what's going on is that I have a bunch of duplicate faces and I am not catching them.
OK, I think I figured it out. On the challenge problem, the apply face stencil goes back to being: which is basically the same as development. Which is a 160x speedup for that function.
6e87941 to 730ade4
I am getting unexpected diffs on GPU with the challenge problem. I don't see them on CPU on my laptop...
jrood-nrel approved these changes on Jul 13, 2023
This PR increases the parallelism in our definition of the EB structs and removes all the sort calls.
EB-C1 on 1 CPU rank with no OpenMP shows no diffs.
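For readers unfamiliar with the idea of replacing a sort with a more parallel construction, here is a minimal sketch of one common pattern: count, exclusive prefix sum, then fill. It is not taken from this PR and may differ from what the code actually does. `CutFace`, `build_face_list`, and `faces_per_cell` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a per-face EB record; not a type from this code base.
struct CutFace {
    int cell;   // index of the owning cell
    int local;  // index of this face within its cell
};

// Build a compact face list without any sort:
//   1. faces_per_cell[i] holds the number of cut faces in cell i (assumed input),
//   2. an exclusive prefix sum turns the counts into per-cell write offsets,
//   3. each cell writes its faces into its own disjoint slot.
// The result is ordered by cell index by construction, and the iterations of
// the fill loop touch disjoint ranges, so they could run in parallel.
std::vector<CutFace> build_face_list(const std::vector<int>& faces_per_cell)
{
    const std::size_t ncell = faces_per_cell.size();
    std::vector<int> offset(ncell, 0);
    std::exclusive_scan(faces_per_cell.begin(), faces_per_cell.end(),
                        offset.begin(), 0);

    const int total = (ncell == 0) ? 0 : offset.back() + faces_per_cell.back();
    std::vector<CutFace> faces(static_cast<std::size_t>(total));

    for (std::size_t i = 0; i < ncell; ++i) {
        assert(faces_per_cell[i] >= 0);
        for (int k = 0; k < faces_per_cell[i]; ++k) {
            faces[static_cast<std::size_t>(offset[i] + k)] =
                CutFace{static_cast<int>(i), k};
        }
    }
    return faces;
}
```

The sketch runs serially and needs C++17 for `std::exclusive_scan`. On GPU the scan and the fill loop would each be replaced by a device-side equivalent.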
Things left to do: