-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Support for Power8/LE platform (ppc64le). #6317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Removed a typo at line 577.
|
In my opinion the changes look great, at least the code diff has a good smell to it. (Some aspects, like build system decisions, I'm not qualified to judge.) Why is the coroutine stack size increased from 128K to 256K on this platform? I'd like to add a comment telling people why. |
|
Another weird thing: The change defining V8_NEEDS_BUFFER_ALLOCATOR to 1 in the middle of js_job.cc. That should presumably be done in some build config setting. |
|
Another potential bug is that in |
|
Hi, In the context_switching.cc change it looks like the new code saves registers below the stack pointer. All the other architectures decrement the stack pointer and then save the registers. Here are some diagrams of what the stacks look like after the Existing architectures: Power: Is there a reason power needs to behave differently? If so then it would be really good to add some comments. Also I think there is a problem with removing the fake caller stack frame. If I understand the Power ELF V2 ABI correctly, then the |
|
Implemented V8_NEEDS_BUFFER_ALLOCATOR related change. Pondering over the context switching discussion. |
|
@srh About the potential bug in src/build.mk - present work used only GNU compiler hence added those GCC specific flags. Are there any specific compilers you would recommend for testing on ppc64le, apart from GNU's? |
|
The change there adds general linking flags. I'd expect other compilers to need the same flags. I don't expect you to test other compilers. If people see flags like that in a compiler-specific branch, they're likely to think the flags are GCC-only for some specific reason. And if Clang was used, you'd expect it to take the same flags. Another bug in build.mk is that A better way might be to make |
|
I have made change for V8_NEEDS_BUFFER_ALLOCATOR and the changes suggested in v8.sh, build.mk files. |
|
Marked the review comments which are implemented. |
src/config/args.hpp
Outdated
|
|
||
| #if defined (__powerpc64__) | ||
| // POWER ABI requires more number of registers to be saved | ||
| // in stack during context switch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make sense to me. The context switch requires 168 bytes of stack space. I don't see why that would mean the stack size needs to be doubled. What error do you see? Does is still occur with the revised context switching code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mundaym There are alloca() calls in glibc called by getifaddrs() which tries to
allocate 64KB memory in stack frame for powerpc. If I dont increase the stack
size, there are failures in test cases due to stack overflow checks in rethinkdb.
Hence increasing it for powerpc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this explanation be added as a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AtnNn Added comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added code comments.
| sp -= 16; // r6-r13 and f8-f15. | ||
| #elif defined (__powerpc64__) | ||
| sp += 24; | ||
| sp -= 20; // r14-r31, toc, cr. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking you also need to save f14-f31 and v20-v31 to meet the ABI requirements. The compiler could choose to save something in one of these registers and a coroutine switch might end up clobbering it. It's the sort of thing that is very hard to debug. At the very least there should be a comment here explaining the danger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mundaym Considering the factors performance and being light weight, I have not saved floating point and vector registers.There are no failures seen when non volatile floating point registers
are not saved and restored during context switch . I have tested on
ubuntu/RHEL powerpc64le systems. Also we can hit any bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a comment above that indicates that a context switch "doesn't swap the floating-point registers"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's safe to do on x86 because none of the SSE/floating point registers are callee-save (and so aren't 'live' on entry to the swapcontext function). On POWER and s390x however the ABIs have callee-save floating point registers and the compiler assumes they will be restored after a function call.
In practice the compiler probably doesn't use the callee-saved floating point registers very often in code like this, so not saving/restoring them might not cause any issues. But it is a bit fragile IMO. Worth a comment anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mundaym Added code comment.
|
Awesome work. Ideally, if we set Otherwise the v8-related changes look okay to me. They aren't uglier than what we already have. Someday we'll get around to do #5112 |
|
@AtnNn Thanks for adding this PR to 2.5 milestones. Any estimate on timeframe when 2.5 can be expected? Also, any plans to create a new branch for 2.5? |
|
@asowani The branch for 2.5 is |
|
@AtnNn Thanks for letting me know about the estimated time frame for 2.5 release. I don't know much about how much effort it generally takes for new release preparation (may be creating a new branch?), so please pardon if I sound too eager. But is there any way we can have 2.5 release a bit earlier? If there is any help required, I am willing to assist. Thanks! |
|
We're trying to get 2.4 ready right now. I'll try to get this patch into 2.4 if it doesn't delay the release. |
|
@AtnNn - is v2.4.x the correct branch to check for merged patches for upcoming 2.4 release? |
|
@asowani yes. |
|
@AtnNn I have setup a Jenkins job to test v2.4.x branch whenever ppc64le code gets merged. At the moment I have disabled the job as I don't see those changes in the branch, but it can start the moment the changes get merged in the branch. I guess I will get an automated mail from github when this PR is merged, but just in case, I would highly appreciate if you could intimate here once the code is merged. Thanks! |
|
@AtnNn -- what's your feeling on whether this will make it in to the 2.4 release or not? It still has the milestone, which gives me hope that it will. 😆 |
|
Just following up on the previous comment. Any outlook on whether this will be merging into 2.4 at this point? Thanks! |
|
I'd say it's likely to make it into 2.4. |
|
I've rebased and cleaned up the changes in this PR and put them in branches sam/24x/powerpc64 (based off v2.4.x) and sam/powerpc64 (based off next) in my repo at https://github.com/srh/rethinkdb/. There are two things I'd still like to understand before putting this into 2.4. First, what is the point of setting V8_NEEDS_BUFFER_ALLOCATOR? Second, why does getifaddrs use alloca on powerpc64, but not on amd64? Is there an implementation I could look at in order to see this? Edit: It looks like getifaddrs calls __netlink_request in glibc's sysdeps/unix/sysv/linux/ifaddrs.c, which then has after which, if buf_size <= 64K, it gets passed to alloca. And the page size could be 64KB. I don't want to throw in these mysterious configuration options without putting an explanation in the source code. |
|
There's a third thing. Why is v8 4.7 used for powerpc64? Is it necessary? |
|
I've updated the branches referenced above to compute the coroutine stack size based on the page size value. So there are now still two things I'd like to understand:
|
|
@srh The selection of v8 4.7.x was already done in the script but was limited to Windows. Older versions of v8 are not ported to ppc64le and v8 4.7 has been ported to and tested on ppc64le, and is found to be working good on it. V8_NEEDS_BUFFER_ALLOCATOR is required at build time to have 4.7 version of v8. |
|
|
@srh Any updates about code merge? Please let me know in case there are further queries. Thanks! |
|
But what is it about ppc64le that makes v8 4.7 need V8_BUFFER_ALLOCATOR? |
|
If I check out v8 4.7.80.23 and grep for V8_NEEDS_BUFFER_ALLOCATOR, I don't get any matches, and if I search on Google I don't get any results outside of this issue. |
|
Oh, I'm a retard. Will merge this when I get back to a PC. |
|
@srh Thanks very much for merging the changes! |
Yes.
Description
These code changes are done for:
Thanks!