Skip to content

Move back to a mostly-single-threaded http2 transport#8008

Merged
ctiller merged 158 commits intogrpc:masterfrom
ctiller:direct-calls
Oct 18, 2016
Merged

Move back to a mostly-single-threaded http2 transport#8008
ctiller merged 158 commits intogrpc:masterfrom
ctiller:direct-calls

Conversation

@ctiller
Copy link
Copy Markdown
Member

@ctiller ctiller commented Sep 7, 2016

Combiner locks (refactored here) end up providing most of the gains we had from a complicated chttp2 threading model: drop the complicated book-keeping, get simpler, faster code.

@yang-g - please take chttp2 (grab @markdroth into the review if you want to split it up)
@vjpai - please look at the new & improved combiner locks, and iomgr/{closure,exec_ctx} changes
@sreecha - please look at workqueue changes (particularly the merge of that into the polling engine)
@dgquintas - channel changes, src/core/lib/support, src/core/lib/profiling, src/core/lib/surface, src/core/lib/transport


This change is Reviewable

@ctiller
Copy link
Copy Markdown
Member Author

ctiller commented Oct 17, 2016

Unless there are violent objections: I plan to bombs-away later this afternoon assuming tests look good.

There have been some minor edits since LGTM's came in:
@yang-g may want a quick look at df0f365 936f1ea 1409dbf
@vjpai may want a quick look at c2dd2a2 b019d61
@rjshade may want a quick look at 68cf8ce

@kpayson64
Copy link
Copy Markdown
Contributor

kpayson64 commented Oct 18, 2016

About the Python failure:

Nothing wrong is jumping out to me with the test. Basically we have a call waiting on a server response, then call grpc_completion_queue_next() in a background thread, then cancel the call from the main thread. The test hangs waiting for grpc_completion_queue_next() to return.

This seems like it should not hang, and in order for grpc_completion_queue_next() to return, grpc_wakeup_fd_wakeup() would somehow need to be called. It doesn't look like that is the case:

kpayson@kpayson0:~/grpc$ gdb --args py27/bin/python2.7-dbg src/python/grpcio_tests/tests/unit/_cython/_read_some_but_not_all_responses_test.py
GNU gdb (GDB) 7.9-gg19
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-grtev4-linux-gnu".
Type "show configuration" for configuration details.

<http://go/gdb-home FAQ: http://go/gdb-faq Email: c-toolchain-team IRC: gdb>
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Reading symbols from py27/bin/python2.7-dbg...done.
Unable to determine compiler version.
Skipping loading of libstdc++ pretty-printers for now.
Non-google3 binary detected.
(gdb) handle SIG40 noprint  
Signal        Stop  Print   Pass to program Description
SIG40         No    No  Yes     Real-time event 40
(gdb) break grpc_wakeup_fd_wakeup
Function "grpc_wakeup_fd_wakeup" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (grpc_wakeup_fd_wakeup) pending.
(gdb) break grpc_call_cancel
Function "grpc_call_cancel" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (grpc_call_cancel) pending.
(gdb) run
Starting program: /usr/local/google/home/kpayson/grpc/py27/bin/python2.7-dbg src/python/grpcio_tests/tests/unit/_cython/_read_some_but_not_all_responses_test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/grte/v4/lib64/libthread_db.so.1".

Breakpoint 1, grpc_wakeup_fd_wakeup (fd_info=0x7ffff5e871c8 <polling_island_wakeup_fd>)
    at src/core/lib/iomgr/wakeup_fd_posix.c:87
87    if (cv_wakeup_fds_enabled) {
(gdb) c
Continuing.
testReadSomeButNotAllResponses (__main__.ReadSomeButNotAllResponsesTest) ... [New Thread 0x7ffff27a9700 (LWP 62210)]
[Thread 0x7ffff27a9700 (LWP 62210) exited]
[New Thread 0x7ffff2faa700 (LWP 62209)]
[New Thread 0x7ffff37ab700 (LWP 62208)]
[New Thread 0x7ffff3fac700 (LWP 62207)]

Breakpoint 1, grpc_wakeup_fd_wakeup (fd_info=0xb2a398) at src/core/lib/iomgr/wakeup_fd_posix.c:87
87    if (cv_wakeup_fds_enabled) {
(gdb) c
Continuing.

Breakpoint 1, grpc_wakeup_fd_wakeup (fd_info=0xb2a398) at src/core/lib/iomgr/wakeup_fd_posix.c:87
87    if (cv_wakeup_fds_enabled) {
(gdb) c
Continuing.

Breakpoint 1, grpc_wakeup_fd_wakeup (fd_info=0xb2a398) at src/core/lib/iomgr/wakeup_fd_posix.c:87
87    if (cv_wakeup_fds_enabled) {
(gdb) c
Continuing.
[Thread 0x7ffff2faa700 (LWP 62209) exited]

Breakpoint 1, grpc_wakeup_fd_wakeup (fd_info=0x7fffe4000f38) at src/core/lib/iomgr/wakeup_fd_posix.c:87
87    if (cv_wakeup_fds_enabled) {
(gdb) c
Continuing.

Breakpoint 2, grpc_call_cancel (call=0xd47350, reserved=0x0) at src/core/lib/surface/call.c:733
733   GRPC_API_TRACE("grpc_call_cancel(call=%p, reserved=%p)", 2, (call, reserved));
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
85  ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S: No such file or directory.
(gdb) 

@ctiller ctiller merged commit 1a62ec8 into grpc:master Oct 18, 2016
@ctiller
Copy link
Copy Markdown
Member Author

ctiller commented Oct 18, 2016

Manually verified alarming failed tests (and re-verified dbg)

@lock lock bot locked as resolved and limited conversation to collaborators Jan 26, 2019
@lock lock bot unassigned dgquintas and sreecha Jan 26, 2019
@thelinuxfoundation
Copy link
Copy Markdown

Thank you for your pull request. Before we can look at your contribution, we need to ensure all contributors are covered by a Contributor License Agreement.

After the following items are addressed, please respond with a new comment here, and the automated system will re-verify.

Regards,
CLA GitHub bot

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.