Commit 4928775
pubsub: use one thread to start/stop connections
Previously we submitted a bunch of tasks, which is unnecessary
and deadlock-prone.
This does not solve the one-thread-deadlock problem,
but it changes the symptom.
In my testing, Subscriber now successfully pulls
one batch of messages before failing with a
DEADLINE_EXCEEDED error.
This is obviously undesirable, but at least it doesn't
deadlock anymore.
This is my best attempt at reconstructing the events:
1. A bunch of tasks get registered to [GetSubscription](https://github.com/GoogleCloudPlatform/google-cloud-java/blob/9623994f58199c79ee5b9f99ad0ff6d7fb69bd84/google-cloud-pubsub/src/main/java/com/google/cloud/pubsub/spi/v1/PollingSubscriberConnection.java#L105).
2. At least one of these tasks succeeds and continues to
[pull messages](https://github.com/GoogleCloudPlatform/google-cloud-java/blob/9623994f58199c79ee5b9f99ad0ff6d7fb69bd84/google-cloud-pubsub/src/main/java/com/google/cloud/pubsub/spi/v1/PollingSubscriberConnection.java#L132).
3. Something somewhere registers a very long-running task
(~30sec on my machine) to the pool.
A stack trace (obtained by `jstack`) revealed that the pool is
busy doing something, but I don't know what it is yet.
4. A message is pulled and the task of printing it to the screen
is registered to the pool.
It cannot run yet because the long-running task is using the
only thread available.
5. At least one of the tasks that called GetSubscription times out.
6. The long-running task times out or completes.
7. The message is finally printed to the screen.
8. Because a timeout on GetSubscription is not considered retryable,
the Subscriber fails. It only fails after 30sec because the
task to fail it is also blocked behind the long-running task.
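
The starvation in steps 3 to 8 can be reproduced in isolation. The sketch below is not the Subscriber code: `SingleThreadStarvation` and `shortTaskStarves` are hypothetical names, and a latch-controlled blocker stands in for the unidentified long-running task. It only illustrates how a short task queued behind a blocker on a one-thread pool misses its deadline.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SingleThreadStarvation {

    /**
     * Returns true if a short task misses a 200 ms deadline while a
     * blocker task is holding a thread of a pool with poolSize threads.
     */
    static boolean shortTaskStarves(int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Hypothetical stand-in for the long-running task: it holds
            // a pool thread until the latch is released.
            pool.submit(() -> {
                release.await();
                return null;
            });

            // Stand-in for a short task such as printing a message.
            Future<String> shortTask = pool.submit(() -> "message");

            try {
                shortTask.get(200, TimeUnit.MILLISECONDS);
                return false; // ran in time: a free thread was available
            } catch (TimeoutException e) {
                return true; // still queued behind the blocker
            }
        } finally {
            release.countDown(); // let the blocker finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 thread starves:  " + shortTaskStarves(1));
        System.out.println("2 threads starve:  " + shortTaskStarves(2));
    }
}
```

With one thread the short task can only run after the blocker is released, so the caller sees a timeout first; with a second thread it runs immediately. This mirrors why the failure here surfaces only after the long-running task gets out of the way.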
Updates #1827 and #2041.
1 parent 8293249 commit 4928775
1 file changed
Lines changed: 14 additions & 37 deletions