Python grpcio-1.14.1 Segmentation Faults caused by Channel.close() when used with connectivity-state subscriptions #16290

@e-heller

Questions

  • What version of gRPC and what language are you using?

    Language: Python 2.7.15, 3.6.6

    gRPC: grpcio-1.14.1     (same issue in 1.14.0, 1.13.1, 1.13.0)

  • What operating system and version?

    Linux: Fedora 28, Ubuntu 16.04

  • What runtime / compiler are you using?

    Python 2.7.15 and 3.6.6, using the grpcio package straight from pypi.org

What did you do?

We upgraded from gRPC 1.11.1 to 1.14.1 and tried to use the new Channel.close() method.

Now gRPC is causing Segmentation Faults.

This seems to be correlated to an interaction between subscriptions to Channel
connectivity-state and the Channel's close() method.

This issue is not present in grpc 1.11.1 (which has no close() method, after all).

As far as I understand, this is the correct procedure:

  1. Create a channel
  2. Add a subscription callback
  3. Invoke some RPCs
  4. Unsubscribe the callback
  5. Close the channel
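
The ordering above can be sketched as a small context manager. This is only an illustration: `watched_channel` is a hypothetical helper name, not part of grpcio, and it assumes nothing about the channel beyond the `subscribe()`, `unsubscribe()` and `close()` methods used elsewhere in this report. It enforces the order of the steps and nothing more.

```python
import contextlib


@contextlib.contextmanager
def watched_channel(channel, callback, try_to_connect=True):
    """Hypothetical helper: subscribe on entry; unsubscribe, then close, on exit."""
    channel.subscribe(callback, try_to_connect=try_to_connect)  # step 2
    try:
        yield channel  # step 3: the caller invokes RPCs inside the with-block
    finally:
        # Steps 4 and 5, always in this order, even if an RPC raised.
        channel.unsubscribe(callback)
        channel.close()
```

It would be used as `with watched_channel(grpc.insecure_channel('localhost:50051'), connectivity_callback) as channel:` with the stub created inside the block. Since Example 4 segfaults even with the steps in the right order, a helper like this is not expected to prevent the crash; it only rules out the out-of-order cases.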

Yet even when we follow this procedure, we are getting occasional segfaults.

Worse yet, if your code misses a step or gets anything out of order -- you don't
get an Exception -- you get segfaults. For example:

  1. Call subscribe() after the channel is closed --> segfault every time
  2. Call unsubscribe() after the channel is closed --> frequent segfaults
  3. Forget to unsubscribe the callback and close the channel --> frequent segfaults

What did you expect to see?

  • In all cases, I expect gRPC will not segfault

  • When every step follows the correct procedure, I expect no
    Exceptions and no noisy, misleading log messages.

  • If subscribe or unsubscribe are called after the channel is closed,
    I would expect to get a python Exception that I can handle.

  • If unsubscribe is not called before the channel is closed, I expect to
    see no error, or at most a logged error. gRPC should close the channel and
    discard the subscribed callback function.
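
As a rough sketch of the behavior described in the last two bullets, here is a wrapper that tracks closure itself. `GuardedChannel` and `ChannelClosedError` are hypothetical names invented for this illustration and are not part of grpcio. The wrapper assumes only the `subscribe()`, `unsubscribe()` and `close()` methods of the wrapped channel, and it works around the problem in user code rather than fixing it in the library.

```python
import threading


class ChannelClosedError(Exception):
    """Hypothetical error type; not part of grpcio."""


class GuardedChannel(object):
    """Hypothetical wrapper that refuses (un)subscribe calls after close()."""

    def __init__(self, channel):
        self._channel = channel
        self._lock = threading.RLock()
        self._callbacks = []
        self._closed = False

    def __getattr__(self, name):
        # Delegate everything else (e.g. unary_unary) to the wrapped channel.
        return getattr(self._channel, name)

    def subscribe(self, callback, try_to_connect=False):
        with self._lock:
            if self._closed:
                raise ChannelClosedError('subscribe() on a closed channel')
            self._channel.subscribe(callback, try_to_connect=try_to_connect)
            self._callbacks.append(callback)

    def unsubscribe(self, callback):
        with self._lock:
            if self._closed:
                raise ChannelClosedError('unsubscribe() on a closed channel')
            self._channel.unsubscribe(callback)
            if callback in self._callbacks:
                self._callbacks.remove(callback)

    def close(self):
        with self._lock:
            if self._closed:
                return
            # Discard any callbacks the caller forgot to unsubscribe.
            for callback in self._callbacks:
                self._channel.unsubscribe(callback)
            self._callbacks = []
            self._channel.close()
            self._closed = True
```

With this in place, a late `subscribe()` or `unsubscribe()` raises a Python exception that can be handled, and a forgotten callback is dropped before the underlying `close()` runs. It cannot help with the case where the correct order still crashes.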

What did you see instead?

  1. Segmentation faults
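  2. Lots of noisy logged exceptions. The message accompanying these log errors
    ("ValueError: Cannot invoke RPC on closed channel!") is misleading as well: it
    comes from the connectivity-polling thread, not from an RPC call.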

Example code

These examples use the helloworld protos and run against the greeter_server.py
from the examples directory of this repo.
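
Because several of the examples in this report crash only intermittently, it can help to run each script repeatedly and count the crashes. The following is a minimal sketch of such a harness; `count_segfaults` is a hypothetical helper name, and it relies on the POSIX convention that a child process killed by signal N reports a return code of -N.

```python
import signal
import subprocess
import sys


def count_segfaults(argv, runs=20):
    """Run argv `runs` times; return how many runs were killed by SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        # On POSIX, a child killed by signal N has returncode == -N.
        returncode = subprocess.call(argv)
        if returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes
```

For instance, `count_segfaults([sys.executable, 'the_right_way.py'], runs=20)` would report how many of 20 runs of that script ended in a segmentation fault.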

Example 1: call subscribe() after the channel is closed

This will immediately result in a segfault every time:

from __future__ import print_function
import grpc
import helloworld_pb2
import helloworld_pb2_grpc

def run():
    channel = grpc.insecure_channel('localhost:50051')
    stub = helloworld_pb2_grpc.GreeterStub(channel)
    
    response = stub.SayHello(helloworld_pb2.HelloRequest(name='Stranger'))
    print('Response: {}'.format(response.message))
    
    channel.close()
    # Don't subscribe to a closed channel. This raises no exception; it segfaults!
    channel.subscribe(connectivity_callback, try_to_connect=True)

def connectivity_callback(state):
    pass  # no-op

if __name__ == '__main__':
    run()

Output: Example 1

  • 1.1
    $ python subscribe_after_close.py
    
    Response 0: Hello, Stranger!
    Segmentation fault
    

Example 2: call unsubscribe() after the channel is closed

This randomly (but frequently) results in a segfault:

from __future__ import print_function
import grpc
import helloworld_pb2
import helloworld_pb2_grpc

def run():
    for count in range(50):
        run_once(count)

def run_once(count):
    channel = grpc.insecure_channel('localhost:50051')
    stub = helloworld_pb2_grpc.GreeterStub(channel)

    channel.subscribe(connectivity_callback, try_to_connect=True)

    response = stub.SayHello(helloworld_pb2.HelloRequest(name='Stranger'))
    print('Response {}: {}'.format(count, response.message))

    channel.close()                                         #  <-----
    # Yikes! Don't unsubscribe after closing the channel!
    channel.unsubscribe(connectivity_callback)              #  <-----

def connectivity_callback(state):
    pass  # no-op

if __name__ == '__main__':
    run()

Output: Example 2

  • 2.1

    $ python unsubscribe_after_close.py
    
    Response 0: Hello, Stranger!
    Response 1: Hello, Stranger!
    Segmentation fault
    
  • 2.2

    $ python unsubscribe_after_close.py
    
    Response 0: Hello, Stranger!
    Response 1: Hello, Stranger!
    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 801, in __bootstrap_inner
        self.run()
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 754, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/home/eheller/pew_envs/grpc-seg14/lib/python2.7/site-packages/grpc/_channel.py", line 801, in _poll_connectivity
        time.time() + 0.2)
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 451, in grpc._cython.cygrpc.Channel.watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 336, in grpc._cython.cygrpc._watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 344, in grpc._cython.cygrpc._watch_connectivity_state
    ValueError: Cannot invoke RPC on closed channel!
    
    ****  Cut for brevity  ****
    
    Response 2: Hello, Stranger!
    Response 3: Hello, Stranger!
    Exception in thread Thread-8:
    Traceback (most recent call last):
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 801, in __bootstrap_inner
        self.run()
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 754, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/home/eheller/pew_envs/grpc-seg14/lib/python2.7/site-packages/grpc/_channel.py", line 801, in _poll_connectivity
        time.time() + 0.2)
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 451, in grpc._cython.cygrpc.Channel.watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 336, in grpc._cython.cygrpc._watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 344, in grpc._cython.cygrpc._watch_connectivity_state
    ValueError: Cannot invoke RPC on closed channel!
    
    Response 4: Hello, Stranger!
    Segmentation fault
    

Example 3: forget to call unsubscribe() before closing the channel

This randomly (but frequently) results in a segfault:

from __future__ import print_function
import grpc
import helloworld_pb2
import helloworld_pb2_grpc

def run():
    for count in range(50):
        run_once(count)

def run_once(count):
    channel = grpc.insecure_channel('localhost:50051')
    stub = helloworld_pb2_grpc.GreeterStub(channel)

    channel.subscribe(connectivity_callback, try_to_connect=True)

    response = stub.SayHello(helloworld_pb2.HelloRequest(name='Stranger'))
    print('Response {}: {}'.format(count, response.message))

    # Danger! Don't forget to unsubscribe your callback!
    ## channel.unsubscribe(connectivity_callback)           #  <-----
    channel.close()                                         #  <-----

def connectivity_callback(state):
    pass  # no-op

if __name__ == '__main__':
    run()

Output: Example 3

  • 3.1
    $ python forget_to_unsubscribe.py
    
    Response 0: Hello, Stranger!
    Response 1: Hello, Stranger!
    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 801, in __bootstrap_inner
        self.run()
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 754, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/home/eheller/pew_envs/grpc-seg14/lib/python2.7/site-packages/grpc/_channel.py", line 801, in _poll_connectivity
        time.time() + 0.2)
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 451, in grpc._cython.cygrpc.Channel.watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 336, in grpc._cython.cygrpc._watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 344, in grpc._cython.cygrpc._watch_connectivity_state
    ValueError: Cannot invoke RPC on closed channel!
    
    ****  Cut for brevity  ****
    
    Response 4: Hello, Stranger!
    Exception in thread Thread-10:
    Traceback (most recent call last):
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 801, in __bootstrap_inner
        self.run()
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 754, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/home/eheller/pew_envs/grpc-seg14/lib/python2.7/site-packages/grpc/_channel.py", line 801, in _poll_connectivity
        time.time() + 0.2)
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 451, in grpc._cython.cygrpc.Channel.watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 336, in grpc._cython.cygrpc._watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 344, in grpc._cython.cygrpc._watch_connectivity_state
    ValueError: Cannot invoke RPC on closed channel!
    
    Segmentation fault
    

Example 4: Do everything right and still get segfaults

As far as I know, this example does everything correctly.
It segfaults less often than the others, but it still happens. It may be worth
mentioning that we have not observed this example segfaulting under Python 3.

from __future__ import print_function
import grpc
import helloworld_pb2
import helloworld_pb2_grpc

def run():
    for count in range(50):
        run_once(count)

def run_once(count):
    # This is how it's supposed to work, right?
    channel = grpc.insecure_channel('localhost:50051')
    stub = helloworld_pb2_grpc.GreeterStub(channel)

    channel.subscribe(connectivity_callback, try_to_connect=True)

    response = stub.SayHello(helloworld_pb2.HelloRequest(name='Stranger'))
    print('Response {}: {}'.format(count, response.message))

    channel.unsubscribe(connectivity_callback)
    channel.close()

def connectivity_callback(state):
    pass  # no-op

if __name__ == '__main__':
    run()

Output: Example 4

  • 4.1

    $ python the_right_way.py
    
    Response 0: Hello, Stranger!
    Response 1: Hello, Stranger!
    Response 2: Hello, Stranger!
    Response 3: Hello, Stranger!
    Segmentation fault
    
  • 4.2

    $ python the_right_way.py
    
    Response 0: Hello, Stranger!
    Response 1: Hello, Stranger!
    Response 2: Hello, Stranger!
    Response 3: Hello, Stranger!
    Exception in thread Thread-10:
    Traceback (most recent call last):
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 801, in __bootstrap_inner
        self.run()
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 754, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/home/eheller/pew_envs/grpc-seg14/lib/python2.7/site-packages/grpc/_channel.py", line 801, in _poll_connectivity
        time.time() + 0.2)
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 451, in grpc._cython.cygrpc.Channel.watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 336, in grpc._cython.cygrpc._watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 344, in grpc._cython.cygrpc._watch_connectivity_state
    ValueError: Cannot invoke RPC on closed channel!
    Response 4: Hello, Stranger!
    
    Response 5: Hello, Stranger!
    Response 6: Hello, Stranger!
    Response 7: Hello, Stranger!
    Response 8: Hello, Stranger!
    Response 9: Hello, Stranger!
    Response 10: Hello, Stranger!
    Response 11: Hello, Stranger!
    Segmentation fault
    

Example 5: If Example 4 didn't segfault, try it with a bunch of threads!

If the "correct" code seems like it's working, try running the same thing with
a small thread pool.

from __future__ import print_function
from concurrent import futures
import grpc
import helloworld_pb2
import helloworld_pb2_grpc

def run():
    executor = futures.ThreadPoolExecutor(max_workers=8)
    calls = [executor.submit(run_once, count) for count in range(128)]
    futures.wait(calls)

def run_once(count):
    channel = grpc.insecure_channel('localhost:50051')
    stub = helloworld_pb2_grpc.GreeterStub(channel)

    channel.subscribe(connectivity_callback, try_to_connect=True)

    response = stub.SayHello(helloworld_pb2.HelloRequest(name='Stranger'))
    print('Response {}: {}'.format(count, response.message))

    channel.unsubscribe(connectivity_callback)
    channel.close()

def connectivity_callback(state):
    pass  # no-op
    
if __name__ == '__main__':
    run()

Output: Example 5

  • 5.1

    $ python the_right_way_threaded.py 
    
    Response 0: Hello, Stranger!
    Response 2: Hello, Stranger!
    Response 1: Hello, Stranger!
    Response 4: Hello, Stranger!
    Response 3: Hello, Stranger!
    Response 5: Hello, Stranger!
    Segmentation fault
    
  • 5.2

    $ python the_right_way_threaded.py 
    
    Response 0: Hello, Stranger!
    Response 1: Hello, Stranger!
    Response 3: Hello, Stranger!Response 2: Hello, Stranger!
    
    Response 6: Hello, Stranger!
    Response 4: Hello, Stranger!
    Exception in thread Thread-3:
    Traceback (most recent call last):
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 801, in __bootstrap_inner
        self.run()
      File "/home/eheller/.pythonz/pythons/CPython-2.7.15/lib/python2.7/threading.py", line 754, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/home/eheller/pew_envs/grpc-seg14/lib/python2.7/site-packages/grpc/_channel.py", line 801, in _poll_connectivity
        time.time() + 0.2)
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 451, in grpc._cython.cygrpc.Channel.watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 336, in grpc._cython.cygrpc._watch_connectivity_state
      File "src/python/grpcio/grpc/_cython/_cygrpc/channel.pyx.pxi", line 344, in grpc._cython.cygrpc._watch_connectivity_state
    ValueError: Cannot invoke RPC on closed channel!
    
    Segmentation fault
    

Anything else we should know about your project / environment?

We are trying to upgrade from grpcio 1.11.1 to 1.14.x. These issues have
been a complete show-stopper for us.

I anticipate that only a small fraction of users make use of these Channel
connectivity-state subscriptions. However, they are quite important to our
implementation and our growing gRPC-based platform.

Thanks!
