-
Notifications
You must be signed in to change notification settings - Fork 332
Can't log from within tpool.execute #432
Description
If you try to log from within a function called by tpool.execute, there is a chance that the tpool thread never returns, and you see a stack trace like this one:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 458, in fire_timers
timer()
File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
cb(*args, **kw)
File "/usr/local/lib/python2.7/dist-packages/eventlet/semaphore.py", line 147, in _do_acquire
waiter.switch()
error: cannot switch to a different thread
This is because many logging handlers have mutexes that are threading._RLock objects, and Eventlet's replacement for thread.allocate_lock returns an eventlet.semaphore.Semaphore object, which does not work across different hubs in different pthreads.
Here's a small script to reproduce the issue:
#!/usr/bin/env python
#
# Demonstrates the crash with logging across pthreads
import eventlet.patcher
import eventlet.tpool
import logging
import random
import sys
import time
eventlet.patcher.monkey_patch()
logger = logging.getLogger("logger-test")
# This handler's .lock is a threading._RLock
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.setLevel(logging.DEBUG)
def log_n_times(me, n):
for x in range(n):
logger.info("%s %d", me, x)
time.sleep(random.random() * 0.01)
logger.info("starting")
greenthread = eventlet.spawn(log_n_times, 'greenthread', 50)
eventlet.tpool.execute(log_n_times, 'pthread', 50)
greenthread.wait()
logger.info("done")
The bug was originally found in Openstack Swift, and there's a better writeup at https://bugs.launchpad.net/swift/+bug/1710328 and a commit at openstack/swift@6d16079 that fixes the problem, but only for Swift, and not in a general way.
I'd like to figure out how to fix this in general, but I'm not sure how to proceed. Using a pipe-based mutex for all _RLock objects would work, but would be a very expensive fix. Perhaps just the locks in logging handlers, because those are global?