
client().shutdown() gives errors with KeyError: None and OSError: [WinError 10038] on the console #364

@natsrm


Dear @minrk and all, I've been pulling my hair out for the past few days trying to understand what's wrong with my code and how to fix it, so I hope you can help me!

Just so you can understand my "journey": I'm using ipyparallel together with bluepyopt to do a morphospace exploration, and the script calls MATLAB to run a function with input values that vary within a range. When I run my code in parallel (I start ipcontroller and ipengine manually), all engines finish except one, which stays "frozen" in MATLAB for hours (the longest I have left it running was around 8 h, while all other engines finish in less than 1 h). However, if I run the code calling the MATLAB function with the same values every time, all engines complete in less than 1 h, and at the end of the code the client's shutdown request gives an error. So I decided to go back to basics and see if this was a general problem with ipyparallel.
I wrote a script that starts a controller with one engine, starts a client, prints the engine ids, and then sends a shutdown request.
The code is as follows:

import os
import ipyparallel as ipp
from time import sleep
os.system('''START "controller" ipcontroller --log-to-file &''')
sleep(15)
tc = ipp.Client()
os.system('''START "engine 0" ipengine &''')
sleep(30)
tc = ipp.Client()
print(tc)
print(tc.ids)
sleep(10)
tc.shutdown(hub = True)
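
The fixed `sleep()` calls in the script above only guess how long the controller and the engine need to start. A minimal sketch of a polling alternative follows, using only the standard library. `wait_until` is a hypothetical helper, not part of ipyparallel, and the usage in the trailing comment assumes that `tc.ids` reflects newly registered engines each time it is read.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.5):
    """Poll predicate() until it returns a truthy value or `timeout` seconds pass.

    Returns True if the predicate became truthy in time, False otherwise.
    (Hypothetical helper, not an ipyparallel function.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Assumed usage with the script above (not executed here):
#   tc = ipp.Client()
#   if not wait_until(lambda: len(tc.ids) >= 1, timeout=60):
#       raise RuntimeError("no engine registered in time")
```

This does not address the shutdown errors themselves; it only removes the dependence on hard-coded waiting times when reproducing them.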

The console output:

<ipyparallel.client.client.Client object at 0x000001CE4D8714E0>
[0]

The controller window, right before shutting down, shows:

Unhandled message type: 'shutdown_notification'
ERROR:scheduler:Unhandled message type: 'shutdown_notification'

The error I see in the log file of the controller:

ERROR | Uncaught exception in <function Hub.dispatch_monitor_traffic at 0x000001F9A099C6A8>
Traceback (most recent call last):
File "C:\Users\Natii\Anaconda3\lib\site-packages\ipyparallel\util.py", line 119, in log_errors
return f(self, *args, **kwargs)
File "C:\Users\Natii\Anaconda3\lib\site-packages\ipyparallel\controller\hub.py", line 520, in dispatch_monitor_traffic
handler(idents, msg)
File "C:\Users\Natii\Anaconda3\lib\site-packages\ipyparallel\controller\hub.py", line 846, in monitor_iopub_message
uuid = self.engines[eid].uuid
KeyError: None
client::client b'\x00\x80\x00\x00-' requested 'shutdown_request'
hub::hub shutting down.
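
The `KeyError: None` in this traceback means that `self.engines` was indexed with `eid` equal to `None`. The sketch below shows how that exact message can arise in plain Python. It assumes, without having checked `hub.py`, that the engine id is obtained from a lookup that returns `None` for a sender that is unknown or already unregistered. The dictionaries and function names are hypothetical stand-ins, not the hub's real attributes.

```python
# Hypothetical stand-ins for the hub's bookkeeping (illustrative names only).
engines = {0: "engine-0-record"}       # engine id -> engine record
by_ident = {b"engine-0-ident": 0}      # connection identity -> engine id


def lookup(ident):
    """Unchecked lookup: fails with KeyError(None) for an unknown identity."""
    eid = by_ident.get(ident)          # None when the identity is not registered
    return engines[eid]                # engines[None] raises KeyError: None


def safe_lookup(ident):
    """Guarded lookup: returns None instead of raising for an unknown identity."""
    eid = by_ident.get(ident)
    if eid is None:
        return None
    return engines[eid]
```

If this assumption holds, the message would be consistent with output arriving from an engine that the hub no longer tracks, but that is a guess and not something the log confirms.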

I also sometimes (yes, only sometimes, don't ask me why...) get this output on the console:
(in the first exception, the value of the handle, 2204, changes every time, and in the second exception, the number of the thread, Thread-16, also changes)

Exception in callback BaseAsyncIOLoop._handle_events(2204, 1)
handle: <Handle BaseAsyncIOLoop._handle_events(2204, 1)>
Traceback (most recent call last):
File "C:\Users\Natii\Anaconda3\lib\asyncio\events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "C:\Users\Natii\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 122, in _handle_events
handler_func(fileobj, events)
File "C:\Users\Natii\Anaconda3\lib\site-packages\tornado\stack_context.py", line 300, in null_wrapper
return fn(*args, **kwargs)
File "C:\Users\Natii\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 465, in _handle_events
self._rebuild_io_state()
File "C:\Users\Natii\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 522, in _rebuild_io_state
self._update_handler(state)
File "C:\Users\Natii\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 539, in _update_handler
if state & self.socket.events:
File "C:\Users\Natii\Anaconda3\lib\site-packages\zmq\sugar\attrsettr.py", line 48, in __getattr__
return self._get_attr_opt(upper_key, opt)
File "C:\Users\Natii\Anaconda3\lib\site-packages\zmq\sugar\attrsettr.py", line 52, in _get_attr_opt
return self.get(opt)
File "zmq\backend\cython\socket.pyx", line 477, in zmq.backend.cython.socket.Socket.get
File "zmq\backend\cython\socket.pyx", line 135, in zmq.backend.cython.socket._check_closed
zmq.error.ZMQError: Unknown error
Exception in thread Thread-16:
Traceback (most recent call last):
File "C:\Users\Natii\Anaconda3\lib\threading.py", line 917, in _bootstrap_inner
self.run()
File "C:\Users\Natii\Anaconda3\lib\threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Natii\Anaconda3\lib\site-packages\ipyparallel\client\client.py", line 901, in _io_main
self._io_loop.start()
File "C:\Users\Natii\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 132, in start
self.asyncio_loop.run_forever()
File "C:\Users\Natii\Anaconda3\lib\asyncio\base_events.py", line 528, in run_forever
self._run_once()
File "C:\Users\Natii\Anaconda3\lib\asyncio\base_events.py", line 1728, in _run_once
event_list = self._selector.select(timeout)
File "C:\Users\Natii\Anaconda3\lib\selectors.py", line 323, in select
r, w, _ = self._select(self._readers, self._writers, [], timeout)
File "C:\Users\Natii\Anaconda3\lib\selectors.py", line 314, in _select
r, w, x = select.select(r, w, w, timeout)
OSError: [WinError 10038] An operation was attempted on something that is not a socket

So long story short, there is some error associated with client().shutdown() that happens with both hub = True and hub = False. The error on the console happens only when hub = True. I don't know if this is why my original code is not finishing...

I am using Windows 10 64-bit, Python version 3.7.1.final.0 on conda version 4.6.12.
I would really appreciate any help you could give me!!!
