In order to better integrate async I/O, we are going to make it responsible for driving the scheduler event loop. I think we should take this opportunity to rewrite the scheduler in Rust, which takes us a long way toward rewriting the entire runtime in Rust and enabling freestanding Rust.
Links
rt module: https://github.com/brson/rust/blob/io/src/libcore/rt/mod.rs
io module: https://github.com/brson/rust/blob/io/src/libcore/rt/io/mod.rs
sched module: https://github.com/brson/rust/blob/io/src/libcore/rt/sched.rs
Current work items
Scheduler
I/O related
Runtime porting
Constraints
- I/O is provided per-scheduler (per-thread) by libuv and schedulers are driven by the event loop.
- I/O handles are tied to the event loop (and therefore thread/scheduler) on which they were created. They have 'scheduler affinity'. Because tasks may migrate threads and I/O handles may move between tasks, before every I/O operation a task must arrange to be running on the correct scheduler.
- We must be able to allocate an entire scheduler (thread) exclusively to a single task (called "1:1 scheduling"). This is important for making blocking calls and for getting better, more predictable scheduling by relying on the OS.
- We want to try a work stealing scheduling strategy. Under work stealing, tasks are scheduled greedily, as soon as they are available to run, on the thread that is trying to schedule them. Descheduled tasks are 'stolen' and executed by other threads. Work stealing appears to be widely accepted as a simple but well-performing way to schedule green threads. Should be good for data locality and require minimal locking.
Concepts
Task - The current state of a task. An owned type that is passed between schedulers and other runtime objects. It mostly provides task-local runtime services that are exposed through the language and core API's, e.g. local heap, logging , GC, unwinding, TLS. Unlike the previous scheduler, Task's do not need to be scheduled as coroutines.
Scheduler - Schedulers provide the mechanism for scheduling and descheduling Tasks within a single thread. The runtime may consist of multiple threads, each one with a Scheduler. A Scheduler instance is (almost) always accessible via thread-local storage.
Coroutine - The stored CPU context and the stack of a single task, executed by Schedulers. Not sure this is the best name yet. Would also like to use this for task-local coroutines that have no task state of their own.
SchedHandle - A handle for communicating to remote Schedulers. For task forwarding, probably also to wake up sleeping schedulers when new work becomes available.
- Task pinning - A pinned task is one that must always run on a specific scheduler. A task that is pinned contains a SchedHandle which, when scheduling, must be used to first send the task to the scheduler it is pinned to. This is for creating 1:1 relationships between tasks and schedulers, also for pinning to the platform thread.
- Task forwarding - Again, for 1:1 task/schedulers. Schedulers can be configured to only run tasks which are pinned to them. In these cases, the Scheduler contains a SchedHandle to some other Scheduler, and when scheduling a task which is not pinned it must send the task elsewhere.
- I/O scheduler affinity - I/O objects that are tied to a scheduler must similarly contain a SchedHandle and arrange to run on the correct thread before executing.
- Sleeping - Schedulers that cannot obtain a task to schedule goes to sleep. A sleeping scheduler must be woken by other schedulers to resume scheduling tasks.
Scheduler multithreading
Most of the thread synchronization in Rust should be performed in Rust tasks, using pipes. The Scheduler itself should do as little as it can.
Tasks are owned types. When they are running their ~Task pointer is stored in thread local storage. When they are not running they are owned by whatever object they are blocked on. Scheduling a task means acquiring ownership of a ~Task, then passing ownership to a Scheduler. Blocking a task means taking ownership of a ~Task from thread-local storage, stuffing it somewhere else and context switching.
There are several places where schedulers need to synchronize with the outside world.
WorkQueue - work stealing deque
MessageQueue - message passing between schedulers
WorkList - The shared list of WorkQueues of active schedulers and SchedHandles of sleeping schedulers.
SleeperList - The shared collection (currently a stack) of sleeping schedulers.
The first is the WorkQueue, which is a work-stealing deque. Tasks can be pushed onto or popped from the front of the queue by a single thread. Tasks can be popped from the back of the queue (stolen) by an arbitrary number of threads. The WorkQueue is the primary mechanism for distributing tasks across threads. Importantly, under the work stealing strategy, most of the burden of distributing tasks to threads is put on those threads that otherwise are not working. The local scheduler just drops work into the (lock-free) queue and goes about its business.
The second is the MessageQueue through which tasks are forwarded from other schedulers. Tasks can be popped from the front of the queue by a single thread (the local Scheduler). Tasks can be pushed onto the back of the queue by an arbitrary number of threads. Sends to the MessageQueue are performed with the SchedHandle and are accompanied by a signal to wake up the Scheduler if it is asleep.
The MessageQueue is higher priority than the WorkQueue, as the former contains tasks that must specifically be acted on by the associated Scheduler, while the later contains work that can be stolen by other threads. An example of where messages should probably be checked is before spawning - some other thread can create the task, but no other thread can handle the message.
The SchedHandle also serves as a reference count keeping schedulers running. Schedulers exit when there are no remaining I/O events, no available tasks to run, and no outstanding SchedHandles.
Work stealing
See #3095 (comment)
TODO
- Signalling and waking up schedulers
How do sleeping schedulers become aware that there is new work to be stolen? This needs to impose minimum synchronization on the non-sleeping schedulers. Papers seem to imply busy-waiting.
Risks
- The combination of task pinning and I/O affinity could lead to tricky situations where a task is pinned to one scheduler but has I/O affinity for a different scheduler.
- I still don't have a solution for
select-like behavior.
Scheduler policy and operations
TODO
- Impact of I/O
- Work stealing vs. work sharing
- 1:1 scheduling
Alternatives
- If I/O handles were not sendable then the logic for keeping I/O-bound tasks on the right thread would be simpler and less error-prone.
- It may be possible to not implement task pinning if we can successfully implement tasks as threads, entirely without the scheduler, and if we can live without having an event loop on the 'platform thread' (the C main thread). This wouldn't remove the need for the message queue though, because of I/O.
In order to better integrate async I/O, we are going to make it responsible for driving the scheduler event loop. I think we should take this opportunity to rewrite the scheduler in Rust, which takes us a long way toward rewriting the entire runtime in Rust and enabling freestanding Rust.
Links
rtmodule: https://github.com/brson/rust/blob/io/src/libcore/rt/mod.rsiomodule: https://github.com/brson/rust/blob/io/src/libcore/rt/io/mod.rsschedmodule: https://github.com/brson/rust/blob/io/src/libcore/rt/sched.rsOption: http://www.reddit.com/r/rust/comments/1fod5c/io_and_condition_handlers/Current work items
Scheduler
I/O related
Runtime porting
unkillableetc.Constraints
Concepts
Task- The current state of a task. An owned type that is passed between schedulers and other runtime objects. It mostly provides task-local runtime services that are exposed through the language and core API's, e.g. local heap, logging , GC, unwinding, TLS. Unlike the previous scheduler, Task's do not need to be scheduled as coroutines.Scheduler- Schedulers provide the mechanism for scheduling and descheduling Tasks within a single thread. The runtime may consist of multiple threads, each one with a Scheduler. A Scheduler instance is (almost) always accessible via thread-local storage.Coroutine- The stored CPU context and the stack of a single task, executed by Schedulers. Not sure this is the best name yet. Would also like to use this for task-local coroutines that have no task state of their own.SchedHandle- A handle for communicating to remote Schedulers. For task forwarding, probably also to wake up sleeping schedulers when new work becomes available.Scheduler multithreading
Most of the thread synchronization in Rust should be performed in Rust tasks, using pipes. The Scheduler itself should do as little as it can.
Tasks are owned types. When they are running their
~Taskpointer is stored in thread local storage. When they are not running they are owned by whatever object they are blocked on. Scheduling a task means acquiring ownership of a~Task, then passing ownership to aScheduler. Blocking a task means taking ownership of a~Taskfrom thread-local storage, stuffing it somewhere else and context switching.There are several places where schedulers need to synchronize with the outside world.
WorkQueue- work stealing dequeMessageQueue- message passing between schedulersWorkList- The shared list ofWorkQueues of active schedulers andSchedHandles of sleeping schedulers.SleeperList- The shared collection (currently a stack) of sleeping schedulers.The first is the
WorkQueue, which is a work-stealing deque. Tasks can be pushed onto or popped from the front of the queue by a single thread. Tasks can be popped from the back of the queue (stolen) by an arbitrary number of threads. The WorkQueue is the primary mechanism for distributing tasks across threads. Importantly, under the work stealing strategy, most of the burden of distributing tasks to threads is put on those threads that otherwise are not working. The local scheduler just drops work into the (lock-free) queue and goes about its business.The second is the
MessageQueuethrough which tasks are forwarded from other schedulers. Tasks can be popped from the front of the queue by a single thread (the local Scheduler). Tasks can be pushed onto the back of the queue by an arbitrary number of threads. Sends to the MessageQueue are performed with the SchedHandle and are accompanied by a signal to wake up the Scheduler if it is asleep.The MessageQueue is higher priority than the WorkQueue, as the former contains tasks that must specifically be acted on by the associated Scheduler, while the later contains work that can be stolen by other threads. An example of where messages should probably be checked is before spawning - some other thread can create the task, but no other thread can handle the message.
The SchedHandle also serves as a reference count keeping schedulers running. Schedulers exit when there are no remaining I/O events, no available tasks to run, and no outstanding SchedHandles.
Work stealing
See #3095 (comment)
TODO
How do sleeping schedulers become aware that there is new work to be stolen? This needs to impose minimum synchronization on the non-sleeping schedulers. Papers seem to imply busy-waiting.
Risks
select-like behavior.Scheduler policy and operations
TODO
Alternatives