Commit 0825039

Auto merge of #11032 - lqd:priority_pending_queue, r=Eh2406
Take priority into account within the pending queue

This is the PR for the work discussed in [this zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/pending.20queue.20scheduling.20experiments), whose detailed description and results are available [here](https://github.com/lqd/rustc-benchmarking-data/tree/main/experiments/cargo-schedules/pending-queue-sorted) with graphs, summaries, and raw data -- much of which was shown in the thread as well.

Units of work have a computed priority that is used in the dependency queue, so that higher priorities are dequeued sooner, as documented [here](https://github.com/rust-lang/cargo/blob/996a6363ce4b9109d4ca757407dd6dcb4805c86f/src/cargo/util/dependency_queue.rs#L34-L35). This PR further applies that principle to the next step before execution: if multiple pieces of work are waiting in the pending queue, we can keep that queue sorted by priority. Here as well, higher priorities should be scheduled sooner. High-priority units are more often than not wider than pure chains of dependencies, so this should create more parallelism opportunities earlier in the pipeline: a high-priority piece of work represents more future pieces of work down the line, and scheduling it first tries to sustain CPU utilization longer (at the potential cost of this parallelism being distributed differently than today, between cargo invoking rustc and rustc's own codegen threads -- when applicable).
This is a scheduling tradeoff that behaves differently for each project, machine configuration, amount of available parallelism at a given point in time, etc., but it seems to help more often than it hinders: at low core counts, and with enough units of work that there is jobserver token contention, choosing a "better" piece of work to run next may be possible. There is of course a bit of noise in the results linked above, and the 800 or so most popular crates.io crates are still a limited sample, but they are mostly meant to show a hopefully positive trend: while there are both improvements and regressions, that trend looks more positive than negative, with the wins being more numerous and of higher amplitude than the corresponding losses.

(A report on another scheduling experiment -- a superset of this PR, where I also simulate users manually giving a higher priority to `syn`, `quote`, and `serde_derive` -- [is available here](https://github.com/lqd/rustc-benchmarking-data/tree/main/experiments/cargo-schedules/pending-queue-prioritized) and further improves this PR's results: the regressions decrease in number and amplitude, while the improvements are bigger and more numerous. That could be future work to iterate upon this one.)

Since this mostly applies to clean builds, at low core counts, and with enough dependencies to have items waiting in the pending queue, I feel it also applies well to CI use-cases (especially on the free tiers). Somewhat reassuringly, and discussed in the thread but not in the report: I've also tried to make sure cargo and bootstrapping rustc are not negatively affected. cargo saw some improvements, and bootstrap stayed within today's variance of +/- 2 to 3%.
Similarly, since 3-year-old versions of some tokio crates (`0.2.0-alpha.1`) were the most negatively affected, I've also checked that recent tokio versions (`1.19`) are not disproportionately impacted: their simple readme example, the more idiomatic `mini-redis` sample, and some of my friends' tokio projects were either unaffected or saw some interesting improvements.

And here's a `cargo check -j2` graph to liven up this wall of text:

![some results of `cargo check -j2`](https://github.com/lqd/rustc-benchmarking-data/raw/main/experiments/cargo-schedules/pending-queue-sorted/images/check-j2-sorted.png)

---

I'm not a cargo expert, so I'm not sure whether it would be preferable to integrate priorities deeper than just the dependency queue, e.g. have `Unit`s contain a dedicated field or similar. In the meantime I've done the simplest thing: just sort the pending queue and query the dep queue for the units' priorities. We could just as well record the priority as part of the pending queue tuples themselves, or use some kind of priority queue/max-heap instead of a Vec. Let me know which you prefer, but it's in essence a very simple change as-is.
2 parents c633b27 + 4ffe98a commit 0825039

File tree

2 files changed: +29 -18 lines


src/cargo/core/compiler/job_queue.rs

+15 -4

@@ -162,7 +162,7 @@ struct DrainState<'cfg> {
     /// The list of jobs that we have not yet started executing, but have
     /// retrieved from the `queue`. We eagerly pull jobs off the main queue to
     /// allow us to request jobserver tokens pretty early.
-    pending_queue: Vec<(Unit, Job)>,
+    pending_queue: Vec<(Unit, Job, usize)>,
     print: DiagnosticPrinter<'cfg>,

     /// How many jobs we've finished
@@ -576,8 +576,16 @@ impl<'cfg> DrainState<'cfg> {
         // possible that can run. Note that this is also the point where we
         // start requesting job tokens. Each job after the first needs to
         // request a token.
-        while let Some((unit, job)) = self.queue.dequeue() {
-            self.pending_queue.push((unit, job));
+        while let Some((unit, job, priority)) = self.queue.dequeue() {
+            // We want to keep the pieces of work in the `pending_queue` sorted
+            // by their priorities, and insert the current job at its correctly
+            // sorted position: following the lower priority jobs, and the ones
+            // with the same priority (since they were dequeued before the
+            // current one, we also keep that relation).
+            let idx = self
+                .pending_queue
+                .partition_point(|&(_, _, p)| p <= priority);
+            self.pending_queue.insert(idx, (unit, job, priority));
             if self.active.len() + self.pending_queue.len() > 1 {
                 jobserver_helper.request_token();
             }
@@ -586,8 +594,11 @@ impl<'cfg> DrainState<'cfg> {
         // Now that we've learned of all possible work that we can execute
         // try to spawn it so long as we've got a jobserver token which says
         // we're able to perform some parallel work.
+        // The `pending_queue` is sorted in ascending priority order, and we
+        // remove items from its end to schedule the highest priority items
+        // sooner.
         while self.has_extra_tokens() && !self.pending_queue.is_empty() {
-            let (unit, job) = self.pending_queue.remove(0);
+            let (unit, job, _) = self.pending_queue.pop().unwrap();
             *self.counts.get_mut(&unit.pkg.package_id()).unwrap() -= 1;
             if !cx.bcx.build_config.build_plan {
                 // Print out some nice progress information.
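The sorted-insertion scheme above can be sketched in isolation. This is a minimal stand-in, not cargo's actual types: `(name, priority)` pairs play the role of the `(Unit, Job, usize)` tuples, and the hypothetical `push_sorted` helper mirrors the `partition_point` insert. Using `<=` in the predicate places a new item after every item of lower or equal priority, and popping from the back yields the highest priority first.

```rust
// Minimal sketch of the pending-queue ordering, with (name, priority)
// pairs standing in for cargo's (Unit, Job, usize) tuples.
fn push_sorted(pending: &mut Vec<(&'static str, usize)>, item: (&'static str, usize)) {
    // `partition_point` returns the first index whose element fails the
    // predicate, i.e. the slot after every lower-or-equal priority.
    let idx = pending.partition_point(|&(_, p)| p <= item.1);
    pending.insert(idx, item);
}

fn main() {
    let mut pending = Vec::new();
    push_sorted(&mut pending, ("a", 1));
    push_sorted(&mut pending, ("b", 3));
    push_sorted(&mut pending, ("d", 2));
    // The queue stays sorted in ascending priority order...
    assert_eq!(pending, vec![("a", 1), ("d", 2), ("b", 3)]);
    // ...and popping from the end yields the highest priority first.
    assert_eq!(pending.pop(), Some(("b", 3)));
}
```

A `Vec` kept sorted this way trades an O(n) insert for an O(1) pop; as the PR description notes, a binary heap would be an alternative, at the cost of losing the stable order among equal priorities.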

src/cargo/util/dependency_queue.rs

+14 -14

@@ -149,15 +149,15 @@ impl<N: Hash + Eq + Clone, E: Eq + Hash + Clone, V> DependencyQueue<N, E, V> {
     ///
     /// A package is ready to be built when it has 0 un-built dependencies. If
     /// `None` is returned then no packages are ready to be built.
-    pub fn dequeue(&mut self) -> Option<(N, V)> {
-        let key = self
+    pub fn dequeue(&mut self) -> Option<(N, V, usize)> {
+        let (key, priority) = self
             .dep_map
             .iter()
             .filter(|(_, (deps, _))| deps.is_empty())
-            .map(|(key, _)| key.clone())
-            .max_by_key(|k| self.priority[k])?;
+            .map(|(key, _)| (key.clone(), self.priority[key]))
+            .max_by_key(|(_, priority)| *priority)?;
         let (_, data) = self.dep_map.remove(&key).unwrap();
-        Some((key, data))
+        Some((key, data, priority))
     }

     /// Returns `true` if there are remaining packages to be built.
@@ -213,19 +213,19 @@ mod test {
         q.queue(5, (), vec![(4, ()), (3, ())], 1);
         q.queue_finished();

-        assert_eq!(q.dequeue(), Some((1, ())));
-        assert_eq!(q.dequeue(), Some((3, ())));
+        assert_eq!(q.dequeue(), Some((1, (), 5)));
+        assert_eq!(q.dequeue(), Some((3, (), 4)));
         assert_eq!(q.dequeue(), None);
         q.finish(&3, &());
         assert_eq!(q.dequeue(), None);
         q.finish(&1, &());
-        assert_eq!(q.dequeue(), Some((2, ())));
+        assert_eq!(q.dequeue(), Some((2, (), 4)));
         assert_eq!(q.dequeue(), None);
         q.finish(&2, &());
-        assert_eq!(q.dequeue(), Some((4, ())));
+        assert_eq!(q.dequeue(), Some((4, (), 3)));
         assert_eq!(q.dequeue(), None);
         q.finish(&4, &());
-        assert_eq!(q.dequeue(), Some((5, ())));
+        assert_eq!(q.dequeue(), Some((5, (), 2)));
     }

     #[test]
@@ -238,16 +238,16 @@ mod test {
         q.queue(4, (), vec![(2, ()), (3, ())], 1);
         q.queue_finished();

-        assert_eq!(q.dequeue(), Some((3, ())));
-        assert_eq!(q.dequeue(), Some((1, ())));
+        assert_eq!(q.dequeue(), Some((3, (), 9)));
+        assert_eq!(q.dequeue(), Some((1, (), 4)));
         assert_eq!(q.dequeue(), None);
         q.finish(&3, &());
         assert_eq!(q.dequeue(), None);
         q.finish(&1, &());
-        assert_eq!(q.dequeue(), Some((2, ())));
+        assert_eq!(q.dequeue(), Some((2, (), 3)));
         assert_eq!(q.dequeue(), None);
         q.finish(&2, &());
-        assert_eq!(q.dequeue(), Some((4, ())));
+        assert_eq!(q.dequeue(), Some((4, (), 2)));
         assert_eq!(q.dequeue(), None);
         q.finish(&4, &());
         assert_eq!(q.dequeue(), None);
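The `dequeue` change can likewise be sketched standalone. The `MiniQueue` below is a hypothetical stand-in, not cargo's real `DependencyQueue`: among the nodes whose dependency list is empty, it picks the one with the highest priority and, as in this PR, returns that priority alongside the node.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for DependencyQueue: each node maps to its
// unfinished dependencies, and separately to a computed priority.
struct MiniQueue {
    dep_map: HashMap<u32, Vec<u32>>,
    priority: HashMap<u32, usize>,
}

impl MiniQueue {
    // Among ready nodes (no unfinished deps), dequeue the one with the
    // highest priority, returning the priority as well.
    fn dequeue(&mut self) -> Option<(u32, usize)> {
        let (key, priority) = self
            .dep_map
            .iter()
            .filter(|(_, deps)| deps.is_empty())
            .map(|(key, _)| (*key, self.priority[key]))
            .max_by_key(|&(_, priority)| priority)?;
        self.dep_map.remove(&key);
        Some((key, priority))
    }
}

fn main() {
    let mut q = MiniQueue {
        dep_map: HashMap::from([(1, vec![]), (2, vec![1]), (3, vec![])]),
        priority: HashMap::from([(1, 5), (2, 4), (3, 9)]),
    };
    // Node 3 is ready and has the highest priority, so it comes out first.
    assert_eq!(q.dequeue(), Some((3, 9)));
    assert_eq!(q.dequeue(), Some((1, 5)));
    // Node 2 still has an unfinished dependency, so nothing is ready.
    assert_eq!(q.dequeue(), None);
}
```

Returning the priority from `dequeue` is what lets the job queue's pending queue stay sorted without the `Unit` itself carrying a priority field, which is the "simplest thing" trade-off the PR description mentions.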
