Conversation
rprouse
left a comment
There was a problem hiding this comment.
This looks good, but I would like to get a second review on this if anyone has time. The meat of the change is in WorkItemQueue, so that is where I would like to see a second review, especially around the dequeuing of work items.
|
I did a bit of refactoring in advance to enable the change I wanted to make, so it may be easier to understand by looking at each commit. |
|
@ChrisMaddock @jnm2 Can one of you give this a second review? |
|
It's on my 'todo' list - afraid this week is a busy one though! Aiming to look at this and #2337 at some point this week. 🙂 |
|
@ChrisMaddock That's cool. You weren't @mentioned and I'm never sure who sees things in that case. |
| while (work == null) | ||
| foreach (var q in _innerQueues) | ||
| if (q.TryDequeue(out work)) | ||
| break; |
There was a problem hiding this comment.
@CharliePoole I am fine with merging if @ChrisMaddock or @jnm2 don't have time for a review. All of the code looks good to me, but something about this is bothering me. It is a gut reaction and I can't put my finger on it.
My thinking is that if there is nothing in any of the queues, work will always be null and this go into a hard spin. The break will just break out of the foreach. Am I incorrect? Shouldn't this just be something like,
foreach (var q in _innerQueues)
if (q.TryDequeue(out work))
break;
if(work == null)
continue;Which will allow another try at the Spin/Wait code above?
Or am I misunderstanding the Spin/Wait code above and it can never get here if there is nothing in any of the queues? If so, why do we need the while?
This is what concerns me about the lockless code that was added, it is very hard to understand all of the moving parts when you haven't looked at it in awhile.
There was a problem hiding this comment.
I don't believe so. The old code also had a loop using TryDequeue(), so it kept looping until something was found in the queue...
while (!_innerQueue.TryDequeue(out work){ }In this case, we want to try two queues. I could have written...
while (!_innerQueues[0].TryDequeue(out work) && !_innerQueues[1](out work){ }However, I wanted to keep the number of priorities open, at least in theory, so I used the loop.
What the SpinWait does is ensure that only one thread can be in this code at a single time. Another thread will need to come along and add a WorkItem to one of the queues, of course.
There was a problem hiding this comment.
As I said, the code in question bothers me, but as you say it seems to be functionally equivalent. I am going to chalk it up to my feeling uncomfortable with the existing lockless implementation. You are correct that the two implementations are equivalent, but I think the second is more obvious, thus my disquiet.
Fixes #2335
This seems to fix it but requires some review since it's an area where we do not have tests. It's really not unit-testable and we have not set up an infrastructure for running integrated tests, especially where parallelis is involved. That said, I've tested a fair bit to ensure it does no harm. I ran with five levels of priority and generated random priorities for every work item, with only two test failures, which depended on the order of execution.
We now have a priority queue embedded in WorkItemQueue, implemented as an array of individual queues. There is no public access to this feature and the implementation is strictly limited to what we need for two levels of priority, one for OneTimeTearDown and one for everything else. I think we should be conservative in extending this and do it in very very small steps if we do.
I also did some refactoring to make it easier to implement the change. Notably, WorkItem is now responsible for calculating and remembering its own ExecutionStrategy.
I'd like to get this merged into a dev build and recruit some of the users who have reported the issue to try it out.