
Rework queued span completer #816

Merged
janstenpickle merged 9 commits into master from queued-span-completer-contention on Nov 10, 2022

Conversation

@janstenpickle
Collaborator

The queued span completer was causing contention around the `Ref` used for
recording in-flight requests. Seeing as this was only really used to
ensure that the completer can be shut down properly, there's a better
way of doing this now that we have CE3.
- Use FS2 channels and their `close` semantics to back the `Queued` implementations
- Make adding to the logging queue async by putting `tryOffer` in a fire-and-forget fiber
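The fire-and-forget offer described in the second bullet can be sketched with a plain-JVM analogue (an assumption-laden stand-in: `ArrayBlockingQueue.offer` plays the role of fs2's non-blocking `tryOffer`, and a daemon thread plays the fiber):

```scala
import java.util.concurrent.ArrayBlockingQueue

// Bounded queue standing in for the logging channel (capacity 2 for the demo)
val queue = new ArrayBlockingQueue[String](2)

// Offer on a separate thread so the caller never blocks; a full queue simply
// drops the element, mirroring tryOffer's non-blocking semantics.
def fireAndForget(msg: String): Unit = {
  val t = new Thread(() => { queue.offer(msg); () })
  t.setDaemon(true)
  t.start()
  t.join() // joined only to keep this demo deterministic
}

fireAndForget("a")
fireAndForget("b")
fireAndForget("c") // queue is full: "c" is dropped, the caller still returns

println(queue.size) // prints 2
```

The point of the design is in the drop: under load the completer sheds log entries rather than making `complete` callers wait.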
for {
  channel    <- Resource.eval(Channel.bounded[F, CompletedSpan](realBufferSize))
  errorQueue <- Resource.eval(Queue.bounded[F, Either[Channel.Closed, Boolean]](1))
  _ <- Stream
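The key property the snippet relies on is that closing the `Channel` lets the draining stream finish cleanly. A rough plain-JVM analogue of that shutdown behaviour (a hypothetical sketch using a sentinel value, not the fs2 API):

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ListBuffer

// Sentinel-based close: once Closed is seen, the drain loop terminates,
// having exported everything enqueued before the close.
sealed trait Msg
final case class Span(name: String) extends Msg
case object Closed extends Msg

val channel  = new LinkedBlockingQueue[Msg]()
val exported = ListBuffer.empty[String]

channel.put(Span("a"))
channel.put(Span("b"))
channel.put(Closed) // analogue of channel.close marking the end of the stream

var open = true
while (open) channel.take() match {
  case Span(n) => exported += n
  case Closed  => open = false
}

println(exported.toList) // prints List(a, b)
```

This is what replaces the old `Ref` of in-flight requests: shutdown ordering falls out of the close semantics instead of being tracked by hand.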
I don't think we wait for this stream to terminate, so we might never log that the Channel is closed. Is that a problem?

Collaborator Author

Not really, I've just added the log for completeness.

  exporter,
  CompleterConfig(bufferSize = 50000, batchSize = 200)
).use { completer =>
  List.fill(10000)(completer.complete(builder)).sequence

Is it worth doing something here to ensure that these are definitely not all handled in one groupWithin? Or a separate test for that? Of course it should all be uncancelable, but it would be nice to be sure that the tests covered that.

Collaborator Author

They should all be handled in one groupWithin, as the batchSize is just 200; the bufferSize here dictates how big the channel is, not the group.


Just to be on the safe side, I'd suggest we might want the number here not to divide cleanly into batches (10001, for example). That way we verify that incomplete/partial batches are handled cleanly during shutdown.

Collaborator Author

Great idea, I'll change the tests.
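The suggestion in miniature, using plain-Scala `grouped` as a simplified stand-in for fs2's `groupWithin` (which also has a time window; the numbers are the ones from the test's `CompleterConfig`):

```scala
// With 10001 spans and a batchSize of 200, the last batch is a partial one
// that shutdown must still flush to the exporter.
val batchSize = 200
val spans     = List.range(0, 10001)
val batches   = spans.grouped(batchSize).toList

val full    = batches.count(_.size == batchSize)
val partial = batches.last

println(full)         // prints 50
println(partial.size) // prints 1
```

If shutdown only waited for full batches, that trailing single-span batch would be the one to go missing, which is exactly what the off-by-one test count is meant to catch.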

).use { completer =>
  val randomRequestTime = for {
    rand <- random.betweenDouble(0, 1)
    _    <- IO.sleep((1.second * rand).asInstanceOf[FiniteDuration])

Is this just jitter so we don't hammer it with everything at once?

Collaborator Author

Yep, I believe so. I think @sgjbryan wanted to improve this a bit.


I don't think we actually strictly need this jitter any more; it was just part of me playing around to find a failing test for the old implementation. Expect this can be simplified a bit now.

Collaborator Author

Sounds good, I just removed it.

@janstenpickle janstenpickle requested review from TimWSpence and sam-tombury and removed request for TimWSpence and sam-tombury November 10, 2022 13:43
@janstenpickle janstenpickle force-pushed the queued-span-completer-contention branch from 4e2966b to 4542f08 Compare November 10, 2022 13:45

@TimWSpence TimWSpence left a comment


+1 on Sam's test suggestion but LGTM!

@janstenpickle janstenpickle merged commit 258b5fc into master Nov 10, 2022
@janstenpickle janstenpickle deleted the queued-span-completer-contention branch November 10, 2022 15:23
@janstenpickle janstenpickle added the bug Something isn't working label Nov 10, 2022