feat(spans): Add FlusherLogger to track top flush operations by bytes#108266
feat(spans): Add FlusherLogger to track top flush operations by bytes#108266
Conversation
fpacifici
left a comment
There was a problem hiding this comment.
Please see the comment inline, specifically let's try to avoid the cross process communication for logging.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
fpacifici
left a comment
There was a problem hiding this comment.
Please consider addressing the comments before merging.
src/sentry/spans/buffer.py
Outdated
| if logger_enabled: | ||
| ids_start = time.monotonic() |
There was a problem hiding this comment.
nit: You do not need the option to be enabled to collect the time, this is safe enough and avoiding the if block makes it more concise.
There was a problem hiding this comment.
removed the if block. I wasn't sure if we're OK with the extra work to collect the time even if we disable the option, but I guess it is negligible enough
src/sentry/spans/buffer.py
Outdated
| if logger_enabled and segment: | ||
| try: | ||
| project_id, trace_id, _ = parse_segment_key(segment_key) | ||
| project_and_trace = f"{project_id.decode('ascii')}:{trace_id.decode('ascii')}" | ||
| flusher_log_entries.append( | ||
| FlusherLogEntry( | ||
| project_and_trace, | ||
| len(segment), | ||
| sum(len(s) for s in segment), | ||
| ) | ||
| ) | ||
| except Exception: | ||
| pass |
There was a problem hiding this comment.
Is there any reason not to put this inside the flusher logger?
AS this method is very long and complex, it is really important to try to keep outside everything not strictly needed to perform the business logic.
Please consider factoring this out.
There was a problem hiding this comment.
I put this here since parse_segment_key is also defined in this file.
Is it okay to move that function to a utils module of some sort so we can avoid the circular import (buffer -> buffer_logger -> buffer)?
There was a problem hiding this comment.
I would not call it util, but feel free to keep where it is and disregards this request
|
|
||
| del payloads[key] | ||
| del cursors[key] | ||
| del sizes[key] |
There was a problem hiding this comment.
I realized since we are dropping the segment and deleting span payload, next scan position, so we could might as well delete the byte size total of that segment (sizes[key])
src/sentry/spans/buffer_logger.py
Outdated
| """ | ||
|
|
||
| def __init__(self) -> None: | ||
| self._data: dict[str, tuple[int, int, int]] = {} |
There was a problem hiding this comment.
Please define a type for this tuple[int, int, int], it is hard to tell what it represents.
Moreover the self._data name makes it even harder to tell what this dict represents. Give it a more specific name. like self._trace_latencies
There was a problem hiding this comment.
thanks, I made a NamedTuple for FlusherLogEntry but missed this one. I refactored buffer_logger._data and flusher_logger._data to both be NamedTuple for clarity.
…#108266) Adds `FlusherLogger` to track per-trace flush operations and log the top 50 traces by cumulative bytes flushed every 60 seconds.
…#108266) Adds `FlusherLogger` to track per-trace flush operations and log the top 50 traces by cumulative bytes flushed every 60 seconds.
Adds
FlusherLoggerto track per-trace flush operations and log the top 50 traces by cumulative bytes flushed every 60 seconds.