fix: batch-level FPS tracking for smooth frame delivery #617
Replace per-frame delta FPS tracking with batch-level throughput
measurement (sum(frames) / sum(intervals)) in PipelineProcessor.
The old approach measured inter-frame deltas, but in batched pipelines
(e.g., 12 frames emitted at once), near-zero intra-batch deltas mixed
with large inter-batch gaps caused the FPS estimate to oscillate
permanently. This produced unstable frame pacing downstream.
The new approach records one sample per pipeline call as a
(num_frames, interval) tuple, then computes FPS over a sliding window
of 10 batches. This gives a stable, accurate estimate regardless of
batch size.
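The batch-level computation can be sketched as follows. This is a minimal illustration of the idea, not the actual PipelineProcessor code; `BatchFPSTracker` is a hypothetical name:

```python
from collections import deque


class BatchFPSTracker:
    """Sliding-window FPS from (num_frames, interval) batch samples."""

    def __init__(self, window_size: int = 10):
        # Keep only the most recent `window_size` batch samples.
        self._samples: deque[tuple[int, float]] = deque(maxlen=window_size)

    def record(self, num_frames: int, interval: float) -> None:
        """Record one pipeline call: frames emitted and elapsed time."""
        self._samples.append((num_frames, interval))

    @property
    def fps(self) -> float:
        # Throughput = total frames / total time over the window,
        # which is insensitive to how frames cluster within a batch.
        total_frames = sum(n for n, _ in self._samples)
        total_time = sum(t for _, t in self._samples)
        return total_frames / total_time if total_time > 0 else 0.0


tracker = BatchFPSTracker()
for _ in range(5):
    tracker.record(12, 0.5)  # 12 frames every 0.5 s
print(tracker.fps)  # → 24.0
```

Because each sample carries its own frame count, a batch of 12 and a batch of 1 contribute to the estimate in exact proportion to the frames they delivered.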
Benchmark results (10s runs, real PipelineProcessor with stub pipelines):
| Scenario | Metric | Old | New |
|---|---|---|---|
| LongLive (batch=12) | Jitter | 84.1% | 0.3% |
| | Stalls (>1.5x) | 7.9% | 0.0% |
| | FPS error | 34.3% | 0.2% |
| | FPS stdev | 7.23 | 0.01 |
| SDv2 (batch=4) | Jitter | 29.1% | 5.8% |
| | Stalls (>1.5x) | 4.6% | 0.0% |
| | FPS error | 13.8% | 1.0% |
| | FPS stdev | 5.24 | 0.69 |
| Single-frame | Jitter | 2.3% | 4.5% |
| | FPS error | 2.7% | 2.7% |
| | FPS stdev | 1.89 | 1.35 |
| Pause/resume | Jitter | 112.9% | 0.3% |
| | Stalls (>1.5x) | 7.3% | 0.0% |
| | FPS error | 53.8% | 0.2% |
| | FPS stdev | 12.10 | 0.01 |
Single-frame results are equivalent, confirming no regression for
non-batching pipelines.
Signed-off-by: RyanOnTheInside <[email protected]>
📝 Walkthrough: Refactored FPS tracking in PipelineProcessor from per-frame delta measurements to batch-based throughput sampling. Replaced frame-level state tracking with batch accumulation (num_frames, interval) for improved averaging. Added a comprehensive test suite validating frame delivery under various batch sizes, delays, and pause/resume scenarios.
Actionable comments posted: 1
⚠️ Outside diff range comments (1)
src/scope/server/pipeline_processor.py (1)
Lines 510-525: ⚠️ Potential issue | 🟠 Major — Track only frames that actually made it into `output_queue`.

`num_frames` is the tensor batch size, but lines 516-522 can drop frames on `queue.Full`. Feeding the full batch into `_track_output_batch()` inflates `current_output_fps` under backpressure, so consumers pace against frames they will never see.

Suggested fix:
```diff
-        for frame in output:
+        enqueued_frames = 0
+        for frame in output:
             frame = frame.unsqueeze(0)
             try:
                 self.output_queue.put_nowait(frame)
+                enqueued_frames += 1
             except queue.Full:
                 logger.debug(
                     f"Output queue full for {self.pipeline_id}, dropping processed frame"
                 )
                 continue

         # Track batch-level throughput for FPS calculation
-        self._track_output_batch(num_frames, processing_time)
+        if enqueued_frames > 0:
+            self._track_output_batch(enqueued_frames, processing_time)
```
In `tests/test_frame_delivery.py`:

```python
def _measure_delivery(
    batch_size: int,
    delay: float,
    duration: float,
    variance: float = 0.0,
    pause_at: float | None = None,
    pause_duration: float = 2.0,
) -> dict:
```
These tests are likely to be slow and flaky in CI.
The helper measures jitter from real time.sleep() / time.time() behavior, and the four callers keep the process alive for about 45 seconds total. On a busy runner, those thresholds can fail because of scheduler noise rather than an FPS regression. Please move this behind a slow/integration marker or inject a fake clock/sleeper so the pacing math is deterministic.
Also applies to: 68-121, 137-196
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_frame_delivery.py` around lines 39 - 46, The tests rely on real
time via the helper _measure_delivery which makes them slow and flaky; modify
_measure_delivery to accept injectable time and sleep callables (e.g., time_fn
and sleep_fn) and use those instead of time.time()/time.sleep(), then update its
callers to pass a deterministic fake clock/sleeper in unit tests (keeping the
real time functions only for an explicit integration/slow test). Alternatively,
mark the existing tests with a pytest marker like pytest.mark.slow or
pytest.mark.integration so CI can skip them by default; reference the
_measure_delivery helper and all its callers in this file when applying the
change.
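One way to implement the injectable-clock suggestion above is a fake clock whose `sleep()` advances time instantly. This is a sketch only; `FakeClock` and `measure_interval` are hypothetical names, not part of the test suite:

```python
class FakeClock:
    """Deterministic clock: sleep() advances simulated time instantly."""

    def __init__(self, start: float = 0.0):
        self.now = start

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        # No real waiting: just move the simulated clock forward.
        self.now += seconds


def measure_interval(sleep_fn, time_fn, delay: float) -> float:
    """Illustrative helper shaped like _measure_delivery's timing core."""
    start = time_fn()
    sleep_fn(delay)
    return time_fn() - start


clock = FakeClock()
# Measures exactly 0.5 s of simulated time with zero wall-clock cost.
print(measure_interval(clock.sleep, clock.time, 0.5))  # → 0.5
```

Passing `time.time` and `time.sleep` instead restores the real-time behavior for an explicit integration test, so the same helper serves both modes.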
🚀 fal.ai Preview Deployment: E2E tests run automatically against this deployment.
✅ E2E Tests passed. Test artifacts: check the workflow run for screenshots.
leszko left a comment:
I checked and this change caused a LongLive FPS drop from 18.8 to 17.2 FPS. I wonder what the cause is; maybe it's related to the fact that we now measure time after all frames pass through `frame = frame.unsqueeze(0)`. Or maybe it's something about the fallback mechanism: we now fall back to `processing_time`, which may be lower while the pipeline warms up.
Definitely something we need to investigate before merging this PR.
But conceptually, it looks fine.
The lower FPS you're seeing is expected: the old per-frame delta tracking was overestimating FPS for batched pipelines. The root cause is the per-frame delta sampling itself. When 12 frames are enqueued in a tight loop, 11 of the 30 sampled deltas are near-zero (intra-batch), which drags the average delta down and pushes the FPS estimate up. Simulating both algorithms against identical production patterns confirms this for LongLive (batch=12, 0.5s per batch, true throughput = 24 FPS), SDv2 (batch=4, 0.2s per batch, true throughput = 20 FPS), and single-frame (no batching, true throughput = 30 FPS).
The pipeline isn't producing frames any slower. The old number was inflated by burst delivery, and the client-side WebRTC stats were averaging over those bursts. The new code paces frames evenly at the true production rate, which looks smoother but shows a lower (accurate) number.
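To make the oscillation concrete, here is a standalone simulation (not code from this PR; the numbers mirror the LongLive scenario) comparing a mean-delta estimator over the last 30 inter-frame deltas with the batch-throughput estimator:

```python
# Simulate burst delivery: 12 frames emitted back-to-back every 0.5 s,
# so the true throughput is 24 FPS.
batch_size, batch_period, intra_delta = 12, 0.5, 0.001

timestamps = []
t = 0.0
for _ in range(20):  # 20 batches
    for i in range(batch_size):
        timestamps.append(t + i * intra_delta)
    t += batch_period

deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]

# Old-style estimator: 1 / mean of the last 30 inter-frame deltas,
# evaluated at every possible window position (phase).
window = 30
old_fps = [
    window / sum(deltas[i : i + window])
    for i in range(len(deltas) - window + 1)
]

# New-style estimator: sum(frames) / sum(intervals) over 10 batch samples.
samples = [(batch_size, batch_period)] * 10
new_fps = sum(n for n, _ in samples) / sum(p for _, p in samples)

print(f"old: {min(old_fps):.1f}-{max(old_fps):.1f} FPS, new: {new_fps:.1f} FPS")
```

Depending on where the 30-delta window lands relative to batch boundaries, it contains either two or three of the large inter-batch gaps, so the old estimate swings between roughly 20 and 30 FPS while the batch-level estimate stays pinned at 24.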
Signed-off-by: Rafal Leszko <[email protected]>
Ok, I reduced the