warning for chunking gappy spools #538

ahmadtourei · 2025-09-15T23:22:38Z

Description

This PR addresses #509 by adding a warning when a gap is detected in the spool during chunking.

Checklist

I have (if applicable):

referenced the GitHub issue this PR closes.
documented the new feature with docstrings or appropriate doc page.
included a test. See testing guidelines.
your name has been added to the contributors page (docs/contributors.md).
added the "ready_for_review" tag once the PR is ready to be reviewed.

Summary by CodeRabbit

New Features
- Emits a UserWarning when gaps are detected but data is still treated as contiguous due to configured tolerance.
Documentation
- Clarified the tolerance parameter to specify the maximum allowable gap and explicitly define "gap."
Tests
- Updated tests to expect the new warning when processing gappy data while preserving existing behavior checks.

coderabbitai · 2025-09-15T23:22:45Z

Walkthrough

Docstring for BaseSpool.chunk clarified. ChunkManager now emits a UserWarning when a gap is detected but within the configured tolerance (message contains a literal "{self._tolerance}"). Tests updated to expect this warning when chunking gappy data. No API or behavioral logic changes.

Changes

Cohort / File(s)	Summary of Changes
Core docstring update `dascore/core/spool.py`	Clarified docstring for `BaseSpool.chunk` tolerance parameter to specify “maximum” gap and added “(gap)” clarification.
Chunk gap warning `dascore/utils/chunk.py`	Added `warnings` import and emit `UserWarning` when any gap is detected but considered contiguous due to tolerance; warning text includes a literal `"{self._tolerance}"` placeholder. Logic for group numbering remains unchanged.
Tests adjusted for warning `tests/test_utils/test_chunk.py`	Updated test to assert a `UserWarning` is raised when chunking gappy data (matching "There is a gap in the patch"). Duration verification remains unchanged.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch warning_chunk

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 825f962 and 55a1d38.

📒 Files selected for processing (1)

dascore/utils/chunk.py (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)

GitHub Check: test_code (windows-latest, 3.11)
GitHub Check: test_code (windows-latest, 3.10)
GitHub Check: test_code (ubuntu-latest, 3.12)
GitHub Check: test_code (windows-latest, 3.13)
GitHub Check: test_code (ubuntu-latest, 3.13)
GitHub Check: test_code (macos-latest, 3.12)
GitHub Check: test_code (macos-latest, 3.10)
GitHub Check: test_code (windows-latest, 3.12)
GitHub Check: test_code (ubuntu-latest, 3.10)
GitHub Check: test_code (ubuntu-latest, 3.11)
GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
GitHub Check: test_code_min_deps (macos-latest, 3.12)
GitHub Check: test_code (macos-latest, 3.11)
GitHub Check: test_code_min_deps (windows-latest, 3.13)
GitHub Check: test_code_min_deps (windows-latest, 3.12)
GitHub Check: test_code_min_deps (macos-latest, 3.13)
GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)

🔇 Additional comments (1)

dascore/utils/chunk.py (1)

5-5: LGTM: warnings import added where needed.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Pre-merge checks

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	The PR description follows the repository template by including "## Description" and the Checklist and clearly states the intent to address #509 by adding a warning when a gap is detected during chunking. It is concise and on-topic, giving reviewers the high-level purpose, but omits brief implementation details (e.g., which modules/tests were changed) and leaves checklist items unchecked. Because it links the issue and states the change, the description is mostly complete for review. Adding a short note about the modified files and the included test would improve clarity.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title Check	✅ Passed	The title "warning for chunking gappy spools" is concise and accurately summarizes the primary change in the PR — adding a warning when chunking spools/patches that contain gaps — and matches the code and test updates in the summary.

codecov · 2025-09-15T23:24:56Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.88%. Comparing base (13b2bf2) to head (55a1d38).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #538   +/-   ##
=======================================
  Coverage   99.88%   99.88%           
=======================================
  Files         122      122           
  Lines       10044    10048    +4     
=======================================
+ Hits        10032    10036    +4     
  Misses         12       12

Flag	Coverage Δ
unittests	`99.88% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

dascore/core/spool.py (1)
135-136: Clarify tolerance units; it's a multiple of the sampling interval, not a count of samples.

Current phrasing suggests an integer “number of samples.” In practice, tolerance is a scalar multiplier applied to the sampling step (can be non‑integer). Recommend rewording for accuracy.

Apply this diff:
-            The maximum number of samples a block of data can be spaced (gap) and
-            still be considered contiguous.
+            The maximum multiple of the sampling interval that a gap may be and
+            still be considered contiguous (e.g., 1.5 allows gaps up to 1.5 times
+            the step along the chunking dimension).
tests/test_utils/test_chunk.py (1)
261-263: Good: test now asserts a warning on gappy input.

This covers the new behavior.

Also consider asserting the same warning for the unordered gappy case to avoid stray uncaptured warnings in stricter CI configs:
# in test_doesnt_merge_unordered_gappy_df
with pytest.warns(UserWarning, match=r"There is a gap in the patch"):
    _, out = cm.chunk(df)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13b2bf2 and 825f962.

📒 Files selected for processing (3)

dascore/core/spool.py (1 hunks)
dascore/utils/chunk.py (2 hunks)
tests/test_utils/test_chunk.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/test_utils/test_chunk.py (2)

dascore/core/spool.py (2)

chunk (111-158)

chunk (511-541)

dascore/utils/chunk.py (1)

chunk (428-470)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)

GitHub Check: test_code (macos-latest, 3.12)
GitHub Check: test_code (macos-latest, 3.10)
GitHub Check: test_code (windows-latest, 3.11)
GitHub Check: test_code (ubuntu-latest, 3.10)
GitHub Check: test_code (ubuntu-latest, 3.11)
GitHub Check: test_code (windows-latest, 3.12)
GitHub Check: test_code_min_deps (windows-latest, 3.13)
GitHub Check: test_code (windows-latest, 3.10)
GitHub Check: test_code (macos-latest, 3.13)
GitHub Check: test_code (ubuntu-latest, 3.13)
GitHub Check: test_code (windows-latest, 3.13)
GitHub Check: test_code (macos-latest, 3.11)
GitHub Check: test_code (ubuntu-latest, 3.12)
GitHub Check: test_code_min_deps (macos-latest, 3.13)
GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
GitHub Check: test_code_min_deps (windows-latest, 3.12)
GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)

🔇 Additional comments (1)

dascore/utils/chunk.py (1)

5-5: LGTM: warnings import added where needed.

Import is scoped appropriately in this module.

coderabbitai · 2025-09-15T23:28:53Z

dascore/utils/chunk.py

+        if bool(np.asarray(has_gap).any()):
+            msg = (
+                "There is a gap in the patch. As a result, some patches in the "
+                "chunked spool may be unevenly sampled. However, they are "
+                "still considered contiguous as a tolerance of {self._tolerance} "
+                "was used in the chunk function."
+            )
+            warnings.warn(msg, UserWarning, stacklevel=2)


⚠️ Potential issue

Fix unformatted placeholder in warning and tighten check; consider better stacklevel.

{self._tolerance} is not interpolated; the message will literally show the braces.

has_gap.any() is simpler and avoids unnecessary numpy conversion.

stacklevel=2 points at an internal helper; bumping helps users see the call site (e.g., ChunkManager.chunk).

Apply this diff:

- if bool(np.asarray(has_gap).any()): - msg = ( - "There is a gap in the patch. As a result, some patches in the " - "chunked spool may be unevenly sampled. However, they are " - "still considered contiguous as a tolerance of {self._tolerance} " - "was used in the chunk function." - ) - warnings.warn(msg, UserWarning, stacklevel=2) + if has_gap.any(): + msg = ( + f"There is a gap in the patch. As a result, some patches in the " + f"chunked spool may be unevenly sampled. However, they are still " + f"considered contiguous because a tolerance of {self._tolerance} " + f"times the {self._name}_step was used in the chunk function." + ) + warnings.warn(msg, UserWarning, stacklevel=3)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In dascore/utils/chunk.py around lines 204 to 211, the warning message uses an unformatted placeholder "{self._tolerance}" (so it prints literally), converts has_gap to a numpy array unnecessarily, and uses stacklevel=2 which points to an internal helper instead of the calling site; fix by changing the condition to use has_gap.any(), format the message to interpolate self._tolerance (e.g., f-string or .format) so the actual tolerance value appears, and increase the warnings.warn stacklevel (e.g., to 3 or 4) so the warning points at the user call site such as ChunkManager.chunk.

d-chambers

LGTM

d-chambers · 2025-09-16T06:06:34Z

As an aside, code rabbit is starting to be quite helpful isn't it?

ahmadtourei · 2025-09-16T13:47:48Z

As an aside, code rabbit is starting to be quite helpful isn't it?

Yep!

warning for gappy patches

825f962

ahmadtourei added the spool related to Spool class label Sep 15, 2025

coderabbitai bot reviewed Sep 15, 2025

View reviewed changes

address coderabbit

55a1d38

ahmadtourei changed the title ~~warning for gappy patches~~ warning for chunking gappy spools Sep 15, 2025

d-chambers approved these changes Sep 16, 2025

View reviewed changes

ahmadtourei merged commit c2c17a1 into master Sep 16, 2025
22 checks passed

ahmadtourei deleted the warning_chunk branch September 16, 2025 13:46

ahmadtourei mentioned this pull request Sep 16, 2025

patch.coords.step("time") returns None #509

Closed

d-chambers mentioned this pull request Sep 24, 2025

add f string to print gap info #554

Merged

5 tasks

coderabbitai bot mentioned this pull request Nov 20, 2025

fix #583 #584

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

warning for chunking gappy spools #538

warning for chunking gappy spools #538

Uh oh!

ahmadtourei commented Sep 15, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Sep 15, 2025 •

edited

Loading

Uh oh!

codecov bot commented Sep 15, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Sep 15, 2025

Uh oh!

d-chambers left a comment

Uh oh!

d-chambers commented Sep 16, 2025

Uh oh!

Uh oh!

ahmadtourei commented Sep 16, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

warning for chunking gappy spools #538

warning for chunking gappy spools #538

Uh oh!

Conversation

ahmadtourei commented Sep 15, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Pre-merge checks

Uh oh!

codecov bot commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 15, 2025

Choose a reason for hiding this comment

Uh oh!

d-chambers left a comment

Choose a reason for hiding this comment

Uh oh!

d-chambers commented Sep 16, 2025

Uh oh!

Uh oh!

ahmadtourei commented Sep 16, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ahmadtourei commented Sep 15, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 15, 2025 •

edited

Loading

codecov bot commented Sep 15, 2025 •

edited

Loading