
Conversation

@d-chambers (Contributor) commented Sep 24, 2025

Description

This PR replaces the old ASV benchmarks, which were cumbersome enough that they were rarely run, with CodSpeed benchmarks that run automatically in CI.
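For context, the new benchmarks are plain pytest tests marked for CodSpeed. A minimal sketch of the style (the fixture and filter call here are illustrative, not copied from this PR):

import pytest

import dascore as dc


@pytest.fixture(scope="session")
def example_patch():
    """Build an example patch once so setup cost is not re-measured."""
    return dc.get_example_patch()


@pytest.mark.benchmark
def test_pass_filter_performance(example_patch):
    """CodSpeed measures the body of tests carrying the benchmark marker."""
    example_patch.pass_filter(time=(10, 50))

Locally the suite runs with `pytest benchmarks --codspeed` once pytest-codspeed is installed.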

Checklist

I have (if applicable):

  • referenced the GitHub issue this PR closes.
  • documented the new feature with docstrings or appropriate doc page.
  • included a test. See testing guidelines.
  • added my name to the contributors page (docs/contributors.md).
  • added the "ready_for_review" tag once the PR is ready to be reviewed.

Summary by CodeRabbit

  • New Features

    • Continuous performance monitoring via CodSpeed with a new CI workflow and profile mode.
  • Tests

    • Replaced legacy ASV suites with pytest/CodSpeed benchmarks covering I/O, patch processing, transforms, visualization, aggregation, rolling, and spool scenarios; added robust skip helpers for missing deps/timeouts.
  • Documentation

    • Updated benchmarking docs and README; added CodSpeed and contributors badges.
  • Chores

    • Removed ASV configs, scripts, and utilities; updated test profiles, CI test script, and .gitignore.

@coderabbitai bot (Contributor) commented Sep 24, 2025

Walkthrough

Removes ASV configuration, scripts, and legacy ASV benchmark modules; adds pytest‑CodSpeed benchmark tests and fixtures; updates docs, CI workflow, pyproject profiles, and .gitignore; and replaces the test helper skip_missing_dependency() with two context managers skip_missing() and skip_timeout() in tests.

Changes

Cohort / File(s) Change Summary
Remove ASV config & orchestration scripts
asv.conf.json, scripts/_benchmark_uilts.py, scripts/run_benchmarks.py, scripts/visualize_benchmarks.py
Deleted ASV configuration and all scripts/utilities used to run, compare, and visualize ASV benchmarks (including git/context helpers and asv orchestration).
Remove legacy ASV benchmark modules
benchmarks/generic_io_benchmarks.py, benchmarks/memory_spool_benchmarks.py, benchmarks/patch_benchmarks.py, benchmarks/__init__.py
Removed legacy ASV benchmark suites, module docstring/annotations, and all related classes/functions.
Add pytest‑CodSpeed benchmarks
benchmarks/test_io_benchmarks.py, benchmarks/test_patch_benchmarks.py, benchmarks/test_spool_benchmarks.py
Added pytest-based benchmark modules with session fixtures, helpers, and many benchmark tests for IO, patch, and spool operations (pytest-codspeed/pytest-benchmark style).
Documentation updates
benchmarks/readme.md, docs/contributing/profiling_benchmarks.qmd, docs/contributors.qmd, readme.md
Replaced ASV-focused docs with CodSpeed/pytest instructions, updated profiling guidance, badges, and migration notes; removed a contributors badge image.
CI / tooling changes
.github/workflows/profile.yml, .github/test_code.sh, pyproject.toml, .gitignore
Added CodSpeed GitHub Actions workflow; added "profile" mode to test script; updated pyproject profiles to use pytest-codspeed/pytest-profiling; added .codspeed and prof/ to .gitignore.
Tests: helper rename / split
tests/test_io/test_common_io.py
Replaced skip_missing_dependency() with two context managers skip_missing() and skip_timeout(); updated imports and wrapped IO/test fetches to use the new helpers.

Possibly related PRs

  • add Sintela binary v3 support #516 — Modifies tests/test_io/test_common_io.py (adds SintelaBinaryV3 to COMMON_IO_READ_TESTS); touches the same test helper area as this change.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly captures the primary change by indicating the adoption of CodSpeed for performance benchmarks. It directly reflects the main goal of replacing the previous benchmarking tool and is free of extraneous details. Its brevity and clarity enable team members to understand the PR’s purpose at a glance.
Description Check ✅ Passed The pull request description follows the repository’s provided template by including both the “## Description” and “## Checklist” sections with appropriate content. The description clearly explains the replacement of ASV benchmarks with CodSpeed and the CI integration. The checklist is present even if items are left unchecked, satisfying the required structure. Therefore, the description is mostly complete and adheres to the template.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


@coderabbitai bot added the documentation (Improvements or additions to documentation) and ready_for_review (PR is ready for review) labels on Sep 24, 2025
@codecov bot commented Sep 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.92%. Comparing base (b115a76) to head (07df64b).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #552   +/-   ##
=======================================
  Coverage   99.92%   99.92%           
=======================================
  Files         126      126           
  Lines       10472    10472           
=======================================
  Hits        10464    10464           
  Misses          8        8           
Flag Coverage Δ
unittests 99.92% <ø> (ø)

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
.codspeed/results_1758699027159.json (1)

1-1287: Do not commit CodSpeed result artifacts (repo bloat and PII).

This file contains local env paths (e.g., /home/derrick/...), toolchain details, and hundreds of KB of data. Results should be uploaded by CodSpeed and not tracked in git.

  • Remove .codspeed/*.json from the repo.
  • Add .codspeed/ to .gitignore.
  • Let CI publish results via CodSpeed action.

Proposed .gitignore entry (root):

+.codspeed/
.codspeed/results_1758699590867.json (1)

1-1287: Remove committed benchmark results; ignore .codspeed directory.

Same concerns as other result dumps: repository bloat and leakage of local environment paths. Store these externally (CodSpeed dashboard), not in git.

Add to .gitignore:

+.codspeed/
.codspeed/results_1758699042297.json (1)

1-1287: Remove committed CodSpeed results (.codspeed/) and ignore the directory

These .codspeed/*.json files contain local absolute paths and environment info — remove them from the PR (delete/tracked files) and add the directory to .gitignore.

Diff to add:

+.codspeed/

Also: URIs are mixed — two results reference tests/benchmarks/... (.codspeed/results_1758699042297.json, .codspeed/results_1758699027159.json) while others use benchmarks/... — standardize to a single canonical path.

🧹 Nitpick comments (15)
.gitignore (1)

86-86: Optional: anchor to repo root

To avoid matching nested dirs, anchor the ignore to root and ignore contents.

-.codspeed
+/.codspeed
+# or, if you prefer explicit contents
+/.codspeed/**
pyproject.toml (3)

76-85: Duplicate pytest-cov entry.

pytest-cov appears twice (>=4 and unpinned). Keep one spec to avoid confusion.

Apply this diff:

 test = [
     "coverage>=7.4,<8",
     "pytest-cov>=4",
     "pre-commit",
     "pytest",
     "pytest-codeblocks",
-    "pytest-cov",
+    # pytest-cov already included above
     "pytest-codspeed",
     "twine",
 ]

83-83: Pin pytest-codspeed minimum version.

CI workflow uses CodSpeedHQ/action@v4 which aligns with pytest-codspeed v4 behavior. Pin a compatible lower bound.

-    "pytest-codspeed",
+    "pytest-codspeed>=4",

200-205: Register pytest marker to silence warnings.

If using a custom marker (benchmark/codspeed) in tests, register it to avoid Pytest unknown marker warnings.

Example (adjust marker name to match actual usage):

 [tool.pytest.ini_options]
 filterwarnings = [
     # Ignore hdf5 warnings from pytables, See pytables #1035
     'ignore::Warning:tables:'
 ]
+markers = [
+    "benchmark: performance benchmark tracked by CodSpeed",
+]
benchmarks/readme.md (1)

35-39: Align branch naming with workflow triggers.

Docs say “main/master”, but the workflow triggers only on master. Either add main to the workflow or update docs for consistency.

- - Push to main/master branch
+ - Push to master branch

Or add main to workflow.

.github/workflows/codspeed.yml (3)

19-26: Avoid mixing Python 3.11 setup with a separate 3.13 conda env.

actions/setup-python@v5 installs 3.11, then you install a 3.13 mamba env. The first step is likely redundant and can confuse PATH/venv resolution.

  • Remove actions/setup-python if you run in the mamba env.
  • Or make both versions consistent.
-      - uses: actions/setup-python@v5
-        with:
-          python-version: "3.11"

27-33: Ensure pytest runs inside the conda/mamba env.

Composite actions don’t persist “activated” envs across steps unless they export PATH/env. CodSpeed action runs a command; if the env isn’t active, it may use system Python.

  • Either ensure the mamba action exports PATH for subsequent steps
  • Or invoke pytest explicitly via conda run.

Example using conda run:

       - name: Run benchmarks
         uses: CodSpeedHQ/action@v4
         with:
           mode: instrumentation
-          run: pytest benchmarks/ --codspeed
+          run: conda run -n dascore pytest benchmarks/ --codspeed
           token: ${{ secrets.CODSPEED_TOKEN }}  # Optional for public repos

Adjust env name to match your installer action.

Please confirm the composite action activates the env for later steps.


10-13: Optional: cancel superseded runs on PRs.

Add a concurrency group to reduce CI load for fast‑iterating PRs.

 jobs:
   benchmarks:
     name: Run benchmarks
     runs-on: ubuntu-latest
+    concurrency:
+      group: codspeed-${{ github.workflow }}-${{ github.ref }}
+      cancel-in-progress: true
benchmarks/test_io_benchmarks.py (4)

5-5: Use lru_cache for broader Python compatibility, or confirm min version.

functools.cache requires Python ≥3.9. If you support older versions, replace with lru_cache(maxsize=None).

-from functools import cache
+from functools import lru_cache as cache

16-18: Minor: avoid iterrows overhead.

Use vectorized name access or itertuples for faster/lighter iteration.

-    df = get_registry_df().loc[lambda x: ~x["name"].str.endswith(".csv")]
-    out = {row["name"]: fetch(row["name"]) for _, row in df.iterrows()}
+    df = get_registry_df().loc[lambda x: ~x["name"].str.endswith(".csv")]
+    names = df["name"].tolist()
+    out = {name: fetch(name) for name in names}

30-31: Disable progress reporting in tight loops to reduce overhead.

scan supports a progress argument; disabling it yields cleaner timing.

-        dc.scan(path)
+        dc.scan(path, progress="off")

37-38: Disable progress for scan_to_df as well.

-        dc.scan_to_df(path)
+        dc.scan_to_df(path, progress="off")
benchmarks/test_patch_benchmarks.py (1)

12-16: Consider session-scoping and seeding to stabilize timings.

Creating a fresh random example per test can add variance. Optionally use scope="session" and seed RNG to reduce noise.

-@pytest.fixture
-def example_patch():
+@pytest.fixture(scope="session")
+def example_patch():
     """Get the example patch for benchmarks."""
-    return dc.get_example_patch()
+    np.random.seed(0)
+    return dc.get_example_patch()
benchmarks/test_spool_benchmarks.py (2)

28-36: Helper asserts are useful; minor rename for clarity.

Function chunks rather than merges; consider renaming to _chunk_and_check for consistency with its use/comments.

-    """Helper function to merge and check the spool."""
+    """Helper function to chunk and check the spool."""

41-45: Docstring wording mismatch (“merge” vs chunk).

Bench is chunking, not merging; update wording for clarity.

-    """Time merging contiguous patches from in-memory spool."""
+    """Time chunking contiguous patches from in-memory spool."""
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b115a76 and 5d5077c.

📒 Files selected for processing (20)
  • .codspeed/results_1758699027159.json (1 hunks)
  • .codspeed/results_1758699042297.json (1 hunks)
  • .codspeed/results_1758699590867.json (1 hunks)
  • .codspeed/results_1758700287344.json (1 hunks)
  • .github/workflows/codspeed.yml (1 hunks)
  • .gitignore (1 hunks)
  • asv.conf.json (0 hunks)
  • benchmarks/__init__.py (0 hunks)
  • benchmarks/generic_io_benchmarks.py (0 hunks)
  • benchmarks/memory_spool_benchmarks.py (0 hunks)
  • benchmarks/patch_benchmarks.py (0 hunks)
  • benchmarks/readme.md (1 hunks)
  • benchmarks/test_io_benchmarks.py (1 hunks)
  • benchmarks/test_patch_benchmarks.py (1 hunks)
  • benchmarks/test_spool_benchmarks.py (1 hunks)
  • pyproject.toml (1 hunks)
  • readme.md (1 hunks)
  • scripts/_benchmark_uilts.py (0 hunks)
  • scripts/run_benchmarks.py (0 hunks)
  • scripts/visualize_benchmarks.py (0 hunks)
💤 Files with no reviewable changes (8)
  • benchmarks/__init__.py
  • scripts/_benchmark_uilts.py
  • benchmarks/memory_spool_benchmarks.py
  • benchmarks/generic_io_benchmarks.py
  • benchmarks/patch_benchmarks.py
  • scripts/visualize_benchmarks.py
  • asv.conf.json
  • scripts/run_benchmarks.py
🧰 Additional context used
🧬 Code graph analysis (3)
benchmarks/test_io_benchmarks.py (2)
dascore/utils/downloader.py (2)
  • fetch (40-58)
  • get_registry_df (28-36)
dascore/io/core.py (1)
  • scan_to_df (654-702)
benchmarks/test_patch_benchmarks.py (3)
dascore/utils/patch.py (1)
  • get_start_stop_step (433-440)
dascore/examples.py (1)
  • get_example_patch (659-706)
dascore/proc/resample.py (1)
  • interpolate (96-153)
benchmarks/test_spool_benchmarks.py (1)
dascore/examples.py (1)
  • get_example_spool (710-756)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: Run benchmarks
🔇 Additional comments (7)
.gitignore (1)

86-86: Fix typo: use .codspeed (no “e”)

CodSpeed writes artifacts to .codspeed. Current .codespeed won’t ignore them.

Apply this diff:

-.codespeed
+.codspeed

Likely an incorrect or invalid review comment.

benchmarks/readme.md (1)

30-31: Keep @pytest.mark.benchmark — matches pytest-codspeed v4

pytest-codspeed v4 recommends the "benchmark" marker (use @pytest.mark.benchmark for whole-test) and also provides a benchmark fixture or an @benchmark decorator for line-level benchmarking.
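
For readers unfamiliar with the two styles, a hedged sketch (the workload function is made up for illustration):

import pytest


def workload():
    """Stand-in for the code being measured."""
    return sum(i * i for i in range(10_000))


@pytest.mark.benchmark
def test_whole_test_style():
    """Marker style: CodSpeed measures the entire test body."""
    workload()


def test_fixture_style(benchmark):
    """Fixture style: only the call passed to `benchmark` is measured."""
    benchmark(workload)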

readme.md (1)

6-6: CodSpeed badge endpoint incorrect — include org/repo

Replace the badge URL in readme.md (line 6) with the repo-specific endpoint https://img.shields.io/endpoint?url=https://codspeed.io/<org>/<repo>/badge.json, e.g. https://img.shields.io/endpoint?url=https://codspeed.io/DASDAE/dascore/badge.json

.codspeed/results_1758700287344.json (1)

1049-1249: ASV packages appear in the recorded environment; verify they’re removed from project deps.

The captured env lists asv and asv_runner, which conflicts with the PR’s goal to replace ASV. Ensure pyproject.toml no longer declares these (dev) dependencies and CI doesn’t install them.

benchmarks/test_io_benchmarks.py (1)

21-24: Network-bound data fetches – ensure CI determinism and cost control.

fetch() may download registry files. Confirm CI caches these (or pre-populates) and that non-CodSpeed test jobs don’t run these benches by default.

If not already done, exclude benchmarks from normal test runs:

  • Add to pytest config:
    • markers = ["benchmark: CodSpeed benchmarks"]
    • testpaths = ["tests"]
    • norecursedirs = ["benchmarks"]
  • Or run benches via a dedicated workflow only.
benchmarks/test_patch_benchmarks.py (1)

116-124: Good: matplotlib cleanup avoids resource leaks in CI.

benchmarks/test_spool_benchmarks.py (1)

48-53: LGTM: Negative case ensures no accidental merging on gaps.

Comment on lines 1 to 8
{
  "creator": {
    "name": "pytest-codspeed",
    "version": "4.0.0",
    "pid": 13200
  },
  "python": {
    "sysconfig": {

⚠️ Potential issue

Remove CodSpeed results artifact from source control (PII + churn).

This JSON captures local absolute paths (/home/derrick/…), env details, and machine-specific timings. Keeping it in git risks privacy leakage and noisy diffs. Prefer uploading as CI artifacts only and ignoring in VCS.

Apply the following to ignore CodSpeed result files:

+## Ignore CodSpeed local results
+.codspeed/
+benchmarks/.codspeed/

If you want an example file in docs, redact paths and reduce fields, and place it under docs/ instead.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In .codspeed/results_1758700287344.json lines 1-8: this file contains local
machine paths and env/timing data and should be removed from source control;
delete the tracked file from the repo (git rm --cached
.codspeed/results_1758700287344.json or remove the entire .codspeed/ directory
from tracking), add a .gitignore entry to ignore .codspeed/ or results_*.json so
future runs aren’t committed, commit the removal and ignore change, and adjust
CI to upload CodSpeed output as build artifacts instead of committing; if you
need a reference sample, add a redacted, minimal example under docs/ with
personal paths and sensitive fields removed.

@github-actions bot commented Sep 24, 2025

✅ Documentation built:
👉 Download
Note: You must be logged in to github and a DASDAE member to access the link.

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d5077c and 0e298ff.

📒 Files selected for processing (5)
  • .github/test_code.sh (1 hunks)
  • .github/workflows/codspeed.yml (1 hunks)
  • benchmarks/test_io_benchmarks.py (1 hunks)
  • benchmarks/test_patch_benchmarks.py (1 hunks)
  • benchmarks/test_spool_benchmarks.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • .github/workflows/codspeed.yml
  • benchmarks/test_patch_benchmarks.py
🧰 Additional context used
🧬 Code graph analysis (2)
benchmarks/test_io_benchmarks.py (2)
dascore/utils/downloader.py (2)
  • fetch (40-58)
  • get_registry_df (28-36)
dascore/io/core.py (1)
  • scan_to_df (654-702)
benchmarks/test_spool_benchmarks.py (1)
dascore/examples.py (1)
  • get_example_spool (710-756)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_build_docs
  • GitHub Check: Run benchmarks
🔇 Additional comments (2)
.github/test_code.sh (1)

8-10: Benchmarks path handling looks good.

Nice touch adding the explicit benchmarks --codspeed branch—this lines up neatly with the new CodSpeed workflow and keeps the test script single-entry-point friendly.

benchmarks/test_io_benchmarks.py (1)

13-24: Nice reuse of cached registry paths.

Leveraging @cache with the session-scoped fixture keeps the benchmark warm without re-fetching data on every test run. Looks great.
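
The pattern being praised, as a hedged sketch (helper names are illustrative; the real code lives in benchmarks/test_io_benchmarks.py):

from functools import cache

import pytest


@cache
def _fetch_registry_paths():
    """Resolve (or download) registry files once per process."""
    return {"example_a.h5": "/tmp/example_a.h5"}


@pytest.fixture(scope="session")
def registry_paths():
    """Session-scoped fixture so every benchmark reuses the cached result."""
    return _fetch_registry_paths()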

Comment on lines +41 to +107
    @pytest.mark.benchmark
    def test_contiguous_merge_performance(self, spool_no_gap):
        """Time merging contiguous patches from in-memory spool."""
        _chunk_and_check(spool_no_gap)

    @pytest.mark.benchmark
    def test_no_overlap_merge_performance(self, gapped_spool_no_overlap):
        """Timing for trying to chunk patches that have no overlap."""
        chunked = gapped_spool_no_overlap.chunk(time=None)
        # In this case the spool should not be merged.
        assert len(chunked) == len(gapped_spool_no_overlap)

    @pytest.mark.benchmark
    def test_diverse_merge_performance(self, diverse_spool):
        """Time trying to merge the diverse spool."""
        _chunk_and_check(diverse_spool, length=None)

    @pytest.mark.benchmark
    def test_1second_chunk_performance(self, spool_no_gap):
        """Time chunking for one second along no gap spool."""
        _chunk_and_check(spool_no_gap, time=1, length=None)

    @pytest.mark.benchmark
    def test_half_second_chunk_performance(self, spool_no_gap):
        """Time chunking for 0.5 along no gap spool."""
        _chunk_and_check(spool_no_gap, time=0.5, length=None)


class TestSelectionBenchmarks:
    """Benchmarks for spool selection operations."""

    @pytest.fixture
    def spool_no_gap_df(self, spool_no_gap):
        """Get contents dataframe for no gap spool."""
        return spool_no_gap.get_contents()

    @pytest.mark.benchmark
    def test_select_full_range_performance(self, spool_no_gap, spool_no_gap_df):
        """Timing selecting the full time range."""
        df, spool = spool_no_gap_df, spool_no_gap
        t1, t2 = df["time_min"].min(), df["time_max"].max()
        spool.select(time=(t1, t2))
        spool.select(time=(None, t2))
        spool.select(time=(t1, None))

    @pytest.mark.benchmark
    def test_select_half_range_performance(self, spool_no_gap, spool_no_gap_df):
        """Time selecting and trimming."""
        df, spool = spool_no_gap_df, spool_no_gap
        t1, t2 = df["time_min"].min(), df["time_max"].max()
        duration = (t2 - t1) / 2
        spool.select(time=(t1, t2 - duration))
        spool.select(time=(t1 + duration, t2))

    @pytest.mark.benchmark
    def test_select_strings_performance(self, diverse_spool):
        """Time select non-dimensional selects."""
        spool = diverse_spool
        spool.select(tag="some_tag")
        spool.select(station="wayout")

    @pytest.mark.benchmark
    def test_select_string_match_performance(self, diverse_spool):
        """Time select non-dimensional selects with wildcards."""
        spool = diverse_spool
        spool.select(tag="some_*")
        spool.select(station="wayou?")

⚠️ Potential issue | 🟠 Major

Wire the tests through the benchmark fixture

pytest-codspeed mirrors pytest-benchmark: a test marked with @pytest.mark.benchmark only becomes a benchmark when it drives the workload via the benchmark fixture. As written, each test body executes exactly once as an ordinary pytest test, so CodSpeed will report empty series instead of timing data. Please route every workload through the fixture (chunking and selection alike) so CodSpeed can sample and publish the measurements.

Here is an example adjustment for the chunking tests; similar changes are needed for the selection benchmarks:

@@
-    def test_contiguous_merge_performance(self, spool_no_gap):
-        _chunk_and_check(spool_no_gap)
+    def test_contiguous_merge_performance(self, spool_no_gap, benchmark):
+        benchmark(_chunk_and_check, spool_no_gap)
@@
-    def test_no_overlap_merge_performance(self, gapped_spool_no_overlap):
-        chunked = gapped_spool_no_overlap.chunk(time=None)
-        # In this case the spool should not be merged.
-        assert len(chunked) == len(gapped_spool_no_overlap)
+    def test_no_overlap_merge_performance(self, gapped_spool_no_overlap, benchmark):
+        def run():
+            chunked = gapped_spool_no_overlap.chunk(time=None)
+            assert len(chunked) == len(gapped_spool_no_overlap)
+        benchmark(run)
@@
-    def test_diverse_merge_performance(self, diverse_spool):
-        _chunk_and_check(diverse_spool, length=None)
+    def test_diverse_merge_performance(self, diverse_spool, benchmark):
+        benchmark(_chunk_and_check, diverse_spool, length=None)
@@
-    def test_1second_chunk_performance(self, spool_no_gap):
-        _chunk_and_check(spool_no_gap, time=1, length=None)
+    def test_1second_chunk_performance(self, spool_no_gap, benchmark):
+        benchmark(_chunk_and_check, spool_no_gap, time=1, length=None)
@@
-    def test_half_second_chunk_performance(self, spool_no_gap):
-        _chunk_and_check(spool_no_gap, time=0.5, length=None)
+    def test_half_second_chunk_performance(self, spool_no_gap, benchmark):
+        benchmark(_chunk_and_check, spool_no_gap, time=0.5, length=None)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (the proposed replacement for the quoted chunking tests; per the comment above, the selection benchmarks need the analogous change):

    @pytest.mark.benchmark
    def test_contiguous_merge_performance(self, spool_no_gap, benchmark):
        """Time merging contiguous patches from in-memory spool."""
        benchmark(_chunk_and_check, spool_no_gap)

    @pytest.mark.benchmark
    def test_no_overlap_merge_performance(self, gapped_spool_no_overlap, benchmark):
        """Timing for trying to chunk patches that have no overlap."""

        def run():
            chunked = gapped_spool_no_overlap.chunk(time=None)
            assert len(chunked) == len(gapped_spool_no_overlap)

        benchmark(run)

    @pytest.mark.benchmark
    def test_diverse_merge_performance(self, diverse_spool, benchmark):
        """Time trying to merge the diverse spool."""
        benchmark(_chunk_and_check, diverse_spool, length=None)

    @pytest.mark.benchmark
    def test_1second_chunk_performance(self, spool_no_gap, benchmark):
        """Time chunking for one second along no gap spool."""
        benchmark(_chunk_and_check, spool_no_gap, time=1, length=None)

    @pytest.mark.benchmark
    def test_half_second_chunk_performance(self, spool_no_gap, benchmark):
        """Time chunking for 0.5 along no gap spool."""
        benchmark(_chunk_and_check, spool_no_gap, time=0.5, length=None)

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e298ff and f06458f.

📒 Files selected for processing (1)
  • .github/workflows/codspeed.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.7)
.github/workflows/codspeed.yml

23-23: the runner of "actions/setup-python@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_build_docs
  • GitHub Check: Run benchmarks

@d-chambers removed the documentation and ready_for_review labels on Sep 25, 2025
@coderabbitai bot added the documentation and ready_for_review labels on Sep 25, 2025
@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
.github/test_code.sh (1)

4-10: Preserve passthrough flags after selecting a mode.

With the new branching, any extra pytest options past the first positional argument are silently dropped (./test_code.sh benchmarks -k scan no longer works). Let's keep forwarding the remaining arguments after we pick the mode.

-args="tests -s --cov dascore --cov-append --cov-report=xml"
-if [[ "$1" == "doctest" ]]; then
-  args="dascore --doctest-modules"
-fi
-if [[ "$1" == "benchmarks" ]]; then
-  args="benchmarks --codspeed"
-fi
+args="tests -s --cov dascore --cov-append --cov-report=xml"
+case "$1" in
+  doctest)
+    args="dascore --doctest-modules"
+    shift
+    ;;
+  benchmarks)
+    args="benchmarks --codspeed"
+    shift
+    ;;
+esac
@@
-python -m pytest $args || exit_code=$?
+python -m pytest $args "$@" || exit_code=$?
benchmarks/test_patch_benchmarks.py (1)

12-16: Consider widening the fixture scope for stability.

dc.get_example_patch() runs before every benchmark iteration right now, so CodSpeed’s timing includes patch construction noise. Bumping this to a module/session fixture (and copying inside the test if mutation ever becomes a concern) would make the measurements more stable.
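
A sketch of that suggestion, assuming the existing example_patch fixture name (the dft call mirrors one of the current benchmarks; copy the patch inside a test only if it mutates data):

import pytest

import dascore as dc


@pytest.fixture(scope="module")
def example_patch():
    """Build the example patch once per module so construction is not timed."""
    return dc.get_example_patch()


@pytest.mark.benchmark
def test_dft_performance(example_patch):
    """Only the transform itself contributes to the measurement."""
    example_patch.dft(dim="time")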

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f06458f and 36225b5.

📒 Files selected for processing (17)
  • .github/test_code.sh (1 hunks)
  • .github/workflows/codspeed.yml (1 hunks)
  • .gitignore (1 hunks)
  • asv.conf.json (0 hunks)
  • benchmarks/__init__.py (0 hunks)
  • benchmarks/generic_io_benchmarks.py (0 hunks)
  • benchmarks/memory_spool_benchmarks.py (0 hunks)
  • benchmarks/patch_benchmarks.py (0 hunks)
  • benchmarks/readme.md (1 hunks)
  • benchmarks/test_io_benchmarks.py (1 hunks)
  • benchmarks/test_patch_benchmarks.py (1 hunks)
  • benchmarks/test_spool_benchmarks.py (1 hunks)
  • pyproject.toml (1 hunks)
  • readme.md (1 hunks)
  • scripts/_benchmark_uilts.py (0 hunks)
  • scripts/run_benchmarks.py (0 hunks)
  • scripts/visualize_benchmarks.py (0 hunks)
💤 Files with no reviewable changes (8)
  • benchmarks/memory_spool_benchmarks.py
  • scripts/_benchmark_uilts.py
  • asv.conf.json
  • scripts/visualize_benchmarks.py
  • benchmarks/patch_benchmarks.py
  • scripts/run_benchmarks.py
  • benchmarks/generic_io_benchmarks.py
  • benchmarks/__init__.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • readme.md
  • benchmarks/test_spool_benchmarks.py
  • .gitignore
  • benchmarks/readme.md
  • pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (2)
benchmarks/test_io_benchmarks.py (2)
dascore/utils/downloader.py (2)
  • fetch (40-58)
  • get_registry_df (28-36)
dascore/io/core.py (1)
  • scan_to_df (654-702)
benchmarks/test_patch_benchmarks.py (2)
dascore/utils/patch.py (1)
  • get_start_stop_step (433-440)
dascore/examples.py (1)
  • get_example_patch (659-706)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: Run benchmarks
  • GitHub Check: lint_code
🔇 Additional comments (1)
benchmarks/test_io_benchmarks.py (1)

13-25: Nice job caching the registry lookups.

Memoizing the registry fetch keeps the benchmarks from redownloading data on every invocation—exactly what we need for CodSpeed runs.

Comment on lines 23 to 35
      - uses: actions/setup-python@v6
        with:
          python-version: ${{ env.python_version }}

      - name: Install dependencies
        run: pip install ".[test]"

      - name: Run benchmarks
        uses: CodSpeedHQ/action@v4
        with:
          mode: instrumentation
          run: pytest benchmarks/ --codspeed
          token: ${{ secrets.CODSPEED_TOKEN }}  # Optional for public repos

⚠️ Potential issue | 🔴 Critical

Use a published actions/setup-python release.

actions/setup-python@v6 hasn't been published, so the workflow will fail with “Failed to resolve action”. Switch to the latest shipping major (@v5) to keep the job runnable.

Apply this diff:

-      - uses: actions/setup-python@v6
+      - uses: actions/setup-python@v5
         with:
           python-version: ${{ env.python_version }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (replaces the block quoted above):

      - uses: actions/setup-python@v5
        with:
          python-version: ${{ env.python_version }}

      - name: Install dependencies
        run: pip install ".[test]"

      - name: Run benchmarks
        uses: CodSpeedHQ/action@v4
        with:
          mode: instrumentation
          run: pytest benchmarks/ --codspeed
          token: ${{ secrets.CODSPEED_TOKEN }}  # Optional for public repos
🤖 Prompt for AI Agents
.github/workflows/codspeed.yml around lines 23 to 35: the workflow references an
unpublished actions/setup-python@v6 causing "Failed to resolve action"; update
the action reference to a published release (for example
actions/setup-python@v5) so the runner can resolve it and the job runs; keep the
rest of the step unchanged.

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/test_io/test_common_io.py (1)

125-130: Preserve the original exception details in skip reasons.

Catching both cases here is the right move, but the current skip message hides which condition actually triggered. Including the exception detail makes it much easier to diagnose flaky benchmarks without digging into logs.

Apply this diff:

-    except (MissingOptionalDependencyError, TimeoutError):
-        pytest.skip("Missing optional dep to read file or unable to fetch.")
+    except MissingOptionalDependencyError as exc:
+        pytest.skip(
+            f"Missing optional dependency required to read file: {exc}"
+        )
+    except TimeoutError as exc:
+        pytest.skip(f"Unable to fetch data due to timeout: {exc}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d232b9 and e461ac0.

📒 Files selected for processing (1)
  • tests/test_io/test_common_io.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_io/test_common_io.py (1)
dascore/exceptions.py (1)
  • MissingOptionalDependencyError (98-99)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: Run benchmarks
  • GitHub Check: test_build_docs
🔇 Additional comments (1)
tests/test_io/test_common_io.py (1)

110-112: New timeout-aware skip helper usage looks good.

Thanks for wiring the broader skip helper through the cached read path; this keeps the tests resilient to transient fetch hiccups.

@codspeed-hq bot commented Sep 25, 2025

CodSpeed Performance Report

Congrats! CodSpeed is installed 🎉

🆕 49 new benchmarks were detected.

You will start to see performance impacts in the reports once the benchmarks are run from your default branch.

Detected benchmarks

  • test_get_format_performance (96.1 ms)
  • test_read_performance (15.7 s)
  • test_scan_df_performance (349.4 ms)
  • test_scan_performance (254 ms)
  • test_spool_performance (16.5 s)
  • test_first_performance (6 ms)
  • test_last_performance (6.2 ms)
  • test_max_performance (14.8 ms)
  • test_mean_performance (32.9 ms)
  • test_median_performance (141.8 ms)
  • test_min_performance (14.8 ms)
  • test_std_performance (48.1 ms)
  • test_sum_performance (29.3 ms)
  • test_decimate_performance (208.9 ms)
  • test_hampel_filter_performance (438.2 ms)
  • test_interpolate_performance (92.7 ms)
  • test_median_filter_performance (299.3 ms)
  • test_pass_filter_performance (251.5 ms)
  • test_roll_performance (13.4 ms)
  • test_select_performance (26.4 ms)
  • test_snap_coords_performance (13.2 ms)
  • test_sobel_filter_performance (29.1 ms)
  • test_standardize_performance (23.7 ms)
  • test_taper_performance (12.2 ms)
  • test_transpose_performance (4.1 ms)
  • test_wiener_filter_performance (431.1 ms)
  • test_rolling_large_roller_mean_performance (254.1 ms)
  • test_rolling_small_roller_mean_performance (62.7 ms)
  • test_definite_integrate_performance (33.8 ms)
  • test_dft_performance (94.8 ms)
  • test_differentiate_performance (33.7 ms)
  • test_envelope_performance (90.6 ms)
  • test_hilbert_performance (82.4 ms)
  • test_idft_performance (71.3 ms)
  • test_indefinite_integrate_performance (41 ms)
  • test_stft (62.3 ms)
  • test_repr_performance (5.1 ms)
  • test_str_performance (5.1 ms)
  • test_waterfall_performance (44.4 ms)
  • test_wiggle_performance (302.4 ms)
  • test_1second_chunk_performance (1.1 s)
  • test_contiguous_merge_performance (206.4 ms)
  • test_diverse_merge_performance (608.2 ms)
  • test_half_second_chunk_performance (2.1 s)
  • test_no_overlap_merge_performance (187 ms)
  • test_select_full_range_performance (57.7 ms)
  • test_select_half_range_performance (39.7 ms)
  • test_select_string_match_performance (20.4 ms)
  • test_select_strings_performance (20.4 ms)

@d-chambers (Contributor, Author) commented Sep 25, 2025

Finally, the CI seems to be working. Before merging this we need to:

  • Review the benchmark files and decide if we need to add more.
  • Double-check the pyproject.toml and ensure the different namespaces all make sense.
  • Update the docs about profiling, and add something about adding a benchmark when adding new features.

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (7)
pyproject.toml (1)

86-89: Pin benchmark/profiling plugin versions and include snakeviz (docs alignment).

To reduce CI drift and match the docs’ “view with snakeviz” guidance, pin plugin versions and include snakeviz in the profiling stack.

Apply this diff:

 profile_base = [
-    "pytest-codspeed",
-    "pytest-profiling",
+    "pytest-codspeed>=2,<3",
+    "pytest-profiling>=1,<2",
+    "snakeviz",
 ]

Optionally, narrow to tested patch versions once CI confirms stability. Based on learnings

docs/contributing/profiling_benchmarks.qmd (2)

10-11: Scope the run command to only benchmark tests.

Recommend filtering to the benchmark suite and marker to avoid running non-benchmark tests locally when using CodSpeed.

Suggested command line:

python -m pytest benchmarks -m benchmark --codspeed

24-26: Clarify STFT profiling example path and output.

Minor wording tweak to reflect the specific benchmark path and that multiple .prof files may be produced.

Apply this diff:

-pytest benchmarks/test_patch_benchmarks.py::TestProcessingBenchmarks::test_pass_filter_performance --profile
+pytest benchmarks/test_patch_benchmarks.py::TestProcessingBenchmarks::test_pass_filter_performance --profile
+# This creates a `prof/` folder with files like `test_pass_filter_performance.prof`.
benchmarks/test_patch_benchmarks.py (4)

20-26: Fix typo in variable name (readability).

Rename new_vales → new_values.

-    rand = np.random.RandomState(39)
-    new_vales = rand.random(len(time))
-    return patch.update_coords(time=new_vales)
+    rand = np.random.RandomState(39)
+    new_values = rand.random(len(time))
+    return patch.update_coords(time=new_values)

Optionally, prefer the modern RNG:

rng = np.random.default_rng(39)
new_values = rng.random(len(time))

136-140: Standardize transform call style for consistency.

Elsewhere you call transforms directly on the patch (e.g., example_patch.dft(dim="time")). Use the same here for clarity.

-        return example_patch.tran.dft("time")
+        return example_patch.dft(dim="time")

189-196: Remove redundant import; reuse module-level plt import.

Avoid re-importing matplotlib inside the benchmark to keep timing focused on the operation.

-        # Clean up matplotlib figures
-        import matplotlib.pyplot as plt
-
-        plt.close("all")
+        # Clean up matplotlib figures
+        plt.close("all")

208-214: Close figures without re-import (consistency).

Same clean-up suggestion for wiggle benchmark.

-        # Clean up matplotlib figures
-        plt.close("all")
+        # Clean up matplotlib figures
+        plt.close("all")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd6b955 and 792e24e.

📒 Files selected for processing (6)
  • .gitignore (2 hunks)
  • benchmarks/test_patch_benchmarks.py (1 hunks)
  • docs/contributing/profiling_benchmarks.qmd (1 hunks)
  • docs/contributors.qmd (0 hunks)
  • pyproject.toml (1 hunks)
  • readme.md (2 hunks)
💤 Files with no reviewable changes (1)
  • docs/contributors.qmd
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (1)
  • readme.md
🧰 Additional context used
🧬 Code graph analysis (1)
benchmarks/test_patch_benchmarks.py (7)
dascore/utils/patch.py (1)
  • get_start_stop_step (433-440)
dascore/examples.py (1)
  • get_example_patch (659-706)
dascore/proc/coords.py (2)
  • update_coords (223-248)
  • snap_coords (20-53)
dascore/proc/resample.py (1)
  • interpolate (96-153)
dascore/viz/waterfall.py (1)
  • waterfall (41-116)
dascore/viz/wiggle.py (1)
  • wiggle (54-119)
dascore/proc/rolling.py (1)
  • rolling (213-363)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_build_docs
  • GitHub Check: Run benchmarks
🔇 Additional comments (1)
pyproject.toml (1)

93-94: Verify self-referential extras composition works across installers.

Using self-referential extras (e.g., profile = ["dascore[test]", "dascore[profile_base]"]) is common but can be fragile with older tooling. Please verify this installs cleanly with the pip version used in CI and for contributors.

If needed, an alternative is to expand extras explicitly:

# Alternative if self-references cause issues:
profile = [
  "coverage>=7.4,<8",
  "pytest-cov>=4",
  "pre-commit",
  "pytest",
  "pytest-codeblocks",
  "twine",
  "pytest-codspeed>=2,<3",
  "pytest-profiling>=1,<2",
  "snakeviz",
]

Also consider registering the benchmark marker to avoid pytest “unknown marker” warnings for local dev:

[tool.pytest.ini_options]
markers = [
  "benchmark: marks tests as performance benchmarks to be run with --codspeed"
]

Based on learnings

@d-chambers removed the documentation and ready_for_review labels on Sep 26, 2025
@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/test_io/test_common_io.py (1)

124-135: Undefined skip_missing_or_timeout breaks every test entry point

All the with statements above now reference skip_missing_or_timeout(), but the only context manager defined here is still skip_missing_dependency(). At runtime this will raise a NameError, and Ruff is already flagging it. Please rename the function (and keep a backward-compatible alias) so the new symbol actually exists.

-@contextmanager
-def skip_missing_dependency():
-    """Skip if missing dependencies found."""
+@contextmanager
+def skip_missing_or_timeout():
+    """Skip if missing dependencies or timeouts are encountered."""
     try:
         yield
     except MissingOptionalDependencyError as exc:
         pytest.skip(
             f"Missing optional dependency required to read file: {exc}"
         )
     except TimeoutError as exc:
         pytest.skip(f"Unable to fetch data due to timeout: {exc}")
+
+
+# Backward compatibility
+skip_missing_dependency = skip_missing_or_timeout
🧹 Nitpick comments (6)
benchmarks/test_patch_benchmarks.py (6)

174-178: Rename test to match naming convention and fix docstring duplication.

Use the _performance suffix like the rest and remove the duplicated “transform”.

-    def test_stft(self, example_patch):
-        """Time short time fourier transform transform."""
+    def test_stft_performance(self, example_patch):
+        """Time short-time Fourier transform (STFT)."""

29-34: Make matplotlib cleanup autouse to avoid Ruff ARG002 and ensure cleanup always runs.

This removes the need to thread the fixture into tests while guaranteeing figures are closed.

-@pytest.fixture()
-def cleanup_mpl():
+@pytest.fixture(scope="function", autouse=True)
+def cleanup_mpl():
     """Close all open matplotlib figures after test."""
     yield
     plt.close("all")

196-196: Drop unused cleanup_mpl parameter (handled by autouse).

Silences Ruff ARG002 and simplifies the signature.

-    def test_waterfall_performance(self, example_patch, cleanup_mpl):
+    def test_waterfall_performance(self, example_patch):

211-211: Drop unused cleanup_mpl parameter (handled by autouse).

Same as above.

-    def test_wiggle_performance(self, example_patch, cleanup_mpl):
+    def test_wiggle_performance(self, example_patch):

20-26: Use modern RNG API and fix variable name typo.

np.random.RandomState is legacy; prefer default_rng. Also fix new_vales → new_values.

-    patch = dc.get_example_patch()
-    time = patch.get_coord("time")
-    rand = np.random.RandomState(39)
-    new_vales = rand.random(len(time))
-    return patch.update_coords(time=new_vales)
+    patch = dc.get_example_patch()
+    time = patch.get_coord("time")
+    rng = np.random.default_rng(39)
+    new_values = rng.random(len(time))
+    return patch.update_coords(time=new_values)

60-66: Confirm dtype handling for time interpolation step.

If start/stop are datetime64 and step is timedelta64, step / 2 may yield a float and break np.arange for datetimes. Consider integer division (step // 2) for timedeltas or converting to float seconds consistently before constructing the target axis.

Option A (timedelta64):

-        patch.interpolate(time=np.arange(start, stop, step / 2))
+        patch.interpolate(time=np.arange(start, stop, step // 2))

Option B (floats): convert to float seconds first, then rebuild as datetimes (only if consistent with interpolate API).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 792e24e and 7791360.

📒 Files selected for processing (3)
  • benchmarks/test_patch_benchmarks.py (1 hunks)
  • docs/contributing/profiling_benchmarks.qmd (1 hunks)
  • tests/test_io/test_common_io.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/test_io/test_common_io.py (1)
dascore/exceptions.py (1)
  • MissingOptionalDependencyError (98-99)
benchmarks/test_patch_benchmarks.py (3)
dascore/utils/patch.py (1)
  • get_start_stop_step (433-440)
dascore/examples.py (1)
  • get_example_patch (659-706)
dascore/proc/coords.py (2)
  • update_coords (223-248)
  • snap_coords (20-53)
🪛 Ruff (0.13.1)
tests/test_io/test_common_io.py

110-110: Undefined name skip_missing_or_timeout

(F821)


166-166: Undefined name skip_missing_or_timeout

(F821)


174-174: Undefined name skip_missing_or_timeout

(F821)


300-300: Undefined name skip_missing_or_timeout

(F821)


348-348: Undefined name skip_missing_or_timeout

(F821)


359-359: Undefined name skip_missing_or_timeout

(F821)


429-429: Undefined name skip_missing_or_timeout

(F821)

benchmarks/test_patch_benchmarks.py

196-196: Unused method argument: cleanup_mpl

(ARG002)


211-211: Unused method argument: cleanup_mpl

(ARG002)

🪛 GitHub Actions: LintCode
tests/test_io/test_common_io.py

[error] 110-110: F821 Undefined name skip_missing_or_timeout. (ruff)


[error] 166-166: F821 Undefined name skip_missing_or_timeout. (ruff)


[error] 174-174: F821 Undefined name skip_missing_or_timeout. (ruff)


[error] 300-300: F821 Undefined name skip_missing_or_timeout. (ruff)


[error] 348-348: F821 Undefined name skip_missing_or_timeout. (ruff)


[error] 359-359: F821 Undefined name skip_missing_or_timeout. (ruff)


[error] 429-429: F821 Undefined name skip_missing_or_timeout. (ruff)


[error] 1-1: ruff-format: 1 file reformatted; please run formatter again. Command: ruff-format

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: Run benchmarks
  • GitHub Check: test_build_docs
🔇 Additional comments (1)
docs/contributing/profiling_benchmarks.qmd (1)

6-30: Clear transition to CodSpeed workflows

Nice job updating the benchmarks/profiling guidance—commands and narrative now align with the pytest+CodSpeed tooling and the profiling section reads smoothly.

@coderabbitai coderabbitai bot added the documentation (Improvements or additions to documentation) and ready_for_review (PR is ready for review) labels Sep 26, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/test_io/test_common_io.py (2)

283-289: Bug: incorrect path used when reading into BytesIO.

Path() is given the tuple fixture instead of the path string, causing a TypeError.

Apply this diff:

         bio = BytesIO()
-        bio.write(Path(io_path_tuple).read_bytes())
+        bio.write(Path(path).read_bytes())
         bio.seek(0)

435-443: Fix vacuous assertions; compare patch attrs to scan attrs.

These assertions currently compare pat_attrs1 to itself, masking mismatches.

Apply this diff:

             for dim in pat_attrs1.dim_tuple:
-                assert getattr(pat_attrs1, f"{dim}_min") == getattr(
-                    pat_attrs1, f"{dim}_min"
-                )
+                assert getattr(pat_attrs1, f"{dim}_min") == getattr(
+                    scan_attrs2, f"{dim}_min"
+                )
                 for dim_attr in DIM_RELATED_ATTRS:
                     attr_name = dim_attr.format(dim=dim)
                     attr1 = getattr(pat_attrs1, attr_name)
-                    attr2 = getattr(pat_attrs1, attr_name)
+                    attr2 = getattr(scan_attrs2, attr_name)
                     assert attr1 == attr2
🧹 Nitpick comments (4)
tests/test_io/test_common_io.py (4)

100-109: Clarify docstring and generalize timeout skip message.

This context manager also skips on timeouts and is used around read/scan, not only fetch. Update wording to avoid confusion and make the timeout message generic.

Apply this diff:

 def skip_missing_or_timeout():
-    """Skip if missing dependencies found."""
+    """Skip test if an optional dependency is missing or the operation times out."""
@@
-    except TimeoutError as exc:
-        pytest.skip(f"Unable to fetch data due to timeout: {exc}")
+    except TimeoutError as exc:
+        pytest.skip(f"Operation timed out: {exc}")

147-149: Also wrap fetch() to actually catch timeouts.

Currently timeouts during fetch in this fixture won’t be skipped. Wrap the fetch call to align with the new helper’s intent.

Apply this diff:

     io, fetch_name = request.param
-    return io, fetch(fetch_name)
+    with skip_missing_or_timeout():
+        path = fetch(fetch_name)
+    return io, path

156-159: Wrap fetch() in data_file_path to skip on timeouts.

Without this, network timeouts during data acquisition will fail the test session.

Apply this diff:

     if str(param) in SKIP_DATA_FILES:
         pytest.skip(f"Skipping {param}")
-    return fetch(request.param)
+    with skip_missing_or_timeout():
+        return fetch(request.param)

243-249: Guard fetch() in loop with skip helper.

A single fetch timeout here will fail the test; using the helper will skip cleanly.

Apply this diff:

         for key in data_files:
-            path = fetch(key)
+            with skip_missing_or_timeout():
+                path = fetch(key)
             out = io_instance.get_format(path)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7791360 and 4bb2b29.

📒 Files selected for processing (1)
  • tests/test_io/test_common_io.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_io/test_common_io.py (1)
dascore/exceptions.py (1)
  • MissingOptionalDependencyError (98-99)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: Run benchmarks
🔇 Additional comments (8)
tests/test_io/test_common_io.py (8)

121-123: LGTM: cached read now skips on missing deps/timeouts.

Wrapping read calls here is appropriate and consistent with the new helper.


164-166: LGTM: read_spool guarded by skip helper.

Good consistency with the new context manager.


172-174: LGTM: scanned_attrs guarded by skip helper.

Consistent with read_spool.


298-300: LGTM: scan guarded by skip helper in slicing test.

Matches the pattern used elsewhere.


346-348: LGTM: basic scan wrapped with skip helper.

Good defensive handling for optional deps/timeouts.


357-359: LGTM: io.scan wrapped with skip helper.

Consistent usage across tests.


427-429: LGTM: integration scan wrapped with skip helper.

Keeps the integration test resilient to env variability.


100-109: No remaining uses of skip_missing_dependency
Confirmed all references were replaced via ripgrep search.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/test_io/test_common_io.py (2)

309-351: Wrap direct io.read calls with skip_missing to avoid hard failures on optional deps.

This test calls io.read three times without the skip; formats with optional deps will error instead of skipping.

Apply this diff:

         # first test double ended query
         trim_tuple = (start + duration / 10, start + 2 * duration / 10)
-        spool = io.read(path, **{dim: trim_tuple})
+        with skip_missing():
+            spool = io.read(path, **{dim: trim_tuple})
         assert len(spool) == 1
         patch = spool[0]
         _assert_coords_attrs_match(patch)
         coord = patch.get_coord(dim)
         _assert_op_or_close(coord.min(), trim_tuple[0], ge)
         _assert_op_or_close(coord.max(), trim_tuple[1], le)
         # then single-ended query on start side
         trim_tuple = (start + duration / 10, ...)
-        spool = io.read(path, **{dim: trim_tuple})
+        with skip_missing():
+            spool = io.read(path, **{dim: trim_tuple})
         assert len(spool) == 1
         patch = spool[0]
         attrs = patch.attrs
         _assert_coords_attrs_match(patch)
         coord = patch.get_coord(dim)
         _assert_op_or_close(coord.min(), trim_tuple[0], ge)
         _assert_op_or_close(coord.max(), getattr(attrs, f"{dim}_max"), eq)
         # then single-ended query on end side
         trim_tuple = (None, start + duration / 10)
-        spool = io.read(path, **{dim: trim_tuple})
+        with skip_missing():
+            spool = io.read(path, **{dim: trim_tuple})

439-460: Fix self-comparison bug: scan/patch attribute checks compare pat_attrs1 to itself.

Current assertions are tautologies; compare patch vs scan attributes.

Apply this diff:

         for pat_attrs1, scan_attrs2 in zip(patch_attrs_list, scan_attrs_list):
             assert pat_attrs1.dims == scan_attrs2.dims
             # first compare dimensions are related attributes
             for dim in pat_attrs1.dim_tuple:
-                assert getattr(pat_attrs1, f"{dim}_min") == getattr(
-                    pat_attrs1, f"{dim}_min"
-                )
+                assert getattr(pat_attrs1, f"{dim}_min") == getattr(
+                    scan_attrs2, f"{dim}_min"
+                )
                 for dim_attr in DIM_RELATED_ATTRS:
                     attr_name = dim_attr.format(dim=dim)
-                    attr1 = getattr(pat_attrs1, attr_name)
-                    attr2 = getattr(pat_attrs1, attr_name)
+                    attr1 = getattr(pat_attrs1, attr_name)
+                    attr2 = getattr(scan_attrs2, attr_name)
                     assert attr1 == attr2
             # then other expected attributes.
             for attr_name in comp_attrs:
                 patch_attr = getattr(pat_attrs1, attr_name)
                 scan_attr = getattr(scan_attrs2, attr_name)
                 assert scan_attr == patch_attr
🧹 Nitpick comments (4)
benchmarks/test_patch_benchmarks.py (3)

29-33: Consider removing unused cleanup fixture or documenting its purpose.

The static analysis correctly identifies that cleanup_mpl is unused in the methods that accept it as a parameter. The fixture itself works correctly, but the parameter appears unused.

If the fixture is working as intended through pytest's automatic invocation, consider adding a comment to clarify this:

 @pytest.fixture()
 def cleanup_mpl():
-    """Close all open matplotlib figures after test."""
+    """Close all open matplotlib figures after test.
+    
+    Note: This fixture is automatically invoked by pytest for tests
+    that include it as a parameter, even if the parameter isn't used.
+    """
     yield
     plt.close("all")

Alternatively, if the parameter isn't needed, remove it from the affected test methods.
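
If the autouse route is taken instead, a minimal sketch of what that could look like; the test body mirrors the waterfall benchmark from this PR, while the autouse fixture itself is only an illustration of the alternative, not necessarily what the module ends up with:

import matplotlib.pyplot as plt
import pytest

@pytest.fixture(autouse=True)
def cleanup_mpl():
    """Close all open matplotlib figures after every test in this module."""
    yield
    plt.close("all")

class TestVizBenchmarks:
    """Visualization benchmarks; no test needs to request cleanup_mpl explicitly."""

    def test_waterfall_performance(self, example_patch):
        """Timing for waterfall patch; figures are closed by the autouse fixture."""
        example_patch.viz.waterfall()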


177-181: Update test method name to match operation.

The method name should reflect that it's testing STFT, not a generic transform.

-    def test_stft(self, example_patch):
+    def test_stft_performance(self, example_patch):

199-202: Address unused cleanup_mpl parameter.

Static analysis correctly identifies that the cleanup_mpl parameter is unused in these methods, even though the fixture itself works.

Based on learnings about pytest fixtures, the fixture is likely working through pytest's automatic invocation. Either add a comment to clarify this or consider if the parameter is truly needed:

-    def test_waterfall_performance(self, example_patch, cleanup_mpl):
+    def test_waterfall_performance(self, example_patch, cleanup_mpl):
         """Timing for waterfall patch."""
+        # cleanup_mpl fixture automatically closes matplotlib figures
         example_patch.viz.waterfall()

Also applies to: 214-218

tests/test_io/test_common_io.py (1)

100-109: Don’t catch TimeoutError in skip_missing (let skip_timeout handle it).

Catching TimeoutError here duplicates skip_timeout and emits a misleading “fetch” message during read/scan failures.

Apply this diff:

 @contextmanager
 def skip_missing():
     """Skip if missing dependencies found."""
     try:
         yield
     except MissingOptionalDependencyError as exc:
         pytest.skip(f"Missing optional dependency required to read file: {exc}")
-    except TimeoutError as exc:
-        pytest.skip(f"Unable to fetch data due to timeout: {exc}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4bb2b29 and b410675.

📒 Files selected for processing (2)
  • benchmarks/test_patch_benchmarks.py (1 hunks)
  • tests/test_io/test_common_io.py (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
benchmarks/test_patch_benchmarks.py (4)
dascore/utils/patch.py (1)
  • get_start_stop_step (433-440)
dascore/examples.py (1)
  • get_example_patch (659-706)
dascore/proc/coords.py (2)
  • update_coords (223-248)
  • snap_coords (20-53)
dascore/utils/time.py (2)
  • to_timedelta64 (141-173)
  • to_float (327-333)
tests/test_io/test_common_io.py (2)
dascore/exceptions.py (1)
  • MissingOptionalDependencyError (98-99)
dascore/utils/downloader.py (1)
  • fetch (40-58)
🪛 Ruff (0.13.1)
benchmarks/test_patch_benchmarks.py

200-200: Unused method argument: cleanup_mpl

(ARG002)


215-215: Unused method argument: cleanup_mpl

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_build_docs
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: Run benchmarks
🔇 Additional comments (28)
benchmarks/test_patch_benchmarks.py (19)

1-11: LGTM! Clean imports and module setup.

The module follows good practices with future annotations, proper imports, and clear docstring indicating pytest-codspeed usage.


13-16: LGTM! Well-designed fixture for shared patch data.

Module-scoped fixture is appropriate for read-only benchmark data that can be reused across tests.
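
As a rough illustration of the pattern being praised here, a module-scoped, read-only fixture shared by several benchmarks might look roughly like this (the str() benchmark is just a stand-in for any read-only operation):

import dascore as dc
import pytest

@pytest.fixture(scope="module")
def example_patch():
    """Create the example patch once per module; benchmarks only read from it."""
    return dc.get_example_patch()

def test_str(example_patch):
    """Time the patch's string rendering; setup cost lives in the fixture."""
    str(example_patch)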


39-47: LGTM! Efficient interpolation setup.

Good use of a fixture to pre-compute interpolation arrays, preventing timing contamination in the benchmark.


49-56: LGTM! Comprehensive pass filter benchmarking.

Tests various filter configurations including edge cases with None boundaries.


58-63: LGTM! Covers key median filter scenarios.

Tests both single and multi-dimensional filtering with samples parameter.


65-69: LGTM! Efficient interpolation benchmarking.

Proper use of the pre-computed interp_time fixture prevents setup cost from affecting benchmark timing.


71-78: LGTM! Comprehensive decimation testing.

Tests different decimation factors and filter types including the no-filter case.


80-89: LGTM! Thorough selection benchmarking.

Tests various selection patterns including open-ended ranges and computed time boundaries.


91-95: LGTM! Clean single-operation benchmarks.

These benchmarks efficiently test individual operations: Sobel filter, standardization, tapering, transpose, and roll operations.

Also applies to: 97-101, 103-107, 109-114, 116-120


122-126: LGTM! Good use of specialized fixture.

Appropriately uses the patch_uneven_time fixture to test coordinate snapping on non-uniform data.


128-135: LGTM! Comprehensive filter benchmarks.

Tests Hampel and Wiener filters with various parameter configurations including multi-dimensional and separable options.

Also applies to: 137-141


147-150: LGTM! Efficient DFT fixture setup.

Pre-computes DFT to avoid setup overhead in inverse DFT benchmarking.
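
Sketched out, that setup/measurement split looks roughly like the following; treat it as illustrative rather than the exact code from the PR, since the real fixture scope and dft/idft call signatures may differ slightly:

import pytest

@pytest.fixture()
def dft_patch(example_patch):
    """Pre-compute the forward DFT so the idft benchmark times only the inverse."""
    return example_patch.dft("time")

def test_idft(dft_patch):
    """Time only the inverse transform; the forward pass happens in the fixture."""
    dft_patch.idft()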


152-165: LGTM! Complete integration and transform benchmarks.

Covers both definite/indefinite integration, differentiation, and forward/inverse DFT operations efficiently.

Also applies to: 167-175


183-193: LGTM! Clean transform benchmarks.

Both Hilbert transform and envelope calculation benchmarks are well-structured.


204-212: LGTM! String representation benchmarks.

Both str() and repr() benchmarks are appropriate for testing text rendering performance.


217-217: LGTM! Smart performance optimization.

Good practice to subset the data for visualization benchmarks to focus on rendering performance rather than data processing.


224-276: LGTM! Comprehensive aggregation benchmarks.

All aggregation operations (mean, max, min, std, sum, median, first, last) are properly benchmarked across both time and distance dimensions where applicable.


282-293: LGTM! Well-designed rolling window fixtures.

Both small and large roller fixtures are efficiently implemented with appropriate scoping and different sizing strategies.


295-303: LGTM! Complete rolling operation benchmarks.

Tests both small and large rolling windows to capture different performance characteristics.

tests/test_io/test_common_io.py (9)

130-132: LGTM: cache read now skips cleanly on missing deps.


157-159: LGTM: wrapping fetch in skip_timeout is appropriate.


168-170: LGTM: consistent timeout skip for data_file_path fetch.


175-177: LGTM: skip_missing around dc.read in fixture.


183-185: LGTM: skip_missing around dc.scan in fixture.


258-260: LGTM: timeout-safe fetch in format exclusion loop.


358-361: LGTM: scan wrapped with skip_missing.


369-374: LGTM: io.scan wrapped with skip_missing.


100-118: skip_missing_dependency removal confirmed

No occurrences of skip_missing_dependency() found across the codebase.

Comment on lines +19 to +26
@pytest.fixture(scope="module")
def patch_uneven_time():
"""Get a patch with uneven time coord."""
patch = dc.get_example_patch()
time = patch.get_coord("time")
rand = np.random.RandomState(39)
new_vales = rand.random(len(time))
return patch.update_coords(time=new_vales)

⚠️ Potential issue | 🟡 Minor

Fix typo in variable name.

There's a typo in the variable name on line 25.

-    new_vales = rand.random(len(time))
+    new_values = rand.random(len(time))
-    return patch.update_coords(time=new_vales)
+    return patch.update_coords(time=new_values)
📝 Committable suggestion


Suggested change
@pytest.fixture(scope="module")
def patch_uneven_time():
"""Get a patch with uneven time coord."""
patch = dc.get_example_patch()
time = patch.get_coord("time")
rand = np.random.RandomState(39)
new_vales = rand.random(len(time))
return patch.update_coords(time=new_vales)
@pytest.fixture(scope="module")
def patch_uneven_time():
"""Get a patch with uneven time coord."""
patch = dc.get_example_patch()
time = patch.get_coord("time")
rand = np.random.RandomState(39)
new_values = rand.random(len(time))
return patch.update_coords(time=new_values)
🤖 Prompt for AI Agents
In benchmarks/test_patch_benchmarks.py around lines 19 to 26, there's a typo:
the local variable is named "new_vales" on line 25. Rename it to a correctly
spelled identifier like "new_values" (and update any use of the variable on the
return line to match), ensuring you preserve the same length/shape logic
(rand.random(len(time))) and return patch.update_coords(time=new_values).

Comment on lines +178 to +181
def test_stft(self, example_patch):
"""Time short time fourier transform transform."""
patch = example_patch
patch.stft(time=1, overlap=0.25)

⚠️ Potential issue | 🟡 Minor

Fix inconsistent docstring.

The docstring mentions "Time short time fourier transform transform" which is redundant and unclear.

-        """Time short time fourier transform transform."""
+        """Time STFT (short-time Fourier transform)."""
📝 Committable suggestion


Suggested change
def test_stft(self, example_patch):
"""Time short time fourier transform transform."""
patch = example_patch
patch.stft(time=1, overlap=0.25)
def test_stft(self, example_patch):
"""Time STFT (short-time Fourier transform)."""
patch = example_patch
patch.stft(time=1, overlap=0.25)
🤖 Prompt for AI Agents
In benchmarks/test_patch_benchmarks.py around lines 178 to 181, the test_stft
docstring is inconsistent and redundant ("Time short time fourier transform
transform"); replace it with a concise, correct description such as "Short-time
Fourier transform" or "Time short-time Fourier transform" to remove duplication
and clarify intent.

Comment on lines 111 to 118
@contextmanager
def skip_timeout():
"""Skip if downloading file times out."""
try:
yield
except TimeoutError as exc:
pytest.skip(f"Unable to fetch data due to timeout: {exc}")


🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Confirm which timeout exception types fetch can raise; built-in TimeoutError may be insufficient.

If the fetcher uses requests/urllib/pooch, timeouts may surface as requests.exceptions.Timeout, urllib.error.URLError, or socket.timeout and won’t be caught.

Run to inspect fetcher implementation and exceptions:

If needed, extend skip_timeout to catch the actual timeout exceptions raised by the fetcher.


🏁 Script executed:

#!/bin/bash
rg -n -C3 '\bfetch\(' dascore | sed -n '1,200p'
rg -n -C3 '\bfetcher\b' dascore
rg -n -C3 'TimeoutError|socket\.timeout|requests\.exceptions\.Timeout|urllib\.(error|request)' dascore

Length of output: 7648


🏁 Script executed:

#!/bin/bash
sed -n '90,140p' tests/test_io/test_common_io.py

Length of output: 1373


Extend skip_timeout to catch actual fetcher timeouts
The current skip_timeout only catches the built-in TimeoutError, but pooch.fetch may raise socket.timeout, urllib.error.URLError (which can wrap a socket.timeout), or requests.exceptions.Timeout. Update the except clause to catch these types so tests skip on real fetch timeouts.

🤖 Prompt for AI Agents
In tests/test_io/test_common_io.py around lines 111 to 118, skip_timeout
currently only catches built-in TimeoutError but must also catch actual fetcher
timeout exceptions (socket.timeout, urllib.error.URLError which can wrap
socket.timeouts, and requests.exceptions.Timeout); update the except clause to
catch a tuple of these exception types (e.g., except (TimeoutError,
socket.timeout, urllib.error.URLError, requests.exceptions.Timeout) as exc), add
the required imports at the top of the file (import socket; from urllib import
error as urllib_error or import urllib.error; import requests.exceptions), and
keep the pytest.skip message using the captured exc.
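
Put concretely, the broadened helper described above could look roughly like this (the optional requests import is an assumption about the test environment, and socket.timeout is an alias of TimeoutError on the Python versions in CI, so it is listed mainly for clarity):

import socket
from contextlib import contextmanager
from urllib import error as urllib_error

import pytest

try:  # requests may not be installed in every test environment
    from requests.exceptions import Timeout as RequestsTimeout
except ImportError:
    class RequestsTimeout(Exception):
        """Placeholder so the except tuple below stays valid without requests."""

@contextmanager
def skip_timeout():
    """Skip the test if fetching a data file times out, whichever layer raises it."""
    try:
        yield
    except (TimeoutError, socket.timeout, urllib_error.URLError, RequestsTimeout) as exc:
        pytest.skip(f"Unable to fetch data due to timeout: {exc}")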

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/test_io/test_common_io.py (2)

292-301: Fix: stream-read test always skipped and reads wrong variable.

  • isinstance is applied to the BinaryReader class itself rather than an instance, so the `not isinstance(...)` guard is always True and the test is always skipped.
  • BytesIO filled from Path(io_path_tuple) instead of Path(path).

Apply this diff:

-        req_type = getattr(io.read, "_required_type", None)
-        if not isinstance(req_type, BinaryReader):
-            msg = f"{io} doesn't require a read type."
-            pytest.skip(msg)
+        req_type = getattr(io.read, "_required_type", None)
+        if req_type is not BinaryReader:
+            pytest.skip(f"{io} doesn't support BinaryReader streams.")
@@
-        bio = BytesIO()
-        bio.write(Path(io_path_tuple).read_bytes())
+        bio = BytesIO()
+        bio.write(Path(path).read_bytes())
         bio.seek(0)

448-455: Fix self-comparison in scan vs patch attrs test.

Currently compares pat_attrs1 to itself, not scan_attrs2; test can’t detect mismatches.

Apply this diff:

-                assert getattr(pat_attrs1, f"{dim}_min") == getattr(
-                    pat_attrs1, f"{dim}_min"
-                )
+                assert getattr(pat_attrs1, f"{dim}_min") == getattr(
+                    scan_attrs2, f"{dim}_min"
+                )
                 for dim_attr in DIM_RELATED_ATTRS:
                     attr_name = dim_attr.format(dim=dim)
                     attr1 = getattr(pat_attrs1, attr_name)
-                    attr2 = getattr(pat_attrs1, attr_name)
+                    attr2 = getattr(scan_attrs2, attr_name)
                     assert attr1 == attr2
🧹 Nitpick comments (1)
tests/test_io/test_common_io.py (1)

70-87: Deduplicate entries in COMMON_IO_READ_TESTS (optional).

DASDAEV1 and Terra15FormatterV6 appear twice, creating redundant parametrizations and longer CI.

Apply this diff to remove duplicates:

@@
-    DASDAEV1(): ("example_dasdae_event_1.h5",),
@@
-    Terra15FormatterV6(): ("terra15_v6_test_file.hdf5",),

Please confirm duplicates weren’t intentional to cover variant files.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b410675 and e4e3457.

📒 Files selected for processing (1)
  • tests/test_io/test_common_io.py (10 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_io/test_common_io.py (2)
dascore/exceptions.py (1)
  • MissingOptionalDependencyError (98-99)
dascore/utils/downloader.py (1)
  • fetch (40-58)
🔇 Additional comments (13)
tests/test_io/test_common_io.py (13)

18-18: LGTM on urllib error import.

Used in skip_timeout; good addition.


112-119: Broaden timeout handling to avoid flaky network-dependent tests.

pooch/urllib/requests timeouts can surface as socket.timeout or requests.exceptions.Timeout and won’t be caught. Use a consolidated TIMEOUT_EXCS tuple and catch that.

Apply this diff to skip_timeout:

 @contextmanager
 def skip_timeout():
     """Skip if downloading file times out."""
     try:
         yield
-    except (TimeoutError, urllib_error.URLError) as exc:
+    except TIMEOUT_EXCS as exc:
         pytest.skip(f"Unable to fetch data due to timeout: {exc}")

Add the following near the imports (top of file, after Line 18) to define TIMEOUT_EXCS:

+import socket
+try:
+    from requests import exceptions as requests_exceptions  # type: ignore
+except Exception:
+    requests_exceptions = None
+if requests_exceptions is not None:
+    TIMEOUT_EXCS = (TimeoutError, urllib_error.URLError, socket.timeout, requests_exceptions.Timeout)
+else:
+    TIMEOUT_EXCS = (TimeoutError, urllib_error.URLError, socket.timeout)

101-110: LGTM: skip_missing() behavior.

Catches MissingOptionalDependencyError and skips cleanly. Timeout catch here is acceptable given usage.


131-133: LGTM: Use skip_missing around cached read.

Appropriate to avoid hard failures when optional deps are absent.


158-160: LGTM: Guard fetch with skip_timeout.

Prevents spurious failures on registry timeouts.


169-171: LGTM: Fetch guarded by skip_timeout.

Good for session-scoped data setup.


176-177: LGTM: Read spool guarded by skip_missing.

Matches intended optional-deps behavior.


184-186: LGTM: Scan guarded by skip_missing.

Consistent with read_spool fixture.


259-261: LGTM: Timeout-safe fetch for cross-format negative test.

Avoids cascading failures when registry is slow.


311-312: LGTM: Use skip_missing around scan.

Prevents hard failures on missing deps.


359-361: LGTM: scan guarded by skip_missing.

Consistent with fixture usage.


370-372: LGTM: io.scan wrapped with skip_missing.

Keeps the test robust to optional deps.


440-441: LGTM: Integration scan guarded by skip_missing.

Good for cross-checking attrs reliably.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/test_io/test_common_io.py (1)

294-300: Bug: Using io_path_tuple instead of path when reading file bytes.

This will fail since io_path_tuple is an (io, path) tuple.

Apply:

         # write file contents to bytes io and ensure it can be read.
         bio = BytesIO()
-        bio.write(Path(io_path_tuple).read_bytes())
+        bio.write(Path(path).read_bytes())
         bio.seek(0)
         spool2 = io.read(bio)
🧹 Nitpick comments (2)
tests/test_io/test_common_io.py (2)

18-18: Augment imports for robust timeout handling.

To fully cover common timeout exceptions in skip_timeout, add an import for socket and an optional import of requests.exceptions.

Add near the imports:

import socket
try:
    from requests import exceptions as _requests_exceptions  # type: ignore
except Exception:  # requests may be optional in test envs
    _requests_exceptions = None

99-108: Keep skip_missing focused; remove TimeoutError handling here.

Timeouts are a transport concern handled by skip_timeout. Keeping them here duplicates behavior and blurs responsibility.

Apply:

 @contextmanager
 def skip_missing():
     """Skip if missing dependencies found."""
     try:
         yield
     except MissingOptionalDependencyError as exc:
         pytest.skip(f"Missing optional dependency required to read file: {exc}")
-    except TimeoutError as exc:
-        pytest.skip(f"Unable to fetch data due to timeout: {exc}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4e3457 and eaa4412.

📒 Files selected for processing (1)
  • tests/test_io/test_common_io.py (10 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_io/test_common_io.py (2)
dascore/exceptions.py (1)
  • MissingOptionalDependencyError (98-99)
dascore/utils/downloader.py (1)
  • fetch (40-58)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test_build_docs
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: Run benchmarks
🔇 Additional comments (3)
tests/test_io/test_common_io.py (3)

129-131: LGTM: IO/scan operations correctly wrapped with skip_missing.

This centralizes skipping for optional deps and reduces boilerplate.

Also applies to: 174-176, 182-184, 309-311, 357-359, 368-370, 438-440


156-158: LGTM: Data fetches wrapped with skip_timeout.

Appropriate use to avoid flaky network-related failures.

Also applies to: 167-169, 257-259


110-117: Extend skip_timeout to catch all relevant timeout exceptions.

Current handling misses socket.timeout and requests.exceptions.Timeout, so real-world fetch timeouts may fail tests instead of skipping. Define a tuple and use it in the except.

Apply:

 @contextmanager
 def skip_timeout():
     """Skip if downloading file times out."""
     try:
         yield
-    except (TimeoutError, urllib_error.URLError) as exc:
+    except TIMEOUT_EXCS as exc:
         pytest.skip(f"Unable to fetch data due to timeout: {exc}")

Add (outside this block, near imports):

# Compose timeout exception tuple safely (requests is optional).
TIMEOUT_EXCS = (TimeoutError, urllib_error.URLError, socket.timeout)
if _requests_exceptions is not None:
    TIMEOUT_EXCS = TIMEOUT_EXCS + (_requests_exceptions.Timeout,)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/test_io/test_common_io.py (2)

291-293: Make BinaryReader gate more robust (allow subclasses).

Some IOs may declare a subclass of BinaryReader; use issubclass with a guard.

Apply this diff:

-        if req_type is not BinaryReader:
-            pytest.skip(f"{io} doesn't support BinaryReader streams.")
+        try:
+            supports_stream = req_type is not None and issubclass(req_type, BinaryReader)
+        except TypeError:
+            supports_stream = False
+        if not supports_stream:
+            pytest.skip(f"{io} doesn't support BinaryReader streams.")

300-305: Also skip on NotImplementedError for BytesIO reads.

Some implementations raise NotImplementedError for stream inputs.

Apply this diff:

-        except (AttributeError, OSError, UnsupportedOperation) as e:
+        except (AttributeError, OSError, UnsupportedOperation, NotImplementedError) as e:
             # Skip if the format doesn't support BytesIO (e.g., missing
             # 'name' attribute, fileno() not supported, or other BytesIO
             # incompatibilities)
             pytest.skip(f"{io} doesn't support BytesIO streams: {e}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eaa4412 and 07df64b.

📒 Files selected for processing (1)
  • tests/test_io/test_common_io.py (12 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_io/test_common_io.py (4)
dascore/exceptions.py (1)
  • MissingOptionalDependencyError (98-99)
dascore/utils/downloader.py (1)
  • fetch (40-58)
tests/test_io/test_io_core.py (3)
  • read (48-49)
  • read (67-69)
  • read (91-94)
dascore/utils/io.py (1)
  • BinaryReader (33-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test_code (windows-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_build_docs
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: Run benchmarks
🔇 Additional comments (3)
tests/test_io/test_common_io.py (3)

459-465: Confirm that isinstance with PEP 604 unions is compatible with the supported Python versions.

This relies on isinstance(obj, float | np.floating), which is valid on Python 3.10 and later but raises TypeError on older interpreters. If you still support older versions, switch to a tuple: (float, np.floating).

Would you like me to convert these to tuples throughout tests for broader compatibility?
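
For reference, a tiny self-contained check of the two spellings; both assertions hold for NumPy scalar floats, but the union form requires Python 3.10 or newer inside isinstance:

import numpy as np

value = np.float64(1.5)

# PEP 604 union in isinstance: accepted on Python 3.10+ only.
assert isinstance(value, float | np.floating)

# Tuple form: the equivalent check that also works on older interpreters.
assert isinstance(value, (float, np.floating))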


18-20: Add safe imports for timeout types (socket, requests) used in skip_timeout.

Prevents NameError in the except tuple and ensures broader timeout coverage.

Apply this diff to add robust imports:

 from urllib import error as urllib_error

+import socket
+try:
+    from requests.exceptions import Timeout as RequestsTimeout
+except Exception:  # requests may not be installed in all envs
+    class RequestsTimeout(Exception):
+        pass

110-117: Broaden skip_timeout to catch actual network timeout exceptions.

Include socket.timeout and requests’ Timeout so real fetch timeouts are skipped instead of failing CI.

Apply this diff:

 @contextmanager
 def skip_timeout():
     """Skip if downloading file times out."""
     try:
         yield
-    except (TimeoutError, urllib_error.URLError) as exc:
+    except (TimeoutError, socket.timeout, urllib_error.URLError, RequestsTimeout) as exc:
         pytest.skip(f"Unable to fetch data due to timeout: {exc}")

@d-chambers d-chambers merged commit 018d807 into master Sep 26, 2025
24 checks passed
@d-chambers d-chambers deleted the codspeed branch September 26, 2025 14:57
@coderabbitai coderabbitai bot mentioned this pull request Sep 27, 2025
5 tasks
