Skip to content

Conversation

@ahmadtourei
Copy link
Collaborator

@ahmadtourei ahmadtourei commented Jan 13, 2025

Description

This PR fixes a bug in the Patch.correlate_shift() function. With the previous version of this function, there was an AssertionError as mentioned below:

import dascore as dc

patch = dc.get_example_patch("random_das", shape=(5, 50))

corr_patch = patch.correlate(distance=0, samples=True)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[28], [line 1](vscode-notebook-cell:?execution_count=28&line=1)
----> [1](vscode-notebook-cell:?execution_count=28&line=1) corr_patch = patch.correlate(distance=0, samples=True)

File ~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:260, in patch_function.<locals>._wrapper.<locals>._func(patch, *args, **kwargs)
    [254](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:254) check_patch_coords(
    [255](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:255)     patch,
    [256](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:256)     dims=required_dims,
    [257](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:257)     coords=required_coords,
    [258](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:258) )
    [259](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:259) check_patch_attrs(patch, required_attrs)
--> [260](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:260) out: PatchType = func(patch, *args, **kwargs)
    [261](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:261) # attach history string. Need to consider something a bit less hacky.
    [262](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/utils/patch.py:262) if out is not patch and hasattr(out, "attrs"):

File ~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:217, in correlate(patch, samples, lag, **kwargs)
    [215](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:215) if not input_dft:
    [216](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:216)     idft = out.idft.func(out)
--> [217](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:217)     out = idft.correlate_shift.func(idft, fft_dim)
    [218](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:218) return out

File ~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:77, in correlate_shift(patch, dim, undo_weighting)
     [75](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:75) new_end = np.ceil((len(coord) - 1) / 2) * step
     [76](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:76) new_coord = dc.get_coord(start=new_start, stop=new_end, step=step)
---> [77](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:77) assert len(new_coord) == len(coord)
     [78](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:78) cm = patch.coords
     [79](https://file+.vscode-resource.vscode-cdn.net/Users/tourei%40mines.edu/coding/caserm_test/~/miniconda3/envs/py10/lib/python3.10/site-packages/dascore/proc/correlate.py:79) new_cm = cm.update(**{dim: new_coord}).rename_coord(**{dim: f"lag_{dim}"})

AssertionError:

len(coord) is 99 while len(new_coord) is 98.

I'm not sure why other random patches with even time coord length, such as (5, 40) or (5, 70), did not raise errors though.

Checklist

I have (if applicable):

  • referenced the GitHub issue this PR closes.
  • documented the new feature with docstrings or appropriate doc page.
  • included a test. See testing guidelines.
  • your name has been added to the contributors page (docs/contributors.md).
  • added the "ready_for_review" tag once the PR is ready to be reviewed.

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Enhanced documentation for contributing to the project
    • Improved handling of coordinate lengths in correlation calculations
  • Bug Fixes

    • Fixed coordinate handling in correlation shift calculations for odd-length coordinates
  • Tests

    • Added new test cases for correlation with odd and even coordinate lengths
    • Introduced new pytest fixtures for generating random patches with odd and even lengths
  • Documentation

    • Updated documentation for the notch filter function with improved examples and references

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 13, 2025

Walkthrough

This pull request introduces modifications to the correlate_shift function in the DASCore library to handle coordinate arrays with odd lengths. The changes include updating the dascore/proc/correlate.py module with a new conditional check, adding corresponding test fixtures and test methods in tests/test_proc/test_correlate.py, and updating the contribution documentation in docs/recipes/how_to_contribute.qmd. The modifications aim to improve the robustness of coordinate handling during correlation operations.

Changes

File Change Summary
dascore/proc/correlate.py Added conditional check to handle odd-length coordinate arrays by adjusting new_end calculation and updated new_coord to retain units.
tests/test_proc/test_correlate.py Added new pytest fixtures random_patch_odd and random_patch_even, and corresponding test methods test_auto_correlation_odd_coord and test_auto_correlation_even_coord.
docs/recipes/how_to_contribute.qmd Streamlined contribution instructions for DASDAE developers and users, simplified directory change steps, and added instructions for setting up the upstream repository.
.github/workflows/run_min_dep_tests.yml Modified workflow to install dascore in editable mode, replaced uv run with a custom shell script for test execution.
dascore/proc/filter.py Updated documentation for notch_filter function, replacing "Notes" with "See Also" and changing example q parameter values.

Possibly related PRs

  • minor fix for LF proc #484: Changes in the main PR to the correlate_shift function may relate to the adjustments in chunk size calculations in PR minor fix for LF proc #484, as both involve handling data processing logic, although they target different aspects of the DASCore functionality.

Suggested reviewers

  • d-chambers

Poem

🐰 Correlate with grace, oh code so bright,
Odd or even, we'll set things right!
Coordinates dance, no length too strange,
With DASCore's magic, we'll rearrange!
A rabbit's leap through data's domain! 🌈


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2584160 and e84bf0d.

📒 Files selected for processing (2)
  • dascore/proc/correlate.py (1 hunks)
  • dascore/proc/filter.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • dascore/proc/correlate.py
⏰ Context from checks skipped due to timeout of 90000ms (15)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
🔇 Additional comments (3)
dascore/proc/filter.py (3)

274-277: LGTM! Documentation improvement.

The addition of the "See Also" section with a link to scipy's iirnotch documentation is helpful for users who need more detailed information about the underlying implementation.


Line range hint 274-291: Verify relation to PR objectives.

These documentation changes to the notch_filter function appear unrelated to the PR's main objective of fixing the correlate_shift bug. Please clarify if these changes should be in a separate PR to maintain focused code reviews.


288-288: Verify the Q value change impact.

The Q value in the examples has been increased from 10 to 30, which creates a narrower notch filter. While this change might provide better frequency precision, we should verify if this is the recommended value for typical use cases.

Also applies to: 291-291

✅ Verification successful

Q value change is appropriate and well-supported

The Q value of 30 is consistent with the test suite and falls within the demonstrated working range (20-40) across the codebase. This value represents a good balance between frequency precision and filter width.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other Q value usages in tests and examples
rg -A 2 "notch_filter.*q\s*=" .

Length of output: 2838

Finishing Touches

  • 📝 Generate Docstrings (Beta)

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@codecov
Copy link

codecov bot commented Jan 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.85%. Comparing base (1dc1ce4) to head (e84bf0d).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #485   +/-   ##
=======================================
  Coverage   99.85%   99.85%           
=======================================
  Files         118      118           
  Lines        9702     9703    +1     
=======================================
+ Hits         9688     9689    +1     
  Misses         14       14           
Flag Coverage Δ
unittests 99.85% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
dascore/proc/correlate.py (1)

76-77: LGTM! The fix correctly handles odd-length coordinates.

The added condition ensures that the new coordinate range maintains the same length as the input coordinate array by adding an extra step when the length is odd.

Consider adding docstring examples demonstrating both odd and even length cases to help users understand the behavior:

>>> # Example with odd length
>>> patch = dc.get_example_patch(shape=(2, 11))
>>> dft = patch.dft("time")
>>> dft_sq = dft * dft.conj()
>>> idft = dft_sq.idft()
>>> auto_patch = idft.correlate_shift(dim="time")
>>>
>>> # Example with even length
>>> patch = dc.get_example_patch(shape=(2, 10))
>>> # ... same steps as above
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1dc1ce4 and 22499e1.

📒 Files selected for processing (3)
  • dascore/proc/correlate.py (1 hunks)
  • docs/recipes/how_to_contribute.qmd (0 hunks)
  • tests/test_proc/test_correlate.py (2 hunks)
💤 Files with no reviewable changes (1)
  • docs/recipes/how_to_contribute.qmd
⏰ Context from checks skipped due to timeout of 90000ms (15)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
🔇 Additional comments (1)
tests/test_proc/test_correlate.py (1)

15-25: LGTM! Well-structured fixtures for testing both cases.

The fixtures provide good test data with explicit odd (11) and even (10) lengths.

Comment on lines +41 to +55
def test_auto_correlation_odd_coord(self, random_patch_odd):
"""Ensure correlate_shift works when dim's coord length is odd."""
dft = random_patch_odd.dft(dim="time")
dft_conj = dft.conj()
dft_sq = dft * dft_conj
idft = dft_sq.idft()
assert isinstance(idft.correlate_shift(dim="time"), dc.Patch)

def test_auto_correlation_even_coord(self, random_patch_even):
"""Ensure correlate_shift works when dim's coord length is even."""
dft = random_patch_even.dft(dim="time")
dft_conj = dft.conj()
dft_sq = dft * dft_conj
idft = dft_sq.idft()
assert isinstance(idft.correlate_shift(dim="time"), dc.Patch)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Strengthen the test assertions.

While testing instance type is good, consider adding assertions to verify:

  1. The coordinate lengths match the input
  2. The correlation results are as expected

Add these assertions to both test methods:

 def test_auto_correlation_odd_coord(self, random_patch_odd):
     """Ensure correlate_shift works when dim's coord length is odd."""
     dft = random_patch_odd.dft(dim="time")
     dft_conj = dft.conj()
     dft_sq = dft * dft_conj
     idft = dft_sq.idft()
-    assert isinstance(idft.correlate_shift(dim="time"), dc.Patch)
+    result = idft.correlate_shift(dim="time")
+    assert isinstance(result, dc.Patch)
+    # Verify coordinate lengths match
+    assert len(result.get_coord("lag_time")) == len(random_patch_odd.get_coord("time"))
+    # Verify maximum correlation at zero lag
+    assert np.argmax(result.data) == len(result.get_coord("lag_time")) // 2

Apply similar changes to test_auto_correlation_even_coord.

Committable suggestion skipped: line range outside the PR's diff.

@ahmadtourei ahmadtourei added ready_for_review PR is ready for review proc Related to processing module labels Jan 13, 2025
@d-chambers
Copy link
Contributor

Hey @ahmadtourei, I am not sure why the ci is failing, I will look into it today or tomorrow.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/run_min_dep_tests.yml (1)

Line range hint 1-57: Consider adjusting the timeout for minimum dependency tests.

The current timeout of 25 minutes might need adjustment when running with minimum dependencies, as package resolution and installation might take longer.

Consider increasing the timeout-minutes value or monitoring job execution times to ensure it's sufficient:

-    timeout-minutes: 25
+    timeout-minutes: 30
🧰 Tools
🪛 actionlint (1.7.4)

47-47: shellcheck reported issue in this script: SC2102:info:1:17: Ranges can only match single chars (mentioned due to duplicates)

(shellcheck)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22499e1 and 2584160.

📒 Files selected for processing (1)
  • .github/workflows/run_min_dep_tests.yml (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/run_min_dep_tests.yml

47-47: shellcheck reported issue in this script: SC2102:info:1:17: Ranges can only match single chars (mentioned due to duplicates)

(shellcheck)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)
🔇 Additional comments (2)
.github/workflows/run_min_dep_tests.yml (2)

45-47: LGTM! Package installation step looks good.

The change to use pip for installation with test dependencies is appropriate and follows standard practices.

🧰 Tools
🪛 actionlint (1.7.4)

47-47: shellcheck reported issue in this script: SC2102:info:1:17: Ranges can only match single chars (mentioned due to duplicates)

(shellcheck)


51-52: Verify the shell configuration and test script.

The shell configuration has been updated to use bash -el {0} which is good for error handling, but we should verify the test script exists and has proper permissions.

Run this script to verify the test configuration:

Also applies to: 56-57

✅ Verification successful

Shell configuration and test script verification successful

The test script exists with proper executable permissions (755) and passes basic syntax checks. There's a minor ShellCheck suggestion about quoting variables, but it doesn't impact the core functionality.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify test script existence and permissions

# Check if test script exists and is executable
if [ -f ./.github/test_code.sh ]; then
  echo "Test script exists"
  ls -l ./.github/test_code.sh
else
  echo "Test script not found"
  exit 1
fi

# Check shell script syntax
bash -n ./.github/test_code.sh

# Check for common shell script issues
if command -v shellcheck &> /dev/null; then
  shellcheck ./.github/test_code.sh
fi

Length of output: 604

Copy link
Contributor

@d-chambers d-chambers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking over this, I think we forgot to propagate the units.

We might modify this slightly to remove the if clause and handle the units like so:

_new_coord = dc.get_coord(start=new_start, stop=new_end, step=step, units=coord.units)
new_coord = _new_coord.change_length(len(coord))

Up to you though.

@d-chambers
Copy link
Contributor

For the record, I am not sure why uv was seg faulting but I fixed the minimum dependency tests by simply using pip to install dascore and then it worked fine.

@ahmadtourei ahmadtourei merged commit 74c0d16 into master Jan 16, 2025
17 checks passed
@ahmadtourei ahmadtourei deleted the fix_correlate_coord_issue branch January 16, 2025 18:48
@ahmadtourei ahmadtourei restored the fix_correlate_coord_issue branch January 16, 2025 19:49
@coderabbitai coderabbitai bot mentioned this pull request Jan 23, 2025
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

proc Related to processing module ready_for_review PR is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants